Tuesday, 10 September 2019

Adding joystick control

Foreword

In the previous post we managed to display the Intro screen for our game Dan Dare within our C64 FPGA implementation.

As with many other C64 games, to actually start the game you need to press fire on a joystick. Since our emulator doesn't feature any joystick at the moment, the purpose of this post will be to add functionality to emulate a joystick.

For the joystick we will just use the Numeric Keypad on the USB keyboard attached to the Zybo Board.

We will also just be focusing on implementing joystick port #2 of the C64, since this is the port the game Dan Dare uses.

How Joystick port#2 is wired to the C64

A good start for this post would be to see how Joystick port#2 is connected on a real C64.

The following snippet of a schematic, taken from http://www.zimmers.net, shows the wiring:


As you can see, joystick port#2 shares wires on Port A of CIA#1 with the keyboard.

This setup of shared wires between joystick and keyboard immediately reminds us of an anomaly of joystick port#1, where moving the joystick also types characters on the screen.

One might wonder why Joystick port#2 doesn't have the same effect. The answer is that we read the keyboard from port B on the CIA, which is connected to the row pins of the keyboard connector.

With no key being pressed on the keyboard, all pins would just remain high on port B of CIA#1. Joystick #1 bypasses this, because it is also connected to port B of CIA#1 and can pull selected lines down to zero, which the C64 will read as key presses.

Pulling down selected lines via Joystick port#2 will not have the same effect. With no keys being pressed on the keyboard, these pulled-down lines simply do not propagate to port B of CIA#1.
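
To make the difference between the two joystick ports a bit more concrete, here is a small C sketch of the wiring described above. It is only a toy model: the function name read_port_b and the keys/joy1/joy2 parameters are made up for illustration, and it assumes that port A drives the keyboard columns and port B reads the rows.

```c
#include <stdint.h>

/* Toy model of the CIA#1 keyboard/joystick wiring (all lines active low).
 * keys[c] has bit r set when the key at column c, row r is held down.
 * joy1 and joy2 have a bit set for every joystick switch that is closed.
 * All names here are hypothetical and only serve the illustration. */
uint8_t read_port_b(uint8_t port_a_out, const uint8_t keys[8],
                    uint8_t joy1, uint8_t joy2)
{
    /* Joystick #2 can only pull column lines (port A) low. */
    uint8_t columns = port_a_out & (uint8_t)~joy2;
    uint8_t rows = 0xFF;

    for (int c = 0; c < 8; c++) {
        /* A low column reaches a row only through a closed key switch. */
        if (!(columns & (1u << c)))
            rows &= (uint8_t)~keys[c];
    }

    /* Joystick #1 sits directly on the row lines (port B). */
    return rows & (uint8_t)~joy1;
}
```

With an empty keys array, any value of joy2 leaves the result at 0xFF, whereas a single closed switch on joy1 shows up immediately as a low bit, which the keyboard scanning routine interprets as a key press.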

Implementing Joystick port#2 in our C64 module

In the previous section we mentioned the concept of pulling low a line on either port A or port B on CIA#1. Thus, the keyboard and joysticks on the C64 follow the philosophy of active when low.

Another feature of port A and port B of CIA#1 is that each pin of those ports is bidirectional.

This leaves us with the question: How do you implement a bidirectional pin in an FPGA?

One might think: Sure, instead of declaring a port pin on a module as either input or output, you can just declare the bidirectional port as inout.

You can indeed create a Verilog module with inout ports. However, as soon as you try to connect these ports to other Verilog modules in your design, you might end up running in circles.

This is because inout pins are really only meant for pins going to the outside world, for instance if you want to implement an I2C port on your FPGA.

The FPGA synthesis tools don't like it at all if you try to utilise inout ports for internal use.

So in our CIA module we would need to split our bidirectional ports into two separate ports each:

module cia(
  output [7:0] port_a_out,
  input [7:0] port_a_in,
  output [7:0] port_b_out,
  input [7:0] port_b_in,
...
    );
...

Next, we need to make a small adjustment when we read from Port A or B:

...
  always @(posedge clk)
  if(!we)
  case (addr)
    0: data_out <=  ~((~slave_reg_0 & slave_reg_2) | 
                     ~port_a_in);
                   
    1: data_out <= ~((~slave_reg_1 & slave_reg_3) | 
                     ~port_b_in);
...

Let us try and understand what is going on here.

When we read from either port A or B, a low value can either be caused by the input port, or via the corresponding output port (e.g. slave_reg_0 or slave_reg_1).

Also, the corresponding output port is enabled by either slave_reg_2 or slave_reg_3.

The combined effect of an input and output port resembles that of an OR operation, with the inputs inverted. For this reason we are doing all the negations.
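
The same expression can be written out in C, which makes it easy to try a few values by hand. This is just a sketch mirroring the Verilog above; the function name cia_port_read and its parameter names are my own.

```c
#include <stdint.h>

/* Value the CPU reads back from a CIA port (sketch of the Verilog above).
 * out_reg: data register (slave_reg_0 / slave_reg_1 in the post)
 * enable:  1 = pin driven as output (slave_reg_2 / slave_reg_3)
 * in_pins: levels applied from outside (port_a_in / port_b_in) */
uint8_t cia_port_read(uint8_t out_reg, uint8_t enable, uint8_t in_pins)
{
    uint8_t pulled_low_by_output = (uint8_t)(~out_reg & enable);
    uint8_t pulled_low_by_input  = (uint8_t)~in_pins;

    /* A pin reads low if either side pulls it low. */
    return (uint8_t)~(pulled_low_by_output | pulled_low_by_input);
}
```

For example, with all pins enabled as outputs, out_reg = 0xFE and in_pins = 0xEF, both bit 0 and bit 4 read back as zero: one pulled low by the output register and the other by the external input.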

Next, let us hook up port A and port B of our CIA instances:

...
    cia cia_1(
          .port_a_out(keyboard_control),
          .port_a_in({3'b111, joybits}),
          .port_b_in(keyboard_result),
...
            );
...
    cia cia_2(
          .port_a_out(cia_2_port_a),
          .port_a_in(8'b11111111),
...
            );
...

First, we hook up the five bits of our joystick to port A of CIA#1.

We also connect port A of CIA#2 to eight ones. We use the lower two bits of this port for the VIC-II banking bits. It is therefore crucial that we keep the relevant input bits high, so that the contents of the VIC-II bits don't get lost during bitwise operations.

Serving the joystick bits from AXI slave


Currently our AXI Slave block has two slave registers indicating which keys were pressed. Each bit position in these two registers represents the actual C64 key scan code of the key pressed.

In a similar fashion we can add a third register where each bit position represents the current position of the joystick, as well as whether the fire button is pressed.

Currently slave register 2 (i.e. address 0x43c0_0008) only has about three bits utilised for tape operation. So, we can just use some unused bits in this register for our joystick bits.

We will use bits 4 to 8 of this register for the joystick bits. These start on a nybble boundary, making it convenient to see the joystick bits when you are debugging and you see the register contents in hexadecimal format.

To wire up the joystick bits from the AXI slave to our C64 module, we would follow the same approach as we previously performed to enable keyboard access for our C64 module. I will therefore not be going into detail on this.

Redirecting Numeric Pad as Joystick bits


As mentioned earlier, we will be using the numeric pad of the USB keyboard as a joystick.

You might remember from a previous post that in order to interface a USB keyboard to our C64 module, we basically catch the USB scan codes, convert them to C64 key scan codes, and set the relevant bit (or bits, if more than one key is pressed simultaneously) at either address 0x43c0_0000 or 0x43c0_0004.

The C64 keyboard can produce key scan codes in the range 0 to 63. We can reuse our USB scan code -> C64 scan code routine by basically using scan codes 64 upwards for our joystick bits:

u32 mapUsbToC64(int usbCode) {
 if (usbCode == 0x4) { //A
  return 0xa;
 } else if (usbCode == 0x5) { //B
  return 0x1c;
 } else if (usbCode == 0x6) { //C
  return 0x14;
 } 

...
 } else if (usbCode == 0x28) { //enter
  return 0x1;
 } else if (usbCode == 0x2c) { //space
  return 0x3c;
 } else if (usbCode == 0x36) { //comma
  return 0x2f;
 } else if (usbCode == 53) { //play key `~
  return 100;
 } else if(usbCode == 96) { //up joystick
  return 64;
 } else if(usbCode == 90) { //down joystick
  return 65;
 } else if(usbCode == 92) { //left joystick
  return 66;
 } else if(usbCode == 94) { //right joystick
  return 67;
 } else if(usbCode == 98) { //fire joystick
  return 68;
 }
}


We invoke this method as follows:

void getC64Words(u32 usbWord0, u32 usbWord1, u32 *c64Word0, u32 *c64Word1, u32 *c64Word2) {
  *c64Word0 = 0;
  *c64Word1 = 0;
  *c64Word2 = 0;

  if (usbWord0 & 2) {
   *c64Word0 = 0x8000;
  }

  usbWord0 = usbWord0 >> 16;

  for (int i = 0; i < 2; i++) {
   int current = usbWord0 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else if (scanCode < 64) {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     } else {
        *c64Word2 = *c64Word2 | (1 << (scanCode - 64));
     }

   }

   usbWord0 = usbWord0 >> 8;
  }

  for (int i = 0; i < 4; i++) {
   int current = usbWord1 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else if(scanCode < 64) {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     } else {
      *c64Word2 = *c64Word2 | (1 << (scanCode - 64));
     }

   }

   usbWord1 = usbWord1 >> 8;
  }

}


We have introduced a third word c64Word2. This will be the word we will use to populate the joystick bits at address 0x43c0_0008.

Next, we need to update our old state_machine() method (our mini USB stack method) as shown by the following snippet:

void state_machine() {
...
  u32 toggle = Xil_In32(qTDAddressCheck+8) & 0x80000000;
  if (!(Xil_In32(qTDAddressCheck + 8) & 0x80)) {
   u32 word0 = Xil_In32(0x305000);
   u32 word1 = Xil_In32(0x305004);
   if (word0 == 0) {
    Xil_Out32(0x43c00000, 0);
    Xil_Out32(0x43c00004, 0);
    u32 joy = Xil_In32(0x43c00008) | 0x1f0;
    Xil_Out32(0x43c00008, joy);
   } else {
    //u32 bit = mapUsbToC64((word0 >> 16) & 0xff);
    //bit = 1 << bit;
    u32 c64Word0 = 0;
    u32 c64Word1 = 0;
    u32 c64Word2 = 0;
    getC64Words(word0, word1, &c64Word0, &c64Word1, &c64Word2);
    c64Word2 = ~c64Word2 & 0x1f;
    c64Word2 = c64Word2 << 4;
    /*if (bit < 32) {
     c64Word0 = 1 << bit;
    } else {
     c64Word1 = 1 << (bit - 32);
    }*/

    Xil_Out32(0x43c00000, c64Word0);
    Xil_Out32(0x43c00004, c64Word1);
    u32 tempJoy = (Xil_In32(0x43c00008) & 0xf) | c64Word2;
    Xil_Out32(0x43c00008, tempJoy);
    //Xil_In32(0x305004);
   }
... 


}


Basically we start with word0 and word1, which hold the USB scan codes of the keys that are currently being pressed.

If no key is pressed (i.e. word0 == 0), we just set bits 4 to 8 of address 0x43c0_0008 to ones.
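
The bit manipulation on the joystick register can be isolated into two small pure functions, which makes it easier to check by hand. This is only a sketch of what state_machine() does with the register at 0x43c0_0008; the names joy_update, joy_release_all, JOY_SHIFT and JOY_MASK are hypothetical and do not appear in the actual project.

```c
#include <stdint.h>

#define JOY_SHIFT 4u
#define JOY_MASK  0x1Fu   /* up, down, left, right, fire */

/* pressed: bit 0 = up, 1 = down, 2 = left, 3 = right, 4 = fire,
 *          where 1 means pressed (as collected in c64Word2).
 * reg:     current contents of the register at 0x43c0_0008.
 * The joystick lines are active low, so the pressed bits are inverted
 * before they are placed in bits 4 to 8. The lower four bits are kept. */
uint32_t joy_update(uint32_t reg, uint32_t pressed)
{
    uint32_t active_low = ~pressed & JOY_MASK;
    return (reg & 0xFu) | (active_low << JOY_SHIFT);
}

/* No key pressed at all: every joystick line goes back to high. */
uint32_t joy_release_all(uint32_t reg)
{
    return reg | (JOY_MASK << JOY_SHIFT);
}
```

As an example, if the lower bits of the register contain 0x3 and only fire is pressed, joy_update(0x3, 0x10) yields 0xF3, so in a hexadecimal dump the joystick state is immediately visible in the second nybble.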

The End Results

The following video shows what happens when we press the fire button when we are at the intro screen of the game Dan Dare:


It faintly resembles the game as I remember, though garbled and frozen!

What we are missing here is implementing Raster interrupts for everything to render correctly, which we will cover in the next post.

In Summary

In this post we managed to implement a joystick in our C64 module by utilising the numpad on the USB keyboard.

With our Joystick we managed to transition from the Intro screen to the actual game, although our emulator froze at this point.

In the next post we will be implementing Raster interrupts so that the game screen can render properly.

Till next time!

Thursday, 29 August 2019

IO Area Bank Switching

Foreword

In the previous two posts we did some development that enables us to display the associated splash screen while our C64 FPGA implementation loads the game Dan Dare from tape.

This development included implementing Multicolor bitmap mode as well as implementing VIC-II memory banking.

You might have noticed that throughout this Blog Series I am trying to follow more or less the same approach as I did in my Blog series where I created a C64 emulator in JavaScript. In this old Blog Series, I also mentioned at one point, that in order to completely load the game and get to the intro screen, we need to implement IO area banking.

The reason for this is that, during game loading, we also write to the RAM area below the IO peripheral area (i.e. addresses D000-DFFF). If you do not implement the banking of the IO area properly, you might end up with some weird side effects that are painful to debug.

IO Banking for reading

Let us start by implementing IO Banking for reading. That is, we either enable reading of the IO registers in the region D000-DFFF, or we enable reading from the RAM region underneath.

As with the Kernel ROM and BASIC ROM, the banking of the IO region is controlled by memory address 1.

To familiarise ourselves with when the IO region is enabled, we consult the C64 memory map at http://sta.c64.org/cbm64mem.html.

The following extract shows us when the IO region is enabled:

Bits #0-#2: %1xx: I/O area visible at $D000-$DFFF. (Except for the value %100, see above.)

We can convert this requirement into Verilog:

assign io_enabled = reg_1_6510[2] && !(reg_1_6510[1:0] == 0);
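
To convince ourselves that this expression matches the memory map extract, we can write the same condition in C and run through the possible values of the lower three bits. The function name io_enabled is reused from the Verilog signal; the rest is just an illustrative sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors: reg_1_6510[2] && !(reg_1_6510[1:0] == 0)
 * reg_1 is the value of the 6510 port register at address 1. */
bool io_enabled(uint8_t reg_1)
{
    bool bit_2_set        = (reg_1 & 0x04) != 0;
    bool low_bits_nonzero = (reg_1 & 0x03) != 0;

    /* %1xx enables the IO area, except for the value %100. */
    return bit_2_set && low_bits_nonzero;
}
```

Of the eight combinations only %101, %110 and %111 return true, so the power-on value of $37 has the IO area visible, while $34 exposes the RAM underneath.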

To either read from the IO region or the RAM below it, we write the following code:

    always @*
        casex (addr_delayed)
          16'b1: combined_d_out = {reg_1_6510[7:5], tape_button, reg_1_6510[3:0]};
          16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
          16'b111x_xxxx_xxxx_xxxx : if (reg_1_6510[1])
                                      combined_d_out = kernel_out;
                                    else
                                      combined_d_out = ram_out;
          16'hd020, 16'hd021, 16'hd011,
          16'hd016, 16'hd018 : combined_d_out = io_enabled ? vic_reg_data_out : ram_out;
          16'hd012: combined_d_out = io_enabled ? line_counter : ram_out;
          16'b1101_10xx_xxxx_xxxx: combined_d_out = io_enabled ? color_ram_out : ram_out;
          16'hdcxx: combined_d_out = io_enabled ? ((addr_delayed == 16'hdc00) ? 255 : cia_1_data_out) : ram_out;
          16'hddxx: combined_d_out = io_enabled ? cia_2_data_out : ram_out;
          default: combined_d_out = ram_out;
        endcase

So, if io_enabled is false we just return ram_out for the io register read request.

Writing to IO Registers

Next, let us handle banking when writing to the IO registers:

assign color_ram_write_enable = we & io_enabled & (addr >= 16'hd800 & addr < 16'hdbe8);
    
    cia cia_1(
          .port_a_out(keyboard_control),
          .port_b_in(keyboard_result),
          .addr(addr[3:0]),
          .we(we & io_enabled & (addr[15:8] == 8'hdc)),
          .clk(clk_1_mhz),
          .chip_select(addr[15:8] == 8'hdc & io_enabled),
          .data_in(ram_in),
          .data_out(cia_1_data_out),
          .flag1(flag1 & !flag1_delayed),
          .irq(irq)
            );

    cia cia_2(
          .port_a_out(cia_2_port_a),
          .addr(addr[3:0]),
          .we(we & io_enabled & (addr[15:8] == 8'hdd)),
          .clk(clk_1_mhz),
          .chip_select(addr[15:8] == 8'hdd & io_enabled),
          .data_in(ram_in),
          .data_out(cia_2_data_out)
            );

vic_test_3 vic_inst
    (
        .clk_in(clk),
        .clk_counter(clk_div_counter),
        .clk_2_mhz(clk_2_mhz),
        .blank_signal(blank_signal),
        .frame_sync(frame_sync),
        .data_in({color_ram_out2,vic_combined_d}),
        .c64_reset(c64_reset),
        .addr(vic_addr),
        .out_rgb(out_rgb),
        .clk_1_mhz(clk_1_mhz),
        .addr_in(addr[5:0]),
        .reg_data_in(ram_in),
        .we(we & io_enabled & (addr == 16'hd020 | addr == 16'hd021 | addr == 16'hd011
             | addr == 16'hd016 | addr == 16'hd018)),
        .data_out(vic_reg_data_out)

        );


These code changes will block writes to IO registers if io_enabled is false.

You might have noticed that we unconditionally write to the RAM region underneath the IO region. This causes unpredictable behaviour when we load the game Dan Dare from the Tape image.

We disable writes to this RAM region when io_enabled is true (in which case the write belongs to the IO registers instead) with the following code:

...
assign do_io_write = io_enabled & ((addr >= 16'hd000) && (addr < 16'he000));
...

     always @ (posedge clk_1_mhz)
       begin
        if (we & !do_io_write) 
        begin
         ram[addr] <= ram_in;
         ram_out <= ram_in;
        end
        else 
        begin
         ram_out <= ram[addr];
        end 
       end 
...


You will notice that when io_enabled is true, we exclude RAM writes for the full IO region, even though we currently have IO register gaps in our C64 implementation. This has the effect that writes to these gaps are simply discarded.

This behaviour will luckily not cause any issues in our scenario.
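
The write routing can be summarised in a few lines of C. This is a sketch of the decision made by do_io_write, not code from the project; the enum and the function name route_write are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { WRITE_TO_RAM, WRITE_TO_IO } write_target;

/* Decide where a CPU write ends up, given the value of the 6510 port
 * register at address 1. Mirrors:
 *   do_io_write = io_enabled & (addr >= 16'hd000) && (addr < 16'he000) */
write_target route_write(uint16_t addr, uint8_t reg_1)
{
    bool io_on        = (reg_1 & 0x04) && (reg_1 & 0x03);
    bool in_io_window = addr >= 0xD000 && addr < 0xE000;

    /* With IO switched in, the whole window is kept away from RAM,
     * including addresses where no IO register is implemented yet. */
    return (io_on && in_io_window) ? WRITE_TO_IO : WRITE_TO_RAM;
}
```

So a write to $D020 with $37 at address 1 goes to the VIC-II, while the same write with $34 at address 1 lands in the RAM underneath, which is what the tape loader relies on.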

The End Result

The following video shows the end result of our development:


We managed to get to the Intro screen after the loading of the game.

The credits are moving very fast, but we will fix this in a future post.

In Summary

In this post we have implemented banking functionality for the IO region. This enabled our emulator to load the Game Dan Dare completely from a tape image and display the Intro screen.

In the next post we will implement Joystick functionality.

Till next time!

Friday, 23 August 2019

Implementing VIC-II bank switching

Foreword

In the previous post we have implemented Multicolor Bitmap mode within our VIC-II module.

As a test we got the VIC-II to render the Loading Splash Screen of the game Dan Dare. In this test we hardwired the upper two bits of the VIC-II address to 1's because the splash screen data was located in VIC bank 3 (i.e. RAM address 49152-65535).

Obviously if you want to simulate the whole loading sequence from the time the C64 boots up, this hardcoding will not do the trick, and we will need to implement the full VIC-II bank switching with the aid of the DD00 register.

Therefore, in this post we will spend some time to implement this VIC-II bank switching. We will then check when we load the game, that the splash screen will eventually be displayed.

Implementing VIC-II bank switching

VIC-II bank switching is driven by peripheral register DD00, by the lower two bits. This website gives us more information about these bits:

Bits #0-#1: VIC bank. Values:
  • %00, 0: Bank #3, $C000-$FFFF, 49152-65535.
  • %01, 1: Bank #2, $8000-$BFFF, 32768-49151.
  • %10, 2: Bank #1, $4000-$7FFF, 16384-32767.
  • %11, 3: Bank #0, $0000-$3FFF, 0-16383.

So, these bits actually make up bits 15 and 14 of the VIC-II address, inverted though.
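
The inversion is easy to get wrong, so here is a small C sketch that turns the DD00 bits into a bank base address and combines it with a 14-bit VIC-II address. The function names are my own and only serve as illustration.

```c
#include <stdint.h>

/* Base address of the 16KB bank selected by the lower two bits of DD00.
 * The two bits are inverted before they become address bits 15 and 14. */
uint16_t vic_bank_base(uint8_t dd00)
{
    uint16_t bank_bits = (uint16_t)(~dd00 & 0x03);
    return (uint16_t)(bank_bits << 14);
}

/* Full 16-bit RAM address for a 14-bit address generated by the VIC-II. */
uint16_t vic_to_ram_address(uint8_t dd00, uint16_t vic_addr)
{
    return (uint16_t)(vic_bank_base(dd00) | (vic_addr & 0x3FFF));
}
```

With the lower bits of DD00 at %00 the base is $C000, so a VIC-II address of $2000 ends up at $E000 in RAM. With %11, the power-on default, the VIC-II simply sees the first 16KB.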

Address DD00 forms part of CIA#2. So, let us start by instantiating an extra CIA instance and mapping it into the address space of our 6502:

    cia cia_2(
          .port_a_out(cia_2_port_a),
          .addr(addr[3:0]),
          .we(we & (addr[15:8] == 8'hdd)),
          .clk(clk_1_mhz),
          .chip_select(addr[15:8] == 8'hdd),
          .data_in(ram_in),
          .data_out(cia_2_data_out)
            );

    always @*
        casex (addr_delayed)
          16'b1: combined_d_out = {reg_1_6510[7:5], tape_button, reg_1_6510[3:0]};
          16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
          16'b111x_xxxx_xxxx_xxxx : if (reg_1_6510[1])
                                      combined_d_out = kernel_out;
                                    else
                                      combined_d_out = ram_out;
          16'hd020, 16'hd021, 16'hd011,
          16'hd016, 16'hd018 : combined_d_out = vic_reg_data_out;
          16'hd012: combined_d_out = line_counter;
          16'b1101_10xx_xxxx_xxxx: combined_d_out = color_ram_out;
          16'hdcxx: combined_d_out = cia_1_data_out;
          16'hddxx: combined_d_out = cia_2_data_out;
          default: combined_d_out = ram_out;
        endcase


We can now prepend bits 1 and 0 of DD00 (i.e. cia_2_port_a) to the top of the address generated by the VIC-II. In our current design, the final VIC-II address is called portb_add, and is defined as follows:

assign portb_add = (portb_reset_counter < 3900000) ? 
    portb_reset_counter[15:0] : {2'b0,vic_addr};

As you can see, previously we just padded bits 14 and 15 of the VIC-II address with zeros.

You might remember that the portb_reset_counter maneuver was an ugly hack to get the second port of our Block RAM to work properly. More on this later.

To finish off our memory mapping for our VIC-II, we still need to properly map the character ROM.

From various documentation on the Internet, we see that the Character ROM is visible in Bank 0 and 2 within the address range $1000-$1FFF and $9000-$9FFF respectively.

The following code takes care of mapping the Character ROM into the address space of the VIC-II:

if ((vic_addr[13:12] == 2'b1) & cia_2_port_a[0])
  vic_combined_d = chargen_out;
else
  vic_combined_d = ram_out_2;
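
Expressed in C, the condition looks as follows. Again this is only a sketch of the Verilog above with a made-up function name; it takes the raw (not yet inverted) DD00 value and the 14-bit address from the VIC-II.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the VIC-II should get its data from the Character ROM. */
bool vic_sees_char_rom(uint8_t dd00, uint16_t vic_addr)
{
    /* Address bits 13:12 equal to 01 select $1000-$1FFF within the bank. */
    bool in_rom_window = ((vic_addr >> 12) & 0x03) == 0x01;

    /* Banks 0 and 2 are selected by DD00 values %11 and %01, so bit 0 of
     * the raw register value is set for exactly these two banks. */
    bool bank_0_or_2 = (dd00 & 0x01) != 0;

    return in_rom_window && bank_0_or_2;
}
```

This also explains why checking only bit 0 of cia_2_port_a is sufficient: because of the inversion, banks 0 and 2 are the two banks for which that bit is a one.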

Battle of the Block RAMS

With all the changes done as described in the previous section, I waited with much anticipation while the Bitstream was generated.

When I tested the resulting Bitstream, I was greeted with a blank screen. What a disappointment!

My first instinct was that something went wrong again with the AXI modules, preventing pixel information from flowing between SDRAM and our VIC-II module.

However, when I inspected the framebuffer area in SDRAM (via the XSCT console), I found that the whole framebuffer area was filled with zeroes. This indicates that all AXI modules and our VGA module are functioning properly.

First of all, we know that writing to SDRAM is working ok since the framebuffer area is filled with zeroes. If writing to SDRAM wasn't functioning properly, we would have seen random values within the RAM area.

Secondly, we know that reading pixel data from SDRAM is also functioning correctly. A framebuffer filled with zeroes will yield black pixels, which we are getting (i.e. a blank screen).

There is however a very faint clue on what might be the issue. The frame is also colored in black instead of the usual light blue. The rendering of the border pixels should cause the fewest problems, so this potentially indicates that our 6502 failed to set the Border color register of our VIC-II module.

Has the 6502 perhaps crashed because the contents of the Block RAM have become corrupted? I did after all previously experience an issue with a dual port Block RAM configuration where the VIC-II simply couldn't get the correct data on the one port. It was for this issue that I had to do the portb_reset_counter maneuver as mentioned in the previous section.

Going through the Xilinx community forums, I found a clue in this AR#: https://www.xilinx.com/support/answers/34699.html

This post starts with the heading:

Block Memory Generator v3.3 - Spartan-6 Block RAM Read First mode address space overlap ("collision") issue with simultaneous Read and Write may cause memory corruption

This sounds like something that might be of interest to us. The following paragraph is of particular interest:

A read/write on one port and a write operation from the other port at the same address is not allowed. This restriction on the operation of the block RAM should not be ignored.

In our case it can easily happen that the 6502 writes to an address while the VIC-II is reading from the same address.

To understand when this can happen, let us look at a timing diagram for a typical write operation to block RAM in our C64 module:


The top signal is the 1MHz clock signal and the bottom signal is the 2MHz clock. The write enable signal is in the centre.

The write signal is active during pulses 1, 2 and 3. During these periods the addresses on both ports can also be the same, which creates the conditions for memory corruption described earlier.

To avoid these address collisions, we need to do the following:

  • Ensure the Write Enable signal is low during the 2MHz clock pulses
  • During the 1MHz clock pulses we should ensure that the addresses on both ports of the Block RAM are different.

We implement these requirements with the following code:

assign we = ((clk_div_counter != 6) && (clk_div_counter != 2)) ? we_raw : 0;

assign portb_raw = {~cia_2_port_a[1:0], vic_addr};
assign portb_add = (clk_div_counter == 7) ? {portb_raw[15:1],~addr[0]} : portb_raw;

We disable the Write Enable signal before every 2MHz clock pulse. The we_raw should be connected to our 6502 module.

The portb_raw and portb_add assignments ensure the addresses of the two ports are different during 1MHz clock pulses. We accomplish this by just substituting the least significant bit of port B with the inverse of the LSB of port A.
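
To see that the two addresses can indeed never be equal at counter value 7, the three assignments can be modelled in C. This is only a behavioural sketch with hypothetical function names; the counter values 2, 6 and 7 are taken directly from the Verilog above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Write enable as seen by the Block RAM: forced low at counter values
 * 2 and 6, otherwise the raw write enable from the 6502 is passed on. */
bool gated_we(unsigned clk_div_counter, bool we_raw)
{
    return (clk_div_counter != 6 && clk_div_counter != 2) ? we_raw : false;
}

/* Address presented on port B (the VIC-II side) of the Block RAM. */
uint16_t port_b_address(unsigned clk_div_counter, uint8_t dd00,
                        uint16_t vic_addr, uint16_t cpu_addr)
{
    uint16_t raw = (uint16_t)(((~dd00 & 0x03u) << 14) | (vic_addr & 0x3FFFu));

    /* At counter value 7 the LSB is replaced with the inverse of the
     * CPU address LSB, so the two ports always differ in bit 0. */
    if (clk_div_counter == 7)
        raw = (uint16_t)((raw & 0xFFFEu) | (~cpu_addr & 0x0001u));

    return raw;
}
```

For example, if both the 6502 and the VIC-II point at $E000 when the counter is 7, port B is presented with $E001 instead. Since the two addresses always differ in bit 0 at that moment, they can never collide.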

The End Result

With all the code changes, our C64 module now shows the Splash screen correctly during loading.

The following video shows the loading process up to and until the Splash screen appears:


In Summary

In this Blog we created all the glue logic, so that our VIC-II could work with Bank switching.

We also managed to display the Splash screen during the loading of the Game Dan Dare.

In the next post we will check how much further the loading process can get, and fix any issues that might arise.

Till next time!


Wednesday, 7 August 2019

Implementing Multicolor Bitmap Mode

Foreword

In the previous post we added Color RAM to our C64 FPGA design.

In this post we will implement the Multicolor Bitmap Mode within our VIC-II module. This will enable us to render the Splash screen of Dan Dare correctly during the loading process.

To verify the resulting development, I will also be developing a Test Bench in this post so that we can test our VIC-II module in isolation within a simulation.

Needless to say, the ultimate test would be to see if our Test Bench would be able to render the Dan Dare Splash screen to an image file.

To do this test, we would need to have the image data of the splash screen in our C64 main memory as well as in the Color RAM.

Thus, in this post I will also illustrate how to use the VICE Commodore emulator to extract the image data for the Splash Screen.

Extracting the Image data for the Splash Screen

Let us start by tackling the goal of extracting the image data of the Splash screen.

As mentioned previously we will use the Vice Commodore emulator for this purpose.

We start off by kicking off the loading of the game Dan Dare.

As soon as the Splash screen has been loaded completely, activate the built-in Monitor. We then issue a couple of memory dump commands:


The first memory location to look at is location $DD00. Bits 0 and 1 are the upper two bits of the memory address of the VIC-II, inverted. After inversion this results in 11, which selects the last 16KB bank of RAM, i.e. address range 49152-65535.

Next, we should find out where to look for the image data. The answer to this is memory location D018. The bits in this location are laid out as follows:

Bit 7: VM13
Bit 6: VM12
Bit 5: VM11
Bit 4: VM10
Bit 3: CB13
Bit 2: CB12
Bit 1: CB11
Bit 0: -

The bits starting with VM form the base address of the Video Memory or Screen memory.

The bits starting with CB form the base address for the Character Image data. In high resolution modes, which is the case here, it is the location of the bitmap data.

From this information we see that screen memory starts at $C000 and the bitmap data at $E000.
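
As a cross-check, the decoding of D018 can be sketched in C. The value $08 used in the example is an assumption on my part: it is simply a value that is consistent with the addresses mentioned above, since the memory dump itself is not reproduced here. The struct and function names are hypothetical as well.

```c
#include <stdint.h>

typedef struct {
    uint16_t screen;  /* start of Screen memory            */
    uint16_t bitmap;  /* start of bitmap data (bitmap mode) */
} vic_pointers;

/* bank_base: start of the 16KB VIC bank, e.g. 0xC000 for bank 3.
 * d018:      contents of register D018. */
vic_pointers decode_d018(uint16_t bank_base, uint8_t d018)
{
    vic_pointers p;

    /* VM13-VM10 (bits 7-4) select one of sixteen 1KB blocks. */
    p.screen = (uint16_t)(bank_base + ((d018 >> 4) & 0x0F) * 0x0400);

    /* In bitmap mode only CB13 (bit 3) matters: one of two 8KB halves. */
    p.bitmap = (uint16_t)(bank_base + ((d018 >> 3) & 0x01) * 0x2000);

    return p;
}
```

Calling decode_d018(0xC000, 0x08) gives a screen address of $C000 and a bitmap address of $E000, matching what we found in the monitor.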

The next challenge is to extract the image data into a file that our Test Bench/VIC-II module can use.

The easiest way to do this is to save the state of our running emulator at this point as a snapshot, and to extract the relevant portions from the snapshot.

Vice stores the snapshot as a *.vsf file. When you open this file in a HEX editor, you can identify the relevant sections with header names:



In this example we can see that the 64KB section starts with the header name C64MEM. It is not completely clear where the actual memory starts. We get more info on this from the VICE documentation:

Type   Name     Description
BYTE   CPUDATA  CPU port data byte
BYTE   CPUDIR   CPU port direction byte
BYTE   EXROM    state of the EXROM line (?)
BYTE   GAME     state of the GAME line (?)
ARRAY  RAM      64k RAM dump

Keep in mind that this gets preceded by a header of 22 bytes.
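
If you would rather not count bytes in a HEX editor, the offset of the RAM dump can also be calculated. The following C sketch assumes the snapshot has already been read into a buffer, that the 22 byte header begins with the name C64MEM, and that the first occurrence of this name in the file is the module header. The function find_c64_ram and the constant names are hypothetical, so treat this as a starting point rather than a proper VSF parser.

```c
#include <stddef.h>
#include <string.h>

#define VSF_MODULE_HEADER_SIZE 22       /* header size mentioned above   */
#define C64MEM_PREAMBLE_SIZE    4       /* CPUDATA, CPUDIR, EXROM, GAME  */
#define C64_RAM_SIZE            0x10000 /* 64k RAM dump                  */

/* Returns the offset of the 64KB RAM dump within the snapshot image,
 * or -1 if no complete C64MEM section is found. */
long find_c64_ram(const unsigned char *vsf, size_t len)
{
    static const char name[] = "C64MEM";
    const size_t name_len = sizeof name - 1;
    const size_t needed = VSF_MODULE_HEADER_SIZE + C64MEM_PREAMBLE_SIZE
                        + C64_RAM_SIZE;

    for (size_t i = 0; i + needed <= len; i++) {
        if (memcmp(vsf + i, name, name_len) == 0)
            return (long)(i + VSF_MODULE_HEADER_SIZE + C64MEM_PREAMBLE_SIZE);
    }
    return -1;
}
```

The returned offset corresponds to C64 address $0000, so the screen memory at $C000 and the bitmap data at $E000 are found by adding those addresses to it.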

The next piece of information to extract is the Color RAM. The Vice documentation and some searches on the Internet don't yield any particular information on where the Color RAM is stored in the vsf file.

Eventually I found the answer by looking into the source code of Vice. Within the file src/vicii/vicii-snapshot.c I found the following:


So, the Color RAM is indeed located within the VIC-II module section in the vsf file.

To convert this info in a format suitable for our Test bench, we just paste the relevant HEX data into a Text Editor and replace all spaces with newlines.

The current plumbing of our VIC-II module

It has been a while since we looked into the inner workings of our VIC-II module. I therefore think it would be a good idea for us to do a quick refresher on this so that we can end up with a better idea on how to implement the Multicolor Bitmap mode.

To give us a baseline as a reference, I have created the following diagram:


In this diagram we have basically zoomed into the first three character rows, and on each row I am only showing the first two characters. The area on the left in solid purple represents the border.

You can see that on each line we already start reading while still in the border area.

You can also see that we are reading the character codes only at the first pixel row of each character row. In fact, when we get the character codes, we are storing them in a 40 byte buffer. For the remaining pixel rows we are getting the character codes from this buffer.

Let us move into a bit more detail on our VIC-II module by looking at some important signals:


If both the visible_vert and visible_horiz signals are high, it means we are in the region of the screen in which we are drawing characters.

Typically we would store the character code to our 40 byte buffer when clk_counter is cycle#3, and load the pixel_shift_register at the end of cycle#7.

You might notice that in this diagram we are only shifting every second clock cycle. Here I am actually trying to show how we would shift the pixels during multi-color mode, in which each pixel is 2 pixels wide.

Implementing Multicolor Bitmap Mode

Let us now continue to implement Multicolor Bitmap Mode.

The following register bits are important to implement for this mode:

  • Register D011: Bit 5: Bitmap mode
  • Register D016: Bit 4: Multicolor Mode
  • Register D018: Memory pointers

To implement these register bits, we would follow more or less the same process as in previous posts, so I will not go into detail on how to implement these register bits.

Next, let us see how to retrieve and dispatch pixel data for Multicolor Bitmap mode.

We start by generating addresses for retrieving pixel data:

...
wire [13:0] bit_data_pointer;
...
assign bit_data_pointer = screen_control_1[5] ?
                   {mem_pointers[3],(char_line_pointer + char_pointer_in_line),char_line_num}
                 : {mem_pointers[3:1],char_buffer_out[7:0],char_line_num};
...

Here we cater for generating pixel data addresses for both Standard Text Mode and High Resolution mode, which is determined by bit 5 of Screen Control Register #1.

For Standard Text mode we use the character codes in screen memory to determine the address within the Character ROM.

In High Resolution mode, however, we use the pointer to the current location in Screen Memory to assemble the address for retrieving pixel data.
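
The two address formats can be sketched in C as follows. The bit widths are my assumption, derived from the 14-bit width of bit_data_pointer: a 10-bit cell index (char_line_pointer + char_pointer_in_line, 0 to 999) for bitmap mode, and an 8-bit character code for text mode, each followed by the 3-bit char_line_num. The function names are hypothetical.

```c
#include <stdint.h>

/* Bitmap mode: {CB13, cell_index[9:0], line[2:0]}
 * cell_index is the position of the cell on screen (0..999). */
uint16_t bitmap_mode_address(unsigned cb13, unsigned cell_index, unsigned line)
{
    return (uint16_t)(((cb13 & 1u) << 13) |
                      ((cell_index & 0x3FFu) << 3) |
                      (line & 7u));
}

/* Text mode: {CB13..CB11, char_code[7:0], line[2:0]} */
uint16_t text_mode_address(unsigned cb13_11, unsigned char_code, unsigned line)
{
    return (uint16_t)(((cb13_11 & 7u) << 11) |
                      ((char_code & 0xFFu) << 3) |
                      (line & 7u));
}
```

In other words, in bitmap mode every cell on screen owns eight consecutive bytes of bitmap data, whereas in text mode the eight bytes are looked up via the character code stored in screen memory.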

We also need to modify the code that shifts out the pixel data:

   always @(posedge clk_in)
   if (clk_counter == 7)
     pixel_shift_reg <= data_in;
   else begin
     if (screen_control_2[4] & (clk_counter[0]))
       pixel_shift_reg <= {pixel_shift_reg[5:0],2'b0};
     else if (!screen_control_2[4]) 
       pixel_shift_reg <= {pixel_shift_reg[6:0],1'b0};
   end


So, for multicolor mode we shift two bits at a time.

To create the actual multicolor pixel, we add the following statement:

always @*
  case (pixel_shift_reg[7:6])
    2'b00: multi_color = background_color;
    2'b01: multi_color = char_buffer_out_delayed[7:4];
    2'b10: multi_color = char_buffer_out_delayed[3:0];
    2'b11: multi_color = char_buffer_out_delayed[11:8];
  endcase


For bit combinations 01, 10 and 11 we use the values we previously retrieved from Color RAM and Screen RAM. It is therefore important that you buffer these values, so that they are available for the full 8 pixels.
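
To illustrate the color selection, here is a C sketch that decodes one byte of bitmap data into its four double-wide pixels. It assumes, in line with the case statement above, that the lower eight bits of char_buffer_out_delayed hold the Screen RAM byte and bits 11 to 8 hold the Color RAM nybble. The function name is made up for this example.

```c
#include <stdint.h>

/* Decode one byte of multicolor bitmap data into four color indexes,
 * leftmost pixel first. Each pixel is described by a pair of bits. */
void decode_multicolor_cell(uint8_t bitmap_byte, uint8_t screen_byte,
                            uint8_t color_ram_nybble, uint8_t background,
                            uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        uint8_t pair = (bitmap_byte >> (6 - 2 * i)) & 0x03;

        switch (pair) {
        case 0:  out[i] = background & 0x0F;       break; /* D021          */
        case 1:  out[i] = screen_byte >> 4;        break; /* upper nybble  */
        case 2:  out[i] = screen_byte & 0x0F;      break; /* lower nybble  */
        default: out[i] = color_ram_nybble & 0x0F; break; /* Color RAM     */
        }
    }
}
```

A bitmap byte of %00011011 therefore uses all four sources in turn: the background color, the upper nybble of Screen RAM, the lower nybble of Screen RAM and finally the Color RAM value.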

Creating the Test Bench

Our Test bench for the VIC-II will look very similar to our existing C64 module's interface with the VIC-II.

Make sure that both the main 64KB RAM and Color RAM are connected to the VIC-II module. Both mentioned memories should also contain the image data of the splash screen, as we discussed earlier on.

Next we should write an initialisation block for setting the VIC-II registers so that it can show the Splash Screen in Multicolor Bitmap mode:

initial begin
  #50 we_vic_ii = 1;
  addr_in = 6'h11;
  reg_data_in = 8'h30;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;
  
  #20; 
  we_vic_ii = 1;
  addr_in = 6'h20;
  reg_data_in = 8'he;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h21;
  reg_data_in = 8'h6;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h16;
  reg_data_in = 8'h10;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h18;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h20;
  reg_data_in = 8'hb;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h21;
  reg_data_in = 8'hd;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

end


Next, we should write an initial block for saving the pixel output of our VIC-II module to an image file:

initial begin  
  f = $fopen("/home/johan/result.ppm","w");
  $fwrite(f,"P3\n");
  $fwrite(f,"404 284\n");
  $fwrite(f,"255\n");
  i = 0;
  while (i < 114736) begin
    @(posedge clk)
    #2;
    if (!blank_signal)
    begin
      $fwrite(f,"%d\n", rgb[23:16]);
      $fwrite(f,"%d\n", rgb[15:8]);
      $fwrite(f,"%d\n", rgb[7:0]);
      i = i + 1;
    end
  end
  $fclose(f);
end

Here we create a PPM file, in which we store the pixel values in plain text. We precede the pixel data with the resolution of the image and the maximum value per color component.
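As a cross-check of the numbers used in the Test Bench, the following small C sketch builds the same plain-text PPM header and computes how many pixels the loop has to collect. The helper names are hypothetical and not part of the Test Bench itself.

```c
#include <stdio.h>

/* Format a plain-text (P3) PPM header into buf.
   Returns the number of characters written (excluding the terminator). */
int ppm_header(char *buf, size_t len, int width, int height)
{
    return snprintf(buf, len, "P3\n%d %d\n255\n", width, height);
}

/* Number of pixels that must follow the header: one RGB triple each. */
long ppm_pixel_count(int width, int height)
{
    return (long)width * (long)height;
}
```

For a 404 x 284 image the pixel count works out to 114736, which matches the limit of the while loop above.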

The Final Result

Here is a picture of the final simulated result:


This is a bit of motivation that we are on the right track.

In Summary

In this post we implemented the Multicolor Bitmap Mode within our VIC-II module. As a simulation test we checked whether our Test Bench could create the loading Splash screen of the game Dan Dare.

In the next post we will link up our modified VIC-II module to our real FPGA and see if the Splash screen will also be shown when loading the game from the .TAP image.

Till next time!

Monday, 15 July 2019

Adding Color RAM

Foreword

In the previous post we managed to emulate the flashing borders when the game Dan Dare loads on a C64.

Our next goal would be to display the Splash screen while the game loads. To do this we need to add some more functionality to our VIC-II core.

A fundamental block that is missing from our C64 design is Color RAM, which supplies the color for each character. Color RAM also plays an important role in rendering our Splash screen.

So, in this post we will implement Color RAM and integrate it into our VIC-II core.

Defining Color RAM as Block RAM

We start off by defining some Verilog code that will synthesise our color RAM into Block RAM elements:

...
    reg [3:0] color_ram [999:0];
    reg [3:0] color_ram_out;
...
     always @ (posedge clk_1_mhz)
       begin
        if (color_ram_write_enable) 
        begin
         color_ram[addr[9:0]] <= ram_in[3:0];
         color_ram_out <= ram_in[3:0];
        end
        else 
        begin
         color_ram_out <= color_ram[addr[9:0]];
        end 
       end 
...

We directly connect to the Data Out pin of our CPU via the ram_in wire.

Since the Color RAM only contains a thousand items, we only need to connect the lowest 10 bits of the address bus. This makes it crucial to only enable writing within the correct address range:

assign color_ram_write_enable = we & (addr >= 16'hd800 & addr < 16'hdbe8);
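To see why the write enable matters, here is a small C sketch of the same address decoding. The function names are made up for illustration. Because only the lowest 10 bits index the array, an unrelated address such as 0x0400 would land on the same cell as 0xD800 if the range check were left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* True for the 1000 Color RAM locations 0xD800..0xDBE7. */
bool is_color_ram_addr(uint16_t addr)
{
    return addr >= 0xD800 && addr < 0xDBE8;
}

/* Index into the 1000-entry array: the lowest 10 address bits. */
uint16_t color_ram_index(uint16_t addr)
{
    return (uint16_t)(addr & 0x03FF);
}
```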

Next, we should add a second port to our Color RAM for providing information to our VIC-II module:

...
    reg [3:0] color_ram_out2;
...
    always @ (posedge clk_2_mhz)
       begin
         color_ram_out2 <= color_ram[portb_add[9:0]]; 
       end 
...

Finally we need to hook up to the VIC-II module:

vic_ii vic_inst
    (
...
        .data_in({color_ram_out2,vic_combined_d}),
...
        );

The data in bus to a VIC-II is 12 bits wide and the upper 4 bits contain the color information.

Test Results

I did a quick test to see if our C64 module can render the colors stored in Color RAM correctly.

For this exercise I ran a couple of POKE commands to write color values directly to Color RAM.

The results are as follows:


In Summary

In this post we have added Color RAM to our C64 design and verified that our VIC-II module renders the colors correctly.

In the next post we will continue to add more functionality to our VIC-II module in order to display the splash screen.

Till next time!

Saturday, 13 July 2019

Flashing Borders

Foreword

In the previous post we managed to load the tape header from a .TAP file and display the name of the file found on the screen, with all these actions performed by the KERNEL ROM.

This basically proves that we have implemented our Tape module correctly. If one really wants to be nostalgic, one could actually hook up a 1530 datasette to our design, provided you get the level shifting right, and it should work.

Our C64 FPGA module in its current state doesn't really support the full graphical capabilities of the VIC-II. In fact, currently it can only render characters stored in the default memory location (e.g. addresses 1024 to 2023), with hard coded border and background colors, which are light and dark blue.

This means that if we load a classic game within our C64 module, it will probably load, but we will not be able to see the fancy colourful graphic effects.

The way to approach this problem is to pick a classic C64 game you like and gradually implement the graphic capabilities the game requires, till you can play the game perfectly.

The game I have picked is Dan Dare, Pilot of the Future.

We will start by implementing the graphical capabilities that the loading of the game Dan Dare requires.

When you load the game Dan Dare from tape, as with many other games of the era, you will be presented with flashing borders as well as a splash screen in multi color high resolution mode.

In this post we will look into implementing the flashing borders.

The display of the splash screen we will implement in the next post or two.

VIC-II register access

When the game Dan Dare loads, the effect of flashing borders is achieved by writing alternating colors in a rapid fashion to memory location 53280.

Memory location 53280 is indeed a register within the VIC-II. At this point I need to stop the discussion in its tracks: Our VIC-II module doesn't even provide register access at the moment!

So, let us start by implementing functionality within our C64 module for accessing registers from the VIC-II.

Firstly, we need to add some extra ports to our VIC-II module:

module vic_ii
(
  input clk_1_mhz,
...
  input [5:0] addr_in,
  input [7:0] reg_data_in,
  input we,
  output [7:0] data_out,
...
    );


For addr_in we take the lower 6 bits of the address output from our Arlet 6502 core.

For reg_data_in we are taking the Data Out, also from our Arlet 6502 core.

The data_out we need to add as an extra input to our data multiplexer, which we will discuss a bit later.

As mentioned, we are only using the lower six bits of the address bus. For reading this is not a problem at all. The multiplexer will simply ignore data from our VIC-II module if we didn't request data from the VIC-II module.

Writes are more of a problem. We only want to write to a VIC-II register if that was really the intent, of course. For this we need the we port, which we connect as follows:

vic_ii vic_inst
    (
...
        .we(we & (addr == 16'hd020 | addr == 16'hd021 | addr == 16'hd011)),
...
        );


For now we are only doing a write for three of the VIC-II registers: d020, d021 and d011. For the remaining registers, we are just going to write to main RAM. This just makes our life easier for now.

You will also notice that we are sending the 1 MHz clock signal to our VIC-II module. This is because the reads and writes will be performed by our 6502 module, which operates at 1 MHz.

You might recall that big chunks of our VIC-II module operate at 8 MHz, which in turn might make you wonder for a moment: two different clock speeds again, so do we again need to cater for clock domain crossing?

Luckily not! Remember that in our design the 1 MHz clock is not a pure 1 MHz clock with a 50% duty cycle. We derive our 1 MHz clock from the 8 MHz clock: out of every 8 cycles we mask out 7 and enable one.

This means that our 1 MHz pulse will always be in sync with an 8 MHz pulse, although our 1 MHz clock will have a weird duty cycle of 12.5%.
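The masking idea can be illustrated with a small C simulation: a 3-bit counter runs on every 8 MHz cycle and the 1 MHz enable is only active on one of its eight values. This is just a sketch; which counter value the real design enables on is an assumption here (7 is picked arbitrarily), and the function name is made up.

```c
#include <stdint.h>

/* Simulate n_cycles of the 8 MHz clock and count the cycles in which
   the derived 1 MHz enable would be active.
   Assumption: the enable is active when the 3-bit counter equals 7. */
unsigned count_enabled_cycles(unsigned n_cycles)
{
    uint8_t counter = 0;
    unsigned enabled = 0;
    for (unsigned i = 0; i < n_cycles; i++) {
        if (counter == 7)
            enabled++;                          /* one slot in eight passes through */
        counter = (uint8_t)((counter + 1) & 7); /* wrap after 8 cycles */
    }
    return enabled;
}
```

Since every enabled cycle is by construction also an 8 MHz cycle, the two clocks can never drift apart, which is why no synchronisation logic is needed between them.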

Finally, let us change our multiplexer logic to cater for reads from our VIC-II registers:

    always @*
        casex (addr_delayed)
...
          16'hd020, 16'hd021, 16'hd011: combined_d_out = vic_reg_data_out;
...
          default: combined_d_out = ram_out;
        endcase


Recall that we connect combined_d_out to the Data In on our Arlet 6502 core.

We have now given our CPU the capability to access VIC-II registers. We now need to implement the mentioned registers within the VIC-II.

Equipping our VIC-II with registers

Let us now implement the three registers within our VIC-II module:

...
reg [7:0] data_out_reg;
reg [7:0] screen_control_1 = 0;

reg [3:0] border_color = 0;
reg [3:0] background_color = 0;
...
assign screen_enabled = screen_control_1[4];

assign data_out = data_out_reg;

always @(posedge clk_1_mhz)
     case (addr_in)
       6'h20: data_out_reg <= {4'b0,border_color};
       6'h21: data_out_reg <= {4'b0,background_color};
       6'h11: data_out_reg <= screen_control_1;
     endcase
     
always @(posedge clk_1_mhz)
begin
  if (we & addr_in == 6'h20)
    border_color <= reg_data_in[3:0];
  else if (we & addr_in == 6'h21)  
    background_color <= reg_data_in[3:0];
  else if (we & addr_in == 6'h11)
    screen_control_1 <= reg_data_in[7:0];
end
...

We implement the register data_out_reg for reading purposes. In our C64 module reads are always delayed by one clock cycle, so the data_out_reg performs this task for us.

You might wonder why we have implemented the screen control register if, in this post, we only care about border and background color. This is just to cater for the scenario where the screen gets blanked, in which case the full screen gets covered with flashing borders.

Having each VIC-II register declared individually seems like quite a cumbersome process. One might be tempted to think that it would be easier to define all registers together as an array which will resolve to a Block RAM element during synthesis.

This is a very valid point. However, one needs to keep in mind that a Block RAM element can only allow up to two simultaneous memory access operations. The VIC-II needs more than two simultaneous accesses, just to mention a few:

  • Border color/Background color
  • Screen control
  • X raster pos
  • Y raster pos

It is therefore better to specify the registers separately.

Finally we need to use these registers for color generation:

   assign color_for_bit = pixel_shift_reg[7] == 1 ? 
             current_front_color : background_color;
   assign final_color = (visible_vert & visible_horiz & screen_enabled) ? 
             color_for_bit : border_color;


Let us quickly refresh our memory on the above snippet of code.

pixel_shift_reg is a byte shift register where we shift out a bit at a time. Where the current bit is zero, we output the background_color as the color to display.

We use final_color to alternate between the border color and color_for_bit where applicable. I have also added screen_enabled to the mix, where the screen gets blanked totally with the border color if display is disabled.
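The same selection logic can be written as a small C function, which makes it easy to check the individual cases by hand. This is only a software sketch of the two assign statements; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of color_for_bit and final_color. */
uint8_t select_final_color(bool pixel_bit, bool visible_vert,
                           bool visible_horiz, bool screen_enabled,
                           uint8_t front_color, uint8_t background_color,
                           uint8_t border_color)
{
    /* A set pixel bit shows the character color, a clear bit the background */
    uint8_t color_for_bit = pixel_bit ? front_color : background_color;

    /* Outside the display window, or with the screen blanked, show the border */
    bool in_display_window = visible_vert && visible_horiz && screen_enabled;
    return in_display_window ? color_for_bit : border_color;
}
```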

The need to disable KERNEL ROM

With the registers implemented, I still couldn't see the flashing borders.

I encountered a similar issue when I developed a C64 emulator in JavaScript.

When I read the above mentioned blog post, I remembered that the issue was that the Dan Dare loader overrides the IRQ vector at address FFFE/FFFF.

The problem comes in that one needs to disable the KERNEL ROM to expose this new vector from RAM, which I haven't implemented yet.

The switching in/out of ROMS from address space is controlled by register 1. We already worked with register 1 in the previous post where we had to implement motor control and tape button status.

I changed the logic a bit for reading and writing to register 1:

...
    reg [7:0] reg_1_6510 = 8'h37;

    assign motor_control = reg_1_6510[5];

    always @(posedge clk_1_mhz)
    if (we & (addr == 1))
      reg_1_6510 <= ram_in;
...
    always @*
        casex (addr_delayed)
          16'b1: combined_d_out = {reg_1_6510[7:5], tape_button, reg_1_6510[3:0]};
...
          16'b111x_xxxx_xxxx_xxxx : if (reg_1_6510[1])
                                      combined_d_out = kernel_out;
                                    else
                                      combined_d_out = ram_out;
...
        endcase
...

I have created an 8-bit register for memory location 1 called reg_1_6510. For starters, I am now feeding the motor control bit from bit 5 of this register.

We are keeping our eye on bit 1 of reg_1_6510. If this bit is 1, we return the contents of the KERNEL ROM; if it is 0, we return the contents of the underlying RAM.
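A minimal C sketch of this read path, assuming the ROM byte and the RAM byte at the requested address are already available, could look like this. The function and parameter names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pick the byte the CPU sees for a read.
   reg1 is the value of memory location 1 (reg_1_6510). */
uint8_t read_mux(uint16_t addr, uint8_t reg1,
                 uint8_t kernel_byte, uint8_t ram_byte)
{
    /* Same match as 16'b111x_xxxx_xxxx_xxxx: addresses E000-FFFF */
    bool in_kernel_range = (addr & 0xE000) == 0xE000;
    /* Bit 1 of location 1 switches the KERNEL ROM in (1) or out (0) */
    bool rom_enabled = ((reg1 >> 1) & 1) != 0;
    return (in_kernel_range && rom_enabled) ? kernel_byte : ram_byte;
}
```

With the power-on value 0x37 a read from FFFE returns the ROM byte. Once a loader clears bit 1, for example by writing 0x35, the same read returns whatever vector was stored in RAM.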

The End Result

The following video shows the end result when we load the game Dan Dare from a .TAP file. Flashing borders appear:


In Summary

In this post we gave our C64 module the capability to change the border and background color.

This enabled us to emulate the flashing borders when we load the game Dan Dare.

In the next post we will do some development that will enable us to see the Splash screen during the loading process.

Till next time!

Wednesday, 19 June 2019

Integrating Tape Interface with C64 Module: Part 5

Foreword

In the previous post we implemented interrupts within our CIA module.

At last we are ready to integrate our Tape module into our C64 module, which we will cover in this post.

We will be using the tape image for the game Dan Dare.

The ultimate goal for this post doesn't sound so exciting: to be able to find the header on the mentioned tape image and show Found Dan Dare.

We will continue using the tape image of Dan Dare in coming posts, working our way through first getting the flashing borders and splash loading screen to display, until we reach the point where we can actually play the game.

FLAG1 as an edge triggered interrupt

If we were to take the output of our Tape module and connect it directly to the FLAG1 input of our CIA module, our FLAG1 will basically act as a level interrupt.

On a conventional CIA chip, however, all interrupts are edge triggered interrupts. We therefore need to make a small modification, so that our FLAG1 input behaves as an edge interrupt:

...
reg flag1_delayed;
...
    cia cia_1(
          .port_a_out(keyboard_control),
          .port_b_in(keyboard_result),
          .addr(addr[3:0]),
          .we(we & (addr[15:8] == 8'hdc)),
          .clk(clk_1_mhz),
          .chip_select(addr[15:8] == 8'hdc),
          .data_in(ram_in),
          .data_out(cia_1_data_out),
          .flag1(flag1 & !flag1_delayed),
          .irq(irq)
            );

    always @(posedge clk_1_mhz)
      flag1_delayed <= flag1;
...
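The expression flag1 & !flag1_delayed is a classic rising edge detector. The following C sketch models it one clock cycle at a time; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Holds the value of flag1 from the previous clock cycle. */
typedef struct {
    bool delayed;
} edge_detector;

/* Call once per 1 MHz clock cycle.
   Returns true only in the cycle where flag1 goes from 0 to 1. */
bool edge_detect_step(edge_detector *d, bool flag1)
{
    bool pulse = flag1 && !d->delayed;
    d->delayed = flag1;   /* models flag1_delayed <= flag1 */
    return pulse;
}
```

Feeding it the sequence 0, 1, 1, 0, 1 gives a pulse on the second and the fifth step only, so a level that stays high no longer keeps retriggering the interrupt.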

Tape Button and motor control

If you ever loaded a game from a tape on a real C64, you will know that the C64 has control over the Cassette motor: The tape briefly pauses when a file header is found, as well as when the file has finished loading.

How can the C64 control the tape motor in this way? The answer is via bits 4 and 5 from memory location 1. Quoted from this:


  • Bit #4: Datasette button status; 0 = One or more of PLAY, RECORD, F.FWD or REW pressed; 1 = No button is pressed.
  • Bit #5: Datasette motor control; 0 = On; 1 = Off.

In order for our C64 module to emulate tape loading, we also need to emulate the above mentioned bits.

We will use the ~/` button on the USB keyboard to represent the play button for Bit 4. We will hook up this key to our C64 in a coming section.

To serve the above mentioned bits, we need to define an input port and an output port on our c64 module:

module c64(
...
  input wire tape_button,
  output reg motor_control = 1,
...
    );


The motor control bit gets set as follows:

    always @(posedge clk_1_mhz)
    if (we & (addr == 1))
      motor_control <= ram_in[5];


To cater for reads for location 1, we need to adjust our memory read case statement as follows:

    always @*
        casex (addr_delayed)
          16'b1: combined_d_out = {ram_out[7:6], motor_control, tape_button, ram_out[3:0]};
          16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
          16'b111x_xxxx_xxxx_xxxx : combined_d_out = kernel_out;
          16'hd012: combined_d_out = line_counter;
          16'hdcxx: combined_d_out = cia_1_data_out;
          default: combined_d_out = ram_out;
        endcase
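The concatenation used for location 1 can be mirrored in C with a mask and two shifts, which may make the bit positions easier to verify. This is a sketch only and the function name is made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of {ram_out[7:6], motor_control, tape_button, ram_out[3:0]}. */
uint8_t read_location_1(uint8_t ram_value, bool motor_control,
                        bool tape_button)
{
    uint8_t value = ram_value & 0xCF;                 /* keep bits 7:6 and 3:0 */
    value |= (uint8_t)((motor_control ? 1 : 0) << 5); /* bit 5: 1 = motor off */
    value |= (uint8_t)((tape_button ? 1 : 0) << 4);   /* bit 4: 1 = no button pressed */
    return value;
}
```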

We have now defined our motor control functionality within our C64 module, but we still need to route it to our Tape module:

module tape(
  input clk,
  input clk_1_mhz,
  input restart,
  input reset,
  output [31:0] ip2bus_mst_addr,
  output [11:0] ip2bus_mst_length,
  input [31:0] ip2bus_mstrd_d,
  output [4:0] ip2bus_inputs,
  input [5:0] ip2bus_otputs,
  input motor_control,
  output pwm
    );
...
tape_pwm t_pwm(
                  .time_val(timer_val),
                  .load_timer(load_timer),
                  .pwm(pwm),
                  .motor_control(motor_control),
                  .clk(clk_1_mhz)
                    );
..

In our pwm module we basically want to pause the generation of pwm pulses when the motor control bit is high:

...
  always @(posedge clk)
  if (timer > 0 & !motor_control)
    timer <= timer - 1;
  else if (timer == 0)
    timer <= load;
..
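To spell out the three cases of this timer update, here is the same logic as a pure C function that returns the next timer value. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Next value of the pulse timer after one clock cycle.
   motor_control follows the C64 convention: 1 = motor off. */
uint32_t timer_step(uint32_t timer, uint32_t load, bool motor_control)
{
    if (timer > 0 && !motor_control)
        return timer - 1;   /* motor running: keep counting down */
    if (timer == 0)
        return load;        /* pulse finished: load the next pulse length */
    return timer;           /* motor off: hold the current value */
}
```

Note that the hold case is implicit in the Verilog: when neither branch fires, the register simply keeps its value.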

Assigning a keyboard button as play button

I mentioned earlier that I want to use the ~/` key on the USB keyboard as our play button.

You will recall from a previous post that we interface with a USB keyboard through a minimalistic USB protocol stack, where we convert USB key scan codes to C64 scan codes and then set the appropriate bit in a key array linked to our C64 module.

We start by adding a mapping for the ~ key to the big if-else statement:

u32 mapUsbToC64(int usbCode) {
 if (usbCode == 0x4) { //A
  return 0xa;
 } else if (usbCode == 0x5) { //B
  return 0x1c;
 } else if (usbCode == 0x6) { //C
  return 0x14;
...
...
 } else if (usbCode == 0x2c) { //space
  return 0x3c;
 } else if (usbCode == 53) { //play key `~
  return 100;
 }


}


The USB key scan code for the required key is 53 and we map this code to 100.

But C64 scan codes only go up to 63, so what is with the 100?

The point is that the play key is not actually a key in the key bit array that we want to set. We rather use it as a special value:

void getC64Words(u32 usbWord0, u32 usbWord1, u32 *c64Word0, u32 *c64Word1) {
  *c64Word0 = 0;
  *c64Word1 = 0;

  if (usbWord0 & 2) {
   *c64Word0 = 0x8000;
  }

  usbWord0 = usbWord0 >> 16;

  for (int i = 0; i < 2; i++) {
   int current = usbWord0 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     }

   }

   usbWord0 = usbWord0 >> 8;
  }

  for (int i = 0; i < 4; i++) {
   int current = usbWord1 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     }

   }

   usbWord1 = usbWord1 >> 8;
  }

}


So if, during the conversion process from USB to C64 scan code, we encounter a 100, we write the value zero to memory location 0x43c00008.

Remember that 0x43c00008 is a register mapped to our AXI slave interface. Just to recap from a previous post:

0x43c0_0000: key array 1
0x43c0_0004: key array 2
0x43c0_0008: tape control:
                bit 0: PWM bit (READ ONLY). For debug
                bit 1: Reset tape position to address 0x238270 in memory

For register 0x43c0_0008 we will introduce an extra function for bit 2: Tape button. Our Slave AXI block needs to be modified to accommodate this bit so it can be surfaced to the tape_button input of our C64 module.

Test Run

Time for a test run.

As usual you will open Xilinx SDK, program the FPGA and start our C program to run on the ARM processor. In our case the C program will run a mini USB stack, capturing keystrokes and forwarding them to our C64 module.

With the C program running, we need to open the XSCT console, and execute a couple of commands.

First we need to pause our Tape module and reset the read address:

mwr 0x43c00008 6

Next we should load the tape image into memory:

mwr -size b -bin -file "/home/johan/Downloads/Dan Dare.tap" 0x238250 600000

Next we release the reset bit:

mwr 0x43c00008 4
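The values 6 and 4 written above, and the 0 written when the play key is detected, can be read as combinations of the control bits of register 0x43c0_0008. The C sketch below builds these values from named bits. The macro and function names are made up, and it assumes that the tape button bit follows the same convention as bit 4 of memory location 1, where 1 means no button is pressed.

```c
#include <stdbool.h>
#include <stdint.h>

#define TAPE_CTRL_RESET  (1u << 1)  /* reset tape read position */
#define TAPE_CTRL_BUTTON (1u << 2)  /* assumed: 1 = no button pressed */

/* Build the value to write to the tape control register. */
uint32_t tape_ctrl_value(bool reset, bool button_released)
{
    uint32_t value = 0;
    if (reset)
        value |= TAPE_CTRL_RESET;
    if (button_released)
        value |= TAPE_CTRL_BUTTON;
    return value;
}
```

Under that assumption, 6 holds the tape in reset with no button pressed, 4 releases the reset, and 0 signals that play has been pressed.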

You can now move to the USB keyboard and follow the conventional process for kicking off the loading of a tape on a C64. That is, type LOAD and, when prompted to press play on tape, just press ~.

Currently our VIC-II doesn't support screen blanking, as you would see when loading a program from tape. However, we can see the prompts as something is loading:


In Summary

In this post we finally integrated our Tape Module into our C64 Module.

As a test we managed to load a file header from a .tap file and display the file name.

In the next post we will implement the necessary functionality for displaying the loading effects while loading the game Dan Dare.

Till next time!