C64 on an FPGA: September 2019

Tuesday, 24 September 2019

Raster Interrupts and Multicolor Textmode

Foreword

In the previous post we fixed the blank screen by applying the necessary clock constraints.

In this post we will focus on implementing Raster Interrupts and Multicolor Textmode.

Implementing Raster Interrupts

To implement Raster Interrupts, let us start by looking at all the registers that are involved with Raster Interrupts:

Bit 7 of D011: This is bit 8 of the Raster counter
D012: This is bit 7 - 0 of the Raster counter
Bit 0 of D019: Interrupt status bit of the Raster Interrupt
Bit 0 of D01A: Enable bit for the Raster Interrupt

When we read the bits of the Raster counter, we are basically returning the value of the y_pos register. So, we can implement the read operation as follows:

always @(posedge clk_1_mhz)
     case (addr_in)
...
       6'h11: data_out_reg <= {y_pos[8],screen_control_1[6:0]};
       6'h12: data_out_reg <= {y_pos[7:0]};
...
     endcase

When we write to the Raster counter bits, we are writing to a compare value register. The VIC-II compares this register against the Raster count to decide when it is time to trigger a Raster Interrupt.

We implement the writing to this register as follows:

...
reg [7:0] rasterline_ref = 0;
...
always @(posedge clk_1_mhz)
begin
...
  else if (we & addr_in == 6'h12)
    rasterline_ref <= reg_data_in[7:0];
...
end
...

For bit 8 of the raster compare register we just use bit 7 of the screen_control_1 register, and we combine the two whenever we need the full 9-bit value.

The first part of implementing the Raster Interrupt would be to implement the compare operation:

assign is_equal_raster = {screen_control_1[7],rasterline_ref} == y_pos_real[8:0];

Next, let us implement the Status register for the Raster interrupt:

...
reg raster_int = 0;
...
always @(posedge clk_1_mhz)
if (we & (addr_in == 6'h19))
  raster_int <= raster_int & ~reg_data_in[0];
else
  raster_int <= raster_int | (is_equal_raster);
...
always @(posedge clk_1_mhz)
     case (addr_in)
...
       6'h19: data_out_reg <= {7'h0,raster_int};
...
     endcase
...

As you can see in the code above, to clear a raster interrupt that occurred, you would write a one to bit 0 of address D019.

There is a small anomaly with the code we have just written. is_equal_raster will be one for the full duration of the line. This means that as soon as we clear the interrupt in raster_int, it will simply be set again at the next clock cycle.

To fix this, we need to set the interrupt only on the edge transition of is_equal_raster:

always @(posedge clk_1_mhz)
is_equal_raster_delayed <= is_equal_raster;

always @(posedge clk_1_mhz)
if (we & (addr_in == 6'h19))
  raster_int <= raster_int & ~reg_data_in[0];
else
  raster_int <= raster_int | (is_equal_raster & !is_equal_raster_delayed);
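The difference between the level-sensitive and the edge-sensitive version can be tried out in software. The following is only a minimal C sketch of the latch behaviour, not part of the FPGA design; the type raster_latch_t and the functions raster_latch_step() and retriggers_within_line() are names made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the raster interrupt latch. */
typedef struct {
    bool raster_int;  /* latched status, bit 0 of D019 */
    bool equal_prev;  /* is_equal_raster as seen on the previous cycle */
} raster_latch_t;

/* Advance the model by one 1 MHz cycle.
 * is_equal:    result of the raster compare during this cycle
 * write_d019:  true if the CPU writes to D019 during this cycle
 * data:        the byte written to D019
 * edge_detect: false models the first version, true the fixed version */
static void raster_latch_step(raster_latch_t *s, bool is_equal,
                              bool write_d019, uint8_t data, bool edge_detect)
{
    /* Use the previous-cycle value, like the non-blocking assignment does. */
    bool set = edge_detect ? (is_equal && !s->equal_prev) : is_equal;

    if (write_d019)
        s->raster_int = s->raster_int && !(data & 1);  /* write 1 to clear */
    else
        s->raster_int = s->raster_int || set;

    s->equal_prev = is_equal;
}

/* Acknowledge the interrupt while still on the matching line and report
 * whether the flag comes back one cycle later. */
static bool retriggers_within_line(bool edge_detect)
{
    raster_latch_t s = { false, false };
    raster_latch_step(&s, true, false, 0, edge_detect);  /* line starts      */
    raster_latch_step(&s, true, true, 1, edge_detect);   /* handler acks     */
    raster_latch_step(&s, true, false, 0, edge_detect);  /* still same line  */
    return s.raster_int;
}
```

Calling retriggers_within_line(false) models the first version, where the flag is set again while the line is still active. retriggers_within_line(true) models the edge-triggered version, where the flag stays cleared.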

Next, we should output this interrupt from our VIC-II module, gated via an interrupt enable bit:

module vic_test_3
(
...
  output irq, 
...
    );
...
reg [7:0] int_enabled = 0;
...
assign irq = raster_int & int_enabled[0];
...
always @(posedge clk_1_mhz)
     case (addr_in)
...
       6'h1a: data_out_reg <= int_enabled;
     endcase
...
always @(posedge clk_1_mhz)
begin
...
  else if (we & addr_in == 6'h1a)
      int_enabled <= reg_data_in[7:0];
end
...

This interrupt we will hook up to the interrupt port of our CPU module. Since we already have the IRQ of CIA#1 connected to this port, we combine both together with an or operation:

cpu mycpu ( clk_1_mhz, c64_reset, addr, combined_d_out, ram_in, 
                                       we_raw, (irq || irq_vic), 1'b0, 1'b1 );

Multicolor TextMode

In a previous post we implemented Multicolor Bitmap Mode. So, for a start, let us summarise the colors of a pixel pair for both Multicolor Bitmap Mode and Multicolor Text Mode.

For Multicolor Bitmap mode the colors are as follows:

00: Color stored at location D021
01: Bits 4-7 of the associated code in Screen memory
10: Bits 0-3 of the associated code in Screen memory
11: Associated value in Color RAM

For Multicolor Text Mode the colors are as follows:

00: Color stored in location D021
01: Color stored in location D022
10: Color stored in location D023
11: Associated value in Color RAM

There is something else we should be aware of for bit value 11 in Multicolor Text mode. In this scenario we can only use bits 0 - 2 of the Color RAM value as a color code, limiting us to color codes 0 - 7 in Multicolor Text mode.

Why can't we use bit 3 from Color RAM in Multicolor Text mode? In Multicolor Text mode bit 3 has a very interesting function. If we set bit 3 to a one, it means that we really want to use Multicolor Text mode for this character cell.

I am sure many of you have burst out laughing when reading the previous paragraph :-) Why would you need to double confirm when you want to draw a character cell in Multicolor Text mode? This is actually a nice feature if you want to draw some characters in normal text mode, giving your picture a bit more detail where required.

For now we are just going to keep this fact about bit 3 of Color RAM in the back of our heads till a bit later.

For the two Multicolor Modes we can create two combinational logic blocks for outputting the relevant color based on the pixel pair value:

always @*
  case (pixel_shift_reg[7:6])
    2'b00: multi_color_bitmap_mode = background_color;
    2'b01: multi_color_bitmap_mode = char_buffer_out_delayed[7:4];
    2'b10: multi_color_bitmap_mode = char_buffer_out_delayed[3:0];
    2'b11: multi_color_bitmap_mode = char_buffer_out_delayed[11:8];
  endcase

always @*
  case (pixel_shift_reg[7:6])
    2'b00: multi_color_text_mode = background_color;
    2'b01: multi_color_text_mode = extra_background_color_1;
    2'b10: multi_color_text_mode = extra_background_color_2;
    2'b11: multi_color_text_mode = {1'b0, char_buffer_out_delayed[10:8]};
  endcase

Please note, as explained earlier, we are outputting only bits 2 - 0 of the Color RAM for bit value 11 when in Multicolor TextMode.

We can now combine the two color values, depending on whether we are in text mode or bitmap mode:

assign multi_color = screen_control_1[5] ? multi_color_bitmap_mode : 
                       multi_color_text_mode;

We are just about done. However, we still need to consider the scenario of bit 3 of Color RAM when we are in Multicolor Text Mode. This is just an extension of the check of whether we are in Multicolor mode for the current character cell:

assign multicolor_data = screen_control_2[4] && 
                !(!char_buffer_out_delayed[11] && !screen_control_1[5]);

We use this value as follows:

...
   always @(posedge clk_in)
   if (clk_counter == 7)
     pixel_shift_reg <= data_in;
   else begin
     if (multicolor_data & (clk_counter[0]))
       pixel_shift_reg <= {pixel_shift_reg[5:0],2'b0};
     else if (!multicolor_data) 
       pixel_shift_reg <= {pixel_shift_reg[6:0],1'b0};
   end
...
   assign color_for_bit = multicolor_data ? multi_color :    
            (pixel_shift_reg[7] == 1 ? char_buffer_out_delayed[11:8] : background_color);
...

This concludes the code we need to write for implementing Multicolor Text Mode.
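To check the color selection rules in isolation, here is a small C sketch of the same decisions. It is only a software model under my own naming: vic_colors_t, cell_is_multicolor(), mc_text_color() and hires_text_color() do not exist in the Verilog design, and the parameters simply stand in for the registers and the Color RAM nibble discussed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative container for the three background color registers. */
typedef struct {
    uint8_t d021;  /* background color       */
    uint8_t d022;  /* extra background color 1 */
    uint8_t d023;  /* extra background color 2 */
} vic_colors_t;

/* A cell is drawn in multicolor when the multicolor bit (bit 4 of D016) is
 * set and, for text mode, bit 3 of the Color RAM nibble is set as well. */
static bool cell_is_multicolor(bool mcm, bool bitmap_mode, uint8_t color_ram)
{
    return mcm && (bitmap_mode || (color_ram & 0x08));
}

/* Color of a pixel pair (0..3) in Multicolor Text mode. */
static uint8_t mc_text_color(const vic_colors_t *c, uint8_t pair,
                             uint8_t color_ram)
{
    switch (pair & 3) {
    case 0:  return c->d021 & 0x0f;
    case 1:  return c->d022 & 0x0f;
    case 2:  return c->d023 & 0x0f;
    default: return color_ram & 0x07;  /* only bits 2 - 0 are a color */
    }
}

/* Color of a single pixel when the cell falls back to normal text mode. */
static uint8_t hires_text_color(const vic_colors_t *c, bool bit,
                                uint8_t color_ram)
{
    return bit ? (color_ram & 0x0f) : (c->d021 & 0x0f);
}
```

With a Color RAM value of 0x0f, for instance, the cell is treated as multicolor and pixel pair 11 yields color code 7, since bit 3 is consumed as the mode flag.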

Test Results

Here is a screenshot of the start of the game with all the changes described in this post applied:

We are just about there. The only strange thing is a band of random pixels above the Dan Dare title bar.

These random characters are caused by changing character maps a couple of Raster lines too early. So, the raster counts of our VIC-II module are a bit different from those of a real VIC-II.

It can become quite an exercise to troubleshoot the difference. For now I am only going to fiddle with the Raster count offset till the image looks ok.

The following snippet of code will subtract a given offset from the Raster count:

wire [9:0] y_pos_minus_offset;
wire [9:0] y_pos_real;

assign y_pos_minus_offset = {1'b0,y_pos} - 5;
assign y_pos_real = (y_pos_minus_offset > 400) ? (10'd312 + y_pos_minus_offset) : 
          y_pos_minus_offset;

In this snippet of code we have chosen an offset value of 5 (which was actually my final attempt, giving the correct result). In this code we also cater for the scenario where we wrap around, in which case subtracting 5 will yield a two's complement negative number.

To deal with this two's complement manipulation I have also added an extra bit to both y_pos_minus_offset and y_pos_real to avoid overflow conditions.
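The wrap-around arithmetic is easy to get wrong, so it can help to replay it in software. Below is a minimal C sketch that mirrors the 10-bit arithmetic of the two assign statements; the function name y_pos_real() and the two constants are just my labels for this illustration, with 312 being the line count used in the Verilog snippet.

```c
#include <stdint.h>

#define LINES_PER_FRAME 312u  /* line count used in the Verilog snippet */
#define RASTER_OFFSET     5u  /* offset chosen in the post */

/* Mirror of the 10-bit y_pos_minus_offset / y_pos_real logic. */
static uint16_t y_pos_real(uint16_t y_pos)
{
    /* 10-bit subtraction: for y_pos < 5 this wraps to a value near 1023. */
    uint16_t minus = (uint16_t)((y_pos - RASTER_OFFSET) & 0x3ffu);

    /* Anything above 400 can only be a wrapped (negative) result, so add
     * one frame's worth of lines and truncate to 10 bits again. */
    if (minus > 400u)
        minus = (uint16_t)((LINES_PER_FRAME + minus) & 0x3ffu);

    return minus;
}
```

For example, raster line 0 maps to line 307, line 4 maps to line 311, and line 5 maps to line 0, which is the wrap-around behaviour we are after.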

You will then use y_pos_real in all places where you compare raster counts or where the 6502 reads raster counts:

...
assign is_equal_raster = {screen_control_1[7],rasterline_ref} == y_pos_real[8:0];

always @(posedge clk_1_mhz)
     case (addr_in)
...
       6'h11: data_out_reg <= {y_pos_real[8],screen_control_1[6:0]};
       6'h12: data_out_reg <= {y_pos_real[7:0]};
...
     endcase

With this quick fix, our game screen renders correctly.

The following video shows a quick tour where we walk through three screens and fight with a Treen:

Our characters are still invisible because we haven't implemented sprites yet, but at least we see the messages popping up when we encounter the enemy!

In Summary

In this post we have implemented Raster interrupts and Multicolor text mode.

This enabled us to render the background screen of the game properly, as well as to move around between screens.

Our characters are still invisible because we haven't implemented sprites yet.

In the next post we will implement sprites so that our characters can appear!

Till next time!

Saturday, 21 September 2019

Fixing a Blank Screen

Foreword

In the previous post we added joystick control to our C64 FPGA module, which enabled us to start the game from the Intro screen.

The game, however, started frozen and the screen looked garbled.

To fix this frozen garbled screen, my next goal was to implement Raster interrupts, as well as multicolor text mode.

While trying to implement Raster Interrupts and Multicolor Text mode, I was presented with a nasty surprise when the Zybo board started up with our C64 design: A Blank screen!

This Blank Screen proved indeed to be a challenge and a half to debug.

The Take Home from this exercise proved to be quite an important one for general FPGA design, so I decided to dedicate this post to how this issue was resolved.

Background

You might recall once (quite a number of posts back), that our 6502 core just started crashing without any apparent reason.

After some careful investigation, I found that this crash was caused by a newly implemented clock divider for generating a 1MHz signal from an 8MHz signal by means of a binary counter.

The quirk of this matter was that as soon as you use a binary counter as a clock source, you should ensure you implement the correct constraints in your design. If you don't, you might end up with a noisy, spiky clock signal causing all kinds of unexpected behaviour.

Since I fixed the issue with the necessary constraints, I never really had the same issue again while I was adding more functionality to the design. That is, up until now.

I was making good progress with Raster interrupts until I was greeted by a blank screen while testing some of my changes on the Zybo board.

From similar issues in the past with our C64 FPGA design, a blank screen tells me that our 6502 module crashed earlier in the process, which in this case was caused again by a clocking constraint violation.

The Solution

I am going to try and explain how I managed to isolate this issue. Before we do, let us just recap how I solved the clock divider issue originally, so that the surrounding technical details make a bit more sense.

Here is the snippet of code for generating a 1MHz clock and a 2MHz clock in our design:

...
reg [2:0] clk_div_counter = 0;
...

    always @(posedge clk)
      clk_div_counter <= clk_div_counter + 1; 

    always @(negedge clk)
      clk_1_enable <= (clk_div_counter == 7);
...
    always @(negedge clk)
      clk_2_enable <= (clk_div_counter == 2) | (clk_div_counter == 6) ;

       BUFGCE BUFGCE_1_mhz (
       .O(clk_1_mhz),   // 1-bit output: Clock output
       .CE(clk_1_enable), // 1-bit input: Clock enable input for I0
       .I(clk)    // 1-bit input: Primary clock
    );

       BUFGCE BUFGCE_2_mhz (
       .O(clk_2_mhz),   // 1-bit output: Clock output
       .CE(clk_2_enable), // 1-bit input: Clock enable input for I0
       .I(clk)    // 1-bit input: Primary clock
    );
...

From the clk_div_counter, we are creating clock enable signals that drive a BUFGCE block, which is essentially a buffer.

A key feature of these two buffers is that they are physically close to the clocking circuits on the FPGA die. Also, the clock output pulse from these buffers can drive quite a number of flip-flops, while maintaining a clean waveform.
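To see how many pulses each gated clock produces, the enable decoding can be replayed in software. The following is only a rough C sketch under simplifying assumptions: it ignores the negative-edge sampling of the enable registers and the buffers themselves, and the helper names clk_1_enable(), clk_2_enable() and count_pulses() are mine, chosen to match the register names in the snippet.

```c
#include <stdint.h>

/* Enable decoding, mirroring the comparisons on clk_div_counter. */
static int clk_1_enable(uint8_t counter)
{
    return (counter & 7) == 7;
}

static int clk_2_enable(uint8_t counter)
{
    return (counter & 7) == 2 || (counter & 7) == 6;
}

/* Count how many gated clock pulses pass during a number of cycles of
 * the primary 8MHz clock, with a free-running 3-bit counter. */
static unsigned count_pulses(int (*enable)(uint8_t), unsigned master_cycles)
{
    unsigned pulses = 0;
    uint8_t counter = 0;

    for (unsigned i = 0; i < master_cycles; i++) {
        if (enable(counter))
            pulses++;
        counter = (uint8_t)((counter + 1) & 7);  /* wraps like reg [2:0] */
    }
    return pulses;
}
```

Over 8 cycles of the primary clock this gives one pulse for the 1MHz enable and two pulses for the 2MHz enable, which matches the intended divide ratios.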

Apart from this code, we still need to define some constraints:

create_generated_clock -name clkdiv1 -source [get_pins design_1_i/block_test_0/inst/BUFGCE_1_mhz/I0] \
                       -edges {1 2 17} [get_pins design_1_i/block_test_0/inst/BUFGCE_1_mhz/O]
create_generated_clock -name clkdiv2 -source [get_pins design_1_i/block_test_0/inst/BUFGCE_2_mhz/I0] \
                       -edges {7 8 15} [get_pins design_1_i/block_test_0/inst/BUFGCE_2_mhz/O]

So, where did things go wrong in our existing design? It all becomes clear when you have a look at the schematic view of the Synthesised design within Vivado. Have a look at the following extract of the schematic:

This corresponds more or less to the Verilog code shown earlier. Needless to say, both the registers clk_1_enable and clk_2_enable ended up as flip-flops, which can be quickly identified by the triangle next to the C input.

Both the Data inputs are fed from a bit of the clk_div_counter. This is where things start to get interesting. If we follow the schematic, we see that there is not really a direct path from clk_div_counter to the enable registers.

Instead, clk_div_counter enters the VIC-II module first, and goes past quite a number of Flip-flop and LUT elements. This signal then eventually exits the VIC-II module and it is only at this point that it enters the enable registers.

In short, it is quite a long path between clk_div_counter and the enable registers. This is not ideal, considering that we want to generate clock signals.

The question is: How do we shorten the path between clk_div_counter and the enable registers? The short answer is that we should create a duplicate register of clk_div_counter that serves just the enable registers.

During optimisation, however, Vivado might remove our duplicate register and we will be back at square one. So, we need a way to tell Vivado to keep our duplicate register.

There is indeed an attribute we can use to keep our duplicate register declaration called equivalent_register_removal. We would use this attribute as follows:

...
(* equivalent_register_removal = "no" *)
    reg [2:0] clk_div_counter = 0;
(* equivalent_register_removal = "no" *)
    reg [2:0] clk_div_counter_cycle = 0;
...
    always @(posedge clk)
      clk_div_counter <= clk_div_counter + 1; 

    always @(posedge clk)
      clk_div_counter_cycle <= clk_div_counter_cycle + 1; 
...

So, we would use one register for our clock divider and the other one for our VIC-II module.

This would solve our blank screen problem.

In Summary

Originally I intended for this post to implement Raster Interrupts as well as Multicolor text mode.

During the implementation of this functionality, I was faced with a blank screen when the Zybo started up with our C64 design.

Troubleshooting this Blank Screen proved to be quite a challenge, so I decided to rather dedicate this post to what the underlying problem was.

All in all this Blank screen was caused by timing constraints for our clock divider.

In the next post we will finally get to the implementation of Raster interrupts and Multicolor text mode.

Till next time!

Tuesday, 10 September 2019

Adding joystick control

Foreword

In the previous post we managed to display the Intro screen for our game Dan Dare within our C64 FPGA implementation.

As with many other C64 games, to actually start the game you need to press fire on a joystick. Since our emulator doesn't feature any joystick at the moment, the purpose of this post will be to add functionality to emulate a joystick.

For the joystick we will just use the Numeric Keypad on the USB keyboard attached to the Zybo Board.

We will also just be focusing on implementing joystick port #2 of the C64, since this is the port the game Dan Dare uses.

How Joystick port#2 is wired to the C64

A good start for this post would be to see how Joystick port#2 is connected on a real C64.

The following snippet of a schematic is from http://www.zimmers.net:

As you can see, joystick port#2 shares wires on Port A of CIA#1 with the keyboard.

This setup of shared wires between joystick and keyboard immediately reminds us of an anomaly of joystick port#1, where moving the joystick also types characters on the screen.

One might wonder why Joystick port#2 doesn't have the same effect. The answer is that we read the keyboard from port B on the CIA, which is connected to the row pins of the keyboard connector.

With no key being pressed on the keyboard, all pins would just remain high on port B of CIA#1. This is obviously bypassed by Joystick #1, which is also connected to port B of CIA#1 and can pull selected lines down to zero, which the C64 will read as key presses.

Pulling down selected lines via Joystick port#2 will not have the same effect. With no keys being pressed on the keyboard, these pulled down lines would simply not propagate to port B of CIA#1.
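The reasoning above can be captured in a small software model of the keyboard matrix. This is only an illustrative C sketch, not code from our design: the function read_port_b() and its parameters are names I made up, and the model treats every line as a simple active-low wire.

```c
#include <stdint.h>

/* Hypothetical model of what the CPU reads on port B of CIA#1.
 * port_a_driven: value the CIA drives on port A (0 = line pulled low)
 * keys[col]:     bitmask of rows with a pressed key on that port A line
 * joy1_low:      port B lines grounded by joystick #1 (1 = grounded)
 * joy2_low:      port A lines grounded by joystick #2 (1 = grounded) */
static uint8_t read_port_b(uint8_t port_a_driven, const uint8_t keys[8],
                           uint8_t joy1_low, uint8_t joy2_low)
{
    /* A port A line is low if the CIA drives it low or joystick #2
     * grounds it. */
    uint8_t port_a = (uint8_t)(port_a_driven & ~joy2_low);
    uint8_t port_b = 0xff;  /* all lines float high by default */

    for (int col = 0; col < 8; col++) {
        /* Only a pressed key connects a low port A line to port B. */
        if (!(port_a & (1u << col)))
            port_b &= (uint8_t)~keys[col];
    }

    /* Joystick #1 grounds port B lines directly, key or no key. */
    port_b &= (uint8_t)~joy1_low;
    return port_b;
}
```

With no keys pressed, grounding a line via joystick #2 leaves port B at 0xff, while grounding the same line via joystick #1 shows up immediately as a low bit on port B.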

Implementing Joystick port#2 in our C64 module

In the previous section we mentioned the concept of pulling low a line on either port A or port B on CIA#1. Thus, the keyboard and Joysticks on the C64 follow the philosophy of active when low.

Another feature of port A and port B of CIA#1 is that each pin of those ports is bidirectional.

This leaves us with the question: How do you implement a bidirectional pin in an FPGA?

One might think: Sure, instead of declaring a port pin on a module as either input or output, you can just declare the bidirectional port as inout.

You can indeed create a Verilog module with inout ports. However, as soon as you try to connect these ports to other Verilog modules in your design, you might end up running in circles.

This is because inout pins are really only meant for pins going to the outside world, for instance if you want to implement an I2C port on your FPGA.

The FPGA synthesis tools don't like it at all if you try to utilise inout ports for internal use.

So in our CIA module we would need to split our bidirectional ports into two separate ports each:

module cia(
  output [7:0] port_a_out,
  input [7:0] port_a_in,
  output [7:0] port_b_out,
  input [7:0] port_b_in,
...
    );
...

Next, we need to make a small adjustment when we read from Port A or B:

...
  always @(posedge clk)
  if(!we)
  case (addr)
    0: data_out <=  ~((~slave_reg_0 & slave_reg_2) | 
                     ~port_a_in);
                   
    1: data_out <= ~((~slave_reg_1 & slave_reg_3) | 
                     ~port_b_in);
...

Let us try and understand what is going on here.

When we read from either port A or B, a low value can either be caused by the input port, or via the corresponding output port (e.g. slave_reg_0 or slave_reg_1).

Also, the corresponding output port is enabled by either slave_reg_2 or slave_reg_3.

The combined effect of an input and an output port resembles that of an OR operation, with the inputs inverted. For this reason we are doing all the negations.
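The negations are easier to follow with concrete values. Here is a minimal C sketch of the same expression; cia_port_read() is a hypothetical helper, with out standing in for slave_reg_0 or slave_reg_1, ddr for slave_reg_2 or slave_reg_3, and in for port_a_in or port_b_in.

```c
#include <stdint.h>

/* Value the CPU reads back from a CIA port.
 * out: data register written by the CPU
 * ddr: direction register, 1 = pin is driven as an output
 * in:  level presented by the outside world */
static uint8_t cia_port_read(uint8_t out, uint8_t ddr, uint8_t in)
{
    /* A pin is pulled low by our own output only if it is enabled. */
    uint8_t low_by_output = (uint8_t)(~out & ddr);
    /* A pin is pulled low externally when the input bit is zero. */
    uint8_t low_by_input = (uint8_t)~in;

    /* The pin reads low if either side pulls it low. */
    return (uint8_t)~(low_by_output | low_by_input);
}
```

For example, with all pins configured as outputs, writing 0xfe while the input side pulls bit 4 low (0xef) reads back as 0xee. With the direction register at zero, the written value has no effect and only the input side is visible.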

Next, let us hook up port A and port B of our CIA instances:

...
    cia cia_1(
          .port_a_out(keyboard_control),
          .port_a_in({3'b111, joybits}),
          .port_b_in(keyboard_result),
...
            );
...
    cia cia_2(
          .port_a_out(cia_2_port_a),
          .port_a_in(8'b11111111),
...
            );
...

First, we hook up the five bits of our joystick to port A of CIA#1.

We also connect port A of CIA#2 to eight ones. We use the lower two bits of this port for the VIC-II banking bits. It is therefore crucial that we keep the relevant input bits high, so that the contents of the VIC-II bits don't get lost during bitwise operations.

Serving the joystick bits from AXI slave

Currently our AXI Slave block has two slave registers indicating which keys are pressed. Each bit position in these two registers represents the actual C64 key scan code of the key pressed.

In a similar fashion we can add a third register where each bit position represents the current position of the joystick, as well as whether the fire button is pressed.

Currently slave register 2 (i.e. address 0x43c0_0008) only has about three bits utilised for tape operation. So, we can just use some unused bits in this register for our joystick bits.

We will use bits 4 to 8 of this register for the joystick bits. This starts on a nybble boundary, making it convenient to see the joystick bits when you are debugging and you see the register contents in hexadecimal format.

To wire up the joystick bits from the AXI slave to our C64 module, we would follow the same approach as we previously performed to enable keyboard access for our C64 module. I will therefore not be going into detail on this.

Redirecting Numeric Pad as Joystick bits

As mentioned earlier, we will be using the numeric pad of the USB keyboard as a joystick.

You might remember from a previous post that in order to interface a USB keyboard to our C64 module, we basically catch the USB scan codes, convert them to C64 key scan codes, and set the relevant bit (or bits, if more than one key is pressed simultaneously) at either address 0x43c0_0000 or 0x43c0_0004.

The C64 keyboard can produce key scan codes in the range 0 to 63. We can reuse our USB scan code -> C64 scan code routine by basically using scan codes 64 upwards for our joystick bits:

u32 mapUsbToC64(int usbCode) {
 if (usbCode == 0x4) { //A
  return 0xa;
 } else if (usbCode == 0x5) { //B
  return 0x1c;
 } else if (usbCode == 0x6) { //C
  return 0x14;
 } 

...
        } else if (usbCode == 0x28) { //enter
  return 0x1;
 } else if (usbCode == 0x2c) { //space
  return 0x3c;
 } else if (usbCode == 0x36) { //comma
  return 0x2f;
 } else if (usbCode == 53) { //play key `~
  return 100;
 } else if(usbCode == 96) { //up joystick
  return 64;
 } else if(usbCode == 90) { //down joystick
  return 65;
 } else if(usbCode == 92) { //left joystick
  return 66;
 } else if(usbCode == 94) { //right joystick
  return 67;
 } else if(usbCode == 98) { //fire joystick
  return 68;
 }
}

We invoke this method as follows:

void getC64Words(u32 usbWord0, u32 usbWord1, u32 *c64Word0, u32 *c64Word1, u32 *c64Word2) {
  *c64Word0 = 0;
  *c64Word1 = 0;
  *c64Word2 = 0;

  if (usbWord0 & 2) {
   *c64Word0 = 0x8000;
  }

  usbWord0 = usbWord0 >> 16;

  for (int i = 0; i < 2; i++) {
   int current = usbWord0 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else if (scanCode < 64) {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     } else {
        *c64Word2 = *c64Word2 | (1 << (scanCode - 64));
     }

   }

   usbWord0 = usbWord0 >> 8;
  }

  for (int i = 0; i < 4; i++) {
   int current = usbWord1 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else if(scanCode < 64) {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     } else {
      *c64Word2 = *c64Word2 | (1 << (scanCode - 64));
     }

   }

   usbWord1 = usbWord1 >> 8;
  }

}

We have introduced a third word c64Word2. This will be the word we will use to populate the joystick bits at address 0x43c0_0008.

Next, we need to update our old state_machine() method (our mini USB stack method) as shown by the following snippet:

void state_machine() {
...
  u32 toggle = Xil_In32(qTDAddressCheck+8) & 0x80000000;
  if (!(Xil_In32(qTDAddressCheck + 8) & 0x80)) {
   u32 word0 = Xil_In32(0x305000);
   u32 word1 = Xil_In32(0x305004);
   if (word0 == 0) {
    Xil_Out32(0x43c00000, 0);
    Xil_Out32(0x43c00004, 0);
    u32 joy = Xil_In32(0x43c00008) | 0x1f0;
    Xil_Out32(0x43c00008, joy);
   } else {
    //u32 bit = mapUsbToC64((word0 >> 16) & 0xff);
    //bit = 1 << bit;
    u32 c64Word0 = 0;
    u32 c64Word1 = 0;
    u32 c64Word2 = 0;
    getC64Words(word0, word1, &c64Word0, &c64Word1, &c64Word2);
    c64Word2 = ~c64Word2 & 0x1f;
    c64Word2 = c64Word2 << 4;
    /*if (bit < 32) {
     c64Word0 = 1 << bit;
    } else {
     c64Word1 = 1 << (bit - 32);
    }*/

    Xil_Out32(0x43c00000, c64Word0);
    Xil_Out32(0x43c00004, c64Word1);
    u32 tempJoy = (Xil_In32(0x43c00008) & 0xf) | c64Word2;
    Xil_Out32(0x43c00008, tempJoy);
    //Xil_In32(0x305004);
   }
... 


}

Basically we start with word0 and word1, which hold the USB scan codes of the keys that are currently being pressed.

If no key is pressed (i.e. word0 == 0), we just set bits 4 to 8 of address 0x43c0_0008 to ones.
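The bit manipulation on the joystick register can be isolated into a small helper to make the active-low convention explicit. The following is just a C sketch of that step; pack_joystick() and the two macros are hypothetical names, and the helper works on a plain value instead of calling Xil_In32() and Xil_Out32().

```c
#include <stdint.h>

#define JOY_SHIFT 4u                    /* joystick bits start at bit 4 */
#define JOY_MASK  (0x1fu << JOY_SHIFT)  /* bits 4 to 8, i.e. 0x1f0 */

/* Combine the current register value with the joystick state.
 * reg:     current contents of the register at 0x43c0_0008
 * pressed: bit 0 up, bit 1 down, bit 2 left, bit 3 right, bit 4 fire,
 *          with 1 meaning pressed (scan codes 64 to 68 minus 64) */
static uint32_t pack_joystick(uint32_t reg, uint32_t pressed)
{
    /* The C64 expects active-low lines, so invert before shifting. */
    uint32_t active_low = (~pressed & 0x1fu) << JOY_SHIFT;

    /* Keep the low nibble, as the snippet above does, and replace the
     * joystick bits. */
    return (reg & 0xfu) | active_low;
}
```

With nothing pressed, the joystick field is all ones, so pack_joystick(0x5, 0) gives 0x1f5. Pressing fire clears bit 8 and gives 0x0f5.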

The End Results

The following video shows what happens when we press the fire button when we are at the intro screen of the game Dan Dare:

It faintly resembles the game as I remember, though garbled and frozen!

What we are missing here is implementing Raster interrupts for everything to render correctly, which we will cover in the next post.

In Summary

In this post we managed to implement a joystick in our C64 module by utilising the numpad on the USB keyboard.

With our Joystick we managed to transition from the Intro screen to the actual game, although our emulator froze at this point.

In the next post we will be implementing Raster interrupts so that the game screen can render properly.

Till next time!