Wednesday, 7 August 2019

Implementing Multicolor Bitmap Mode

Foreword

In the previous post we added Color RAM to our C64 FPGA design.

In this post we will implement the Multicolor Bitmap Mode within our VIC-II module. This will enable us to render the Splash screen of Dan Dare correctly during the loading process.

To verify the resulting development, I will also be developing a Test Bench in this post so that we can test our VIC-II module in isolation through simulation.

Needless to say, the ultimate test would be to see if our Test Bench would be able to render the Dan Dare Splash screen to an image file.

To do this test, we would need to have the image data of the splash screen in our C64 main memory as well as in the Color RAM.

Thus, in this post I will also illustrate how to use the VICE Commodore emulator to extract the image data for the Splash Screen.

Extracting the Image data for the Splash Screen

Let us start by tackling the goal of extracting the image data of the Splash screen.

As mentioned previously, we will use the VICE Commodore emulator for this purpose.

We start off by kicking off the loading of the game Dan Dare.

As soon as the Splash screen has been loaded completely, activate the builtin Monitor. We then issue a couple of memory dump commands:


The first memory location to look at is $DD00. Bits 0 and 1 hold the upper two bits of the VIC-II's memory address, inverted. In our case this resolves to 11, which selects the last 16KB bank of RAM, address range 49152-65535.
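
The inversion is easy to get wrong, so here is a minimal sketch in C of how the bank base address follows from the value in $DD00. The function name vic_bank_base is just a name picked for this illustration and does not appear in our design.

```c
#include <stdint.h>

/* Returns the base address of the 16KB bank seen by the VIC-II,
 * given the value read from $DD00. Bits 0-1 are stored inverted. */
uint16_t vic_bank_base(uint8_t dd00)
{
    uint8_t bank_bits = (uint8_t)(~dd00) & 0x03; /* undo the inversion */
    return (uint16_t)(bank_bits << 14);          /* these are address bits 15-14 */
}
```

With the lower two bits of $DD00 at 00, this gives $C000, the start of the last 16KB bank.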

Next, we should find out where to look for the image data. The answer is memory location $D018. The bits in this location are laid out as follows:

Bit 7: VM13
Bit 6: VM12
Bit 5: VM11
Bit 4: VM10
Bit 3: CB13
Bit 2: CB12
Bit 1: CB11
Bit 0: -

The bits starting with VM form the base address of the Video Memory, or Screen Memory.

The bits starting with CB form the base address of the Character Image data. In high resolution modes, which is what we have here, they give the location of the bitmap data.

From this information we see that screen memory starts at $C000 and the bitmap data at $E000.
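
As a rough sketch in C, the two base addresses can be derived from $D018 as shown below. The function names are made up for this illustration, and the value 0x08 mentioned afterwards is an assumed example that is consistent with the addresses above, not a value taken from the memory dump.

```c
#include <stdint.h>

/* Screen memory base: VM13-VM10 (bits 7-4 of $D018) become address bits 13-10. */
uint16_t screen_base(uint16_t bank_base, uint8_t d018)
{
    return (uint16_t)(bank_base + (((d018 >> 4) & 0x0F) << 10));
}

/* Bitmap base: in bitmap mode only CB13 (bit 3 of $D018) is used,
 * selecting the lower or upper 8KB half of the bank. */
uint16_t bitmap_base(uint16_t bank_base, uint8_t d018)
{
    return (uint16_t)(bank_base + (((d018 >> 3) & 0x01) << 13));
}
```

With a bank base of $C000 and an assumed $D018 value of 0x08, screen_base returns $C000 and bitmap_base returns $E000.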

The next challenge is to extract the image data into a file that our Test Bench/VIC-II module can use.

The easiest way to do this is to save the state of our running emulator at this point as a snapshot, and to extract the relevant portions from the snapshot.

VICE stores the snapshot as a *.vsf file. When you open this file in a HEX editor, you can identify the relevant sections by their header names:



In this example we can see that the 64KB section starts with the header name C64MEM. It is not completely clear where the actual memory starts. We get more information on this from the VICE documentation:

Type   Name     Description
BYTE   CPUDATA  CPU port data byte
BYTE   CPUDIR   CPU port direction byte
BYTE   EXROM    state of the EXROM line (?)
BYTE   GAME     state of the GAME line (?)
ARRAY  RAM      64k RAM dump

Keep in mind that this gets preceded by a header of 22 bytes.
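
Putting the header size and the table together, the position of the RAM dump can be located with something like the following C sketch. It assumes that the 22-byte header begins at the module name C64MEM and that the four single-byte fields directly precede the RAM array. The function and constant names are hypothetical, and this is not how VICE itself reads the file.

```c
#include <stddef.h>
#include <string.h>

#define VSF_MODULE_HEADER_SIZE   22 /* header size as stated in the text */
#define C64MEM_FIELDS_BEFORE_RAM 4  /* CPUDATA, CPUDIR, EXROM, GAME */

/* Returns the offset of the first RAM byte within the snapshot buffer,
 * or -1 if the module name C64MEM is not found. */
long c64mem_ram_offset(const unsigned char *buf, size_t len)
{
    static const char name[] = "C64MEM";
    size_t name_len = sizeof name - 1;

    for (size_t i = 0; i + name_len <= len; i++) {
        if (memcmp(buf + i, name, name_len) == 0)
            return (long)(i + VSF_MODULE_HEADER_SIZE + C64MEM_FIELDS_BEFORE_RAM);
    }
    return -1;
}
```

It is worth checking the returned offset against what you see in the HEX editor before trusting it.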

The next piece of information to extract is the Color RAM. The VICE documentation and some searches on the Internet didn't yield any particular information on where the Color RAM is stored in the vsf file.

Eventually I found the answer by looking into the source code of VICE. Within the file src/vicii/vicii-snapshot.c I found the following:


So, the Color RAM is indeed located within the VIC-II module section in the vsf file.

To convert this info in a format suitable for our Test bench, we just paste the relevant HEX data into a Text Editor and replace all spaces with newlines.

The current plumbing of our VIC-II module

It has been a while since we looked into the inner workings of our VIC-II module. I therefore think it would be a good idea for us to do a quick refresher, so that we end up with a better idea of how to implement the Multicolor Bitmap mode.

To give us a baseline as a reference, I have created the following diagram:


In this diagram we have basically zoomed into the first three character rows, and on each row I am only showing the first two characters. The area on the left in solid purple represents the border.

You can see that on each line we already start reading while still in the border area.

You can also see that we are reading the character codes only at the first pixel row of each character row. In fact, when we get the character codes, we store them in a 40 byte buffer. For the remaining pixel rows we get the character codes from this buffer.

Let us move into a bit more detail on our VIC-II module by looking at some important signals:


If both the visible_vert and visible_horiz signals are high, it means we are in the region of the screen in which we are drawing characters.

Typically we would store the character code to our 40 byte buffer when clk_counter is cycle#3, and load the pixel_shift_register at the end of cycle#7.

You might notice that in this diagram we are only shifting every second clock cycle. Here I am actually trying to show how we would shift the pixels during multi-color mode, in which each pixel is 2 pixels wide.

Implementing Multicolor Bitmap Mode

Let us now continue to implement Multicolor Bitmap Mode.

The following register bits are important to implement for this mode:

  • Register D011: Bit 5: Bitmap mode
  • Register D016: Bit 4: Multicolor Mode
  • Register D018: Memory pointers

To implement these register bits, we would follow more or less the same process as in previous posts, so I will not go into detail here.

Next, let us see how to retrieve and dispatch pixel data for Multicolor Bitmap mode.

We start by generating addresses for retrieving pixel data:

...
wire [13:0] bit_data_pointer;
...
assign bit_data_pointer = screen_control_1[5] ?
                   {mem_pointers[3],(char_line_pointer + char_pointer_in_line),char_line_num}
                 : {mem_pointers[3:1],char_buffer_out[7:0],char_line_num};
...

Here we cater for generating pixel data addresses for both Standard Text Mode and High Resolution mode, which is determined by bit 5 of Screen Control Register #1.

For Standard Text mode we use the character codes in screen memory to determine the address within the Character ROM.

In High Resolution mode, however, we use the pointer to the current location in Screen Memory to assemble the address for retrieving pixel data.
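
To make the two address layouts a bit more tangible, here is a small software model in C of the same concatenations. It is only a sketch: the parameter cell_index stands in for the sum char_line_pointer + char_pointer_in_line, and I am assuming it is a 10-bit value (0 to 999), which follows from the 14-bit width of bit_data_pointer but is not shown in the snippet above.

```c
#include <stdint.h>

/* Software model of bit_data_pointer. mem_pointers is the value of $D018. */
uint16_t bit_data_pointer(int bitmap_mode, uint8_t mem_pointers,
                          uint16_t cell_index, uint8_t char_code,
                          uint8_t char_line_num)
{
    uint16_t line = char_line_num & 0x07;

    if (bitmap_mode) {
        /* {mem_pointers[3], cell_index[9:0], char_line_num[2:0]} */
        uint16_t cb13 = (mem_pointers >> 3) & 0x01;
        return (uint16_t)((cb13 << 13) | ((cell_index & 0x3FF) << 3) | line);
    }

    /* {mem_pointers[3:1], char_code[7:0], char_line_num[2:0]} */
    uint16_t cb = (mem_pointers >> 1) & 0x07;
    return (uint16_t)((cb << 11) | ((uint16_t)char_code << 3) | line);
}
```

In bitmap mode every screen cell therefore owns 8 consecutive bytes of bitmap data, one per pixel row.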

We also need to modify the code that shifts out the pixel data:

   always @(posedge clk_in)
   if (clk_counter == 7)
     pixel_shift_reg <= data_in;
   else begin
     if (screen_control_2[4] & (clk_counter[0]))
       pixel_shift_reg <= {pixel_shift_reg[5:0],2'b0};
     else if (!screen_control_2[4]) 
       pixel_shift_reg <= {pixel_shift_reg[6:0],1'b0};
   end


So, for multicolor mode we shift two bits at a time.

To create the actual multicolor pixel, we add the following statement:

always @*
  case (pixel_shift_reg[7:6])
    2'b00: multi_color = background_color;
    2'b01: multi_color = char_buffer_out_delayed[7:4];
    2'b10: multi_color = char_buffer_out_delayed[3:0];
    2'b11: multi_color = char_buffer_out_delayed[11:8];
  endcase


For bit combinations 01, 10 and 11 we use the values we previously retrieved from Color RAM and Screen RAM. It is therefore important that you buffer these values, so that they are available for the full 8 pixels.
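
The same selection can be modelled in C for a whole byte of bitmap data, which is handy when checking simulation output by hand. This is just a sketch; the function name and the 12-bit cell parameter (Color RAM nibble in bits 11-8, Screen RAM byte in bits 7-0) are my own naming for this illustration.

```c
#include <stdint.h>

/* Decodes one byte of multicolor bitmap data into four color indices.
 * cell: bits 11-8 come from Color RAM, bits 7-0 from Screen RAM. */
void decode_multicolor(uint8_t pixel_byte, uint16_t cell,
                       uint8_t background_color, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        /* leftmost bit pair is displayed first */
        uint8_t pair = (pixel_byte >> (6 - 2 * i)) & 0x03;

        switch (pair) {
        case 0:  out[i] = background_color & 0x0F; break;
        case 1:  out[i] = (cell >> 4) & 0x0F;      break; /* Screen RAM high nibble */
        case 2:  out[i] = cell & 0x0F;             break; /* Screen RAM low nibble */
        default: out[i] = (cell >> 8) & 0x0F;      break; /* Color RAM */
        }
    }
}
```

Each of the four results covers two physical pixels on the screen.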

Creating the Test Bench

Our Test Bench for the VIC-II will look very similar to our existing C64 module's interface with the VIC-II.

Make sure that both the main 64KB RAM and Color RAM are connected to the VIC-II module. Both memories should also contain the image data of the splash screen, as we discussed earlier on.

Next we should write an initialisation block for setting the VIC-II registers so that it can show the Splash Screen in Multicolor Bitmap mode:

initial begin
  #50 we_vic_ii = 1;
  addr_in = 6'h11;
  reg_data_in = 8'h30;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;
  
  #20; 
  we_vic_ii = 1;
  addr_in = 6'h20;
  reg_data_in = 8'he;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h21;
  reg_data_in = 8'h6;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h16;
  reg_data_in = 8'h10;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h18;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h20;
  reg_data_in = 8'hb;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

  #20;
  we_vic_ii = 1;
  addr_in = 6'h21;
  reg_data_in = 8'hd;
  @(negedge clk_1_mhz)
  we_vic_ii = 0;

end


Next, we should write an initial block for saving the pixel output of our VIC-II module to an image file:

initial begin  
  f = $fopen("/home/johan/result.ppm","w");
  $fwrite(f,"P3\n");
  $fwrite(f,"404 284\n");
  $fwrite(f,"255\n");
  i = 0;
  while (i < 114736) begin
    @(posedge clk)
    #2;
    if (!blank_signal)
    begin
      $fwrite(f,"%d\n", rgb[23:16]);
      $fwrite(f,"%d\n", rgb[15:8]);
      $fwrite(f,"%d\n", rgb[7:0]);
      i = i + 1;
    end
  end
  $fclose(f);
end

Here we create a PPM file, where we store the pixel values in plain text. We precede the pixel data with the resolution of the image and the maximum value per color component.
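
For reference, here is the same header logic as a small C sketch, together with the pixel count the loop above waits for. The function names are made up for this illustration.

```c
#include <stdio.h>

/* Formats the plain-text (P3) PPM header into buf; returns snprintf's result. */
int ppm_header(char *buf, size_t size, unsigned width, unsigned height)
{
    return snprintf(buf, size, "P3\n%u %u\n255\n", width, height);
}

/* Number of pixels that must be written before the file is complete. */
unsigned long ppm_pixel_count(unsigned width, unsigned height)
{
    return (unsigned long)width * height;
}
```

For a 404 by 284 image the pixel count is 114736, which is the limit used in the while loop.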

The Final Result

Here is a picture of the final simulated result:


This is a bit of motivation that we are on the right track.

In Summary

In this post we implemented the Multicolor Bitmap Mode within our VIC-II module. As a simulation test we checked whether our Test Bench could create the loading Splash screen of the game Dan Dare.

In the next post we will link up our modified VIC-II module to our real FPGA and see if the Splash screen will also be shown when loading the game from the .TAP image.

Till next time!

Monday, 15 July 2019

Adding Color RAM

Foreword

In the previous post we managed to emulate the flashing borders when the game Dan Dare loads on a C64.

Our next goal would be to display the Splash screen while the game loads. To do this we need to add some more functionality to our VIC-II core.

A fundamental block that is missing from our C64 design is color RAM, which supplies the color for each character. Color RAM also plays an important role in rendering our Splash screen.

So, in this post we will implement color RAM and integrate it into our VIC-II core.

Defining Color RAM as Block RAM

We start off by defining some Verilog code that will synthesise our color RAM into Block RAM elements:

...
    reg [3:0] color_ram [999:0];
    reg [3:0] color_ram_out;
...
     always @ (posedge clk_1_mhz)
       begin
        if (color_ram_write_enable) 
        begin
         color_ram[addr[9:0]] <= ram_in[3:0];
         color_ram_out <= ram_in[3:0];
        end
        else 
        begin
         color_ram_out <= color_ram[addr[9:0]];
        end 
       end 
...

We directly connect to the Data Out pin of our CPU via the ram_in wire.

Since the Color RAM only contains a thousand items, we only need to connect the lowest 10 bits of the address bus. This makes it crucial to only enable writing within the correct address range:

assign color_ram_write_enable = we & (addr >= 16'hd800 & addr < 16'hdbe8);
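
To see why the range check matters, here is a small C model of the write enable and of the index used into the array. The function names are only for this illustration.

```c
#include <stdint.h>

/* Mirrors the write enable: only $D800-$DBE7 may write to Color RAM. */
int color_ram_write_enable(int we, uint16_t addr)
{
    return we && addr >= 0xD800 && addr < 0xDBE8;
}

/* Only the lowest 10 address bits select the Color RAM entry. */
uint16_t color_ram_index(uint16_t addr)
{
    return addr & 0x03FF;
}
```

An address such as $0400 maps to the same index as $D800, so without the range check a write to screen memory would also land in Color RAM.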

Next, we should add a second port to our Color RAM for providing information to our VIC-II module:

...
    reg [3:0] color_ram_out2;
...
    always @ (posedge clk_2_mhz)
       begin
         color_ram_out2 <= color_ram[portb_add[9:0]]; 
       end 
...

Finally we need to hook up to the VIC-II module:

vic_ii vic_inst
    (
...
        .data_in({color_ram_out2,vic_combined_d}),
...
        );

The data in bus to a VIC-II is 12 bits wide and the upper 4 bits contain the color information.
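
In C terms the concatenation amounts to the following. This is a minimal sketch with a function name picked for this illustration.

```c
#include <stdint.h>

/* Packs the Color RAM nibble above the 8-bit memory byte, giving 12 bits. */
uint16_t vic_data_in(uint8_t color_ram_nibble, uint8_t ram_byte)
{
    return (uint16_t)(((color_ram_nibble & 0x0F) << 8) | ram_byte);
}
```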

Test Results

I did a quick test to see if our C64 module can render the colors stored in Color RAM correctly.

For this exercise I ran a couple of POKE commands to write color values directly to Color RAM.

The results are as follows:


In Summary

In this post we have added Color RAM to our C64 design and verified that our VIC-II module renders the colors correctly.

In the next post we will continue to add more functionality to our VIC-II module in order to display the splash screen.

Till next time!

Saturday, 13 July 2019

Flashing Borders

Foreword

In the previous post we managed to load the tape header from a .TAP file and display the file name found in it on the screen, with all these actions performed by the KERNAL ROM.

This basically proves that we have implemented our Tape module correctly. If one really wants to be nostalgic, one could actually hook up a 1530 datasette to our design, provided you get the level shifting right, and it should work.

Our C64 FPGA module in its current state doesn't really support the full graphical capabilities of the VIC-II. In fact, currently it can only render characters stored in the default memory location (i.e. addresses 1024 to 2023), with hard coded border and background colors, which are light and dark blue.

This means that if we load a classic game within our C64 module, it will probably load, but we will not be able to see the fancy colourful graphic effects.

The way to approach this problem is to pick a classic C64 game you like and gradually implement the graphic capabilities the game requires, until you can play the game perfectly.

The game I have picked is Dan Dare, Pilot of the Future.

We will start by implementing the graphical capabilities that the loading of the game Dan Dare requires.

When you load the game Dan Dare from tape, as with many other games of the era, you will be presented with flashing borders as well as a splash screen in multi color high resolution mode.

In this post we will look into implementing the flashing borders.

The display of the splash screen we will implement in the next post or two.

VIC-II register access

When the game Dan Dare loads, the effect of flashing borders is achieved by writing alternating colors in a rapid fashion to memory location 53280.

Memory location 53280 is indeed a register within the VIC-II. At this point I need to stop the discussion in its tracks: Our VIC-II module doesn't even provide register access at the moment!

So, let us start by implementing functionality within our C64 module for accessing registers from the VIC-II.

Firstly, we need to add some extra ports to our VIC-II module:

module vic_ii
(
  input clk_1_mhz,
...
  input [5:0] addr_in,
  input [7:0] reg_data_in,
  input we,
  output [7:0] data_out,
...
    );


For addr_in we take the lower 6 bits of the address output from our Arlet 6502 core.

For reg_data_in we are taking the Data Out, also from our Arlet 6502 core.

The data_out we need to add as an extra input to our data multiplexer, which we will discuss a bit later.

As mentioned, we are only using the lower six bits of the address bus. For reading this is not a problem at all. The multiplexer will simply ignore data from our VIC-II module if we didn't request data from the VIC-II module.

Writes are more of a problem. We only want to write to a VIC-II register if that was really the intent, of course. For this we need the we port, which we connect as follows:

vic_ii vic_inst
    (
...
        .we(we & (addr == 16'hd020 | addr == 16'hd021 | addr == 16'hd011)),
...
        );


For now we are only doing a write for three of the VIC-II registers: d020, d021 and d011. For the remaining registers, we are just going to write to main RAM. This just makes our life easier for now.
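
The following C sketch models the write gating and shows why it is needed when only six address bits reach the VIC-II. The function names are hypothetical and only serve this illustration.

```c
#include <stdint.h>

/* Mirrors the expression connected to the we port of the VIC-II. */
int vic_write_enable(int we, uint16_t addr)
{
    return we && (addr == 0xD020 || addr == 0xD021 || addr == 0xD011);
}

/* The VIC-II only sees the lower six address bits. */
uint8_t vic_addr_in(uint16_t addr)
{
    return (uint8_t)(addr & 0x3F);
}
```

A write to address $0020 presents the same addr_in as a write to $D020, so it is only the gated we signal that keeps it away from the border color register.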

You will also notice that we are sending the 1MHz clock signal to our VIC-II module. This is because the reads and writes will be performed by our 6502 module, which operates at 1MHz.

You might recall that big chunks of our VIC-II module operate at 8MHz, which in turn might make you wonder: two different clock speeds again, so do we once more need to cater for crossing clock domains?

Luckily not! Remember that in our design our 1MHz clock is not a pure 1MHz signal with a 50% duty cycle. We derive our 1MHz clock from our 8MHz clock: for every 8 cycles we mask out 7 cycles and enable one.

This means that our 1MHz pulse will always be in sync with an 8MHz pulse, although our 1MHz clock will have a weird duty cycle of 12.5%.

Finally, let us change our multiplexer logic to cater for reads from our VIC-II registers:

    always @*
        casex (addr_delayed)
...
          16'hd020, 16'hd021, 16'hd011: combined_d_out = vic_reg_data_out;
...
          default: combined_d_out = ram_out;
        endcase


Recall that we connect combined_d_out to the Data In on our Arlet 6502 core.

We have now given our CPU the capability to access VIC-II registers. We now need to implement the mentioned registers within the VIC-II.

Equipping our VIC-II with registers

Let us now implement the three registers within our VIC-II module:

...
reg [7:0] data_out_reg;
reg [7:0] screen_control_1 = 0;

reg [3:0] border_color = 0;
reg [3:0] background_color = 0;
...
assign screen_enabled = screen_control_1[4];

assign data_out = data_out_reg;

always @(posedge clk_1_mhz)
     case (addr_in)
       6'h20: data_out_reg <= {4'b0,border_color};
       6'h21: data_out_reg <= {4'b0,background_color};
       6'h11: data_out_reg <= screen_control_1;
     endcase
     
always @(posedge clk_1_mhz)
begin
  if (we & addr_in == 6'h20)
    border_color <= reg_data_in[3:0];
  else if (we & addr_in == 6'h21)  
    background_color <= reg_data_in[3:0];
  else if (we & addr_in == 6'h11)
    screen_control_1 <= reg_data_in[7:0];
end
...

We implement the register data_out_reg for reading purposes. In our C64 module reads are always delayed by one clock cycle, so the data_out_reg performs this task for us.

You might wonder why we have implemented the screen control register if, in this post, we only care about border and background color. This is just to cater for the scenario where the screen gets blanked, in which case the full screen gets covered with flashing borders.

Having each VIC-II register declared individually seems like quite a cumbersome process. One might be tempted to think that it would be easier to define all registers together as an array, which will resolve to a Block RAM element during synthesis.

This is a very valid point. However, one needs to keep in mind that a Block RAM element can only allow up to two simultaneous memory access operations. The VIC-II needs more than two simultaneous accesses, just to mention a few:

  • Border colour/Background color
  • Screen control
  • X raster pos
  • Y raster pos

It is therefore better to specify each register separately.

Finally we need to use these registers for color generation:

   assign color_for_bit = pixel_shift_reg[7] == 1 ? 
             current_front_color : background_color;
   assign final_color = (visible_vert & visible_horiz & screen_enabled) ? 
             color_for_bit : border_color;


Let us quickly refresh ourselves with the above snippet of code.

pixel_shift_reg is a byte shift register where we shift out a bit at a time. Where the current bit is zero, we output the background_color as the color to display.

We use final_color to alternate between the border color and color_for_bit where applicable. I have also added screen_enabled to the mix, where the screen gets blanked totally with the border color if display is disabled.
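
As a cross-check, the same two selections can be written as a small C function. This is only a software model of the two assign statements, with a function name of my own choosing.

```c
#include <stdint.h>

/* Software model of color_for_bit and final_color. */
uint8_t final_color(uint8_t pixel_shift_reg, uint8_t current_front_color,
                    uint8_t background_color, uint8_t border_color,
                    int visible_vert, int visible_horiz, int screen_enabled)
{
    /* bit 7 of the shift register is the pixel currently being output */
    uint8_t color_for_bit = (pixel_shift_reg & 0x80) ? current_front_color
                                                     : background_color;

    return (visible_vert && visible_horiz && screen_enabled) ? color_for_bit
                                                             : border_color;
}
```

Outside the visible area, or with the display disabled, the result is always the border color, regardless of the pixel data.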

The need to disable KERNAL ROM

With the registers implemented, I still couldn't see the flashing borders.

I encountered a similar issue when I developed a C64 emulator in JavaScript.

When I read the above-mentioned blog post, I remembered that the issue was that the Dan Dare loader overrides the IRQ vector at addresses $FFFE/$FFFF.

The problem is that one needs to disable the KERNAL ROM to expose this new vector from RAM, which I haven't implemented yet.

The switching of ROMs in and out of the address space is controlled by register 1. We already worked with register 1 in the previous post, where we had to implement motor control and tape button status.

I changed the logic a bit for reading and writing to register 1:

...
    reg [7:0] reg_1_6510 = 8'h37;

    assign motor_control = reg_1_6510[5];

    always @(posedge clk_1_mhz)
    if (we & (addr == 1))
      reg_1_6510 <= ram_in;
...
    always @*
        casex (addr_delayed)
          16'b1: combined_d_out = {reg_1_6510[7:5], tape_button, reg_1_6510[3:0]};
...
          16'b111x_xxxx_xxxx_xxxx : if (reg_1_6510[1])
                                      combined_d_out = kernel_out;
                                    else
                                      combined_d_out = ram_out;
...
        endcase
...

I have created an 8-bit register for memory location 1 called reg_1_6510. For starters, I am now feeding the motor control bit from bit 5 of this register.

We also keep an eye on bit 1 of reg_1_6510. If this bit is 1, we return the contents of the KERNAL ROM; if it is 0, we return the contents of the underlying RAM.
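
Here is the same selection as a tiny C model for reads in the range $E000-$FFFF. The function name is made up for this illustration, and the value 0x35 mentioned afterwards is just an example of a register value with bit 1 cleared.

```c
#include <stdint.h>

/* Selects what the CPU reads in $E000-$FFFF, based on bit 1 of register 1. */
uint8_t read_e000_ffff(uint8_t reg_1_6510, uint8_t kernal_byte, uint8_t ram_byte)
{
    return (reg_1_6510 & 0x02) ? kernal_byte : ram_byte;
}
```

With the power-on value 0x37 the ROM is visible, whereas a value such as 0x35 exposes the RAM underneath, including a replaced IRQ vector.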

The End Result

The following video shows the end result when we load the game Dan Dare from a .TAP file. Flashing borders appear:


In Summary

In this post we gave our C64 module the capability to change the border and background color.

This enabled us to emulate the flashing borders when we load the game Dan Dare.

In the next post we will do some development that will enable us to see the Splash screen during the loading process.

Till next time!

Wednesday, 19 June 2019

Integrating Tape Interface with C64 Module: Part 5

Foreword

In the previous post we have implemented interrupts within our CIA module.

At last we are ready to integrate our Tape module into our C64 module, which we will cover in this post.

We will be using the tape image for the game Dan Dare.

The ultimate goal for this post doesn't sound so exciting: to find the header on the mentioned tape image and show Found Dan Dare.

We will continue using the tape image of Dan Dare in coming posts, working our way through first getting the flashing borders and splash loading screen to display, up to the point where we can actually play the game.

FLAG1 as an edge triggered interrupt

If we were to take the output of our Tape module and connect it directly to the FLAG1 input of our CIA module, our FLAG1 will basically act as a level interrupt.

On a conventional CIA chip, however, all interrupts are edge triggered interrupts. We therefore need to make a small modification, so that our FLAG1 input behaves as an edge interrupt:

...
reg flag1_delayed;
...
    cia cia_1(
          .port_a_out(keyboard_control),
          .port_b_in(keyboard_result),
          .addr(addr[3:0]),
          .we(we & (addr[15:8] == 8'hdc)),
          .clk(clk_1_mhz),
          .chip_select(addr[15:8] == 8'hdc),
          .data_in(ram_in),
          .data_out(cia_1_data_out),
          .flag1(flag1 & !flag1_delayed),
          .irq(irq)
            );

    always @(posedge clk_1_mhz)
      flag1_delayed <= flag1;
...
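
The effect of the delayed register is easiest to see in a small software model. The C sketch below mimics one clock tick per call; the struct and function names are hypothetical.

```c
/* Holds the value flag1 had on the previous clock tick. */
struct edge_detector {
    int delayed;
};

/* Call once per clock tick. Returns 1 only on the tick where flag1
 * goes from 0 to 1, mirroring flag1 & !flag1_delayed. */
int edge_step(struct edge_detector *d, int flag1)
{
    int pulse = flag1 && !d->delayed;
    d->delayed = flag1;
    return pulse;
}
```

Feeding it the sequence 0, 1, 1, 0, 1 yields 0, 1, 0, 0, 1, so a level that stays high only produces a single pulse.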

Tape Button and motor control

If you ever loaded a game from a tape on a real C64, you will know that the C64 has control over the Cassette motor: The tape briefly pauses when a file header is found, as well as when the file has finished loading.

How can the C64 control the tape motor in this way? The answer is via bits 4 and 5 from memory location 1. Quoted from this:


  • Bit #4: Datasette button status; 0 = One or more of PLAY, RECORD, F.FWD or REW pressed; 1 = No button is pressed.
  • Bit #5: Datasette motor control; 0 = On; 1 = Off.

In order for our C64 module to emulate tape loading, we also need to emulate the above mentioned bits.

We will use the ~/` button on the USB keyboard to represent the play button for Bit 4. We will hook up this key to our C64 in a coming section.

To serve the above-mentioned bits, we need to define an input port and an output port on our c64 module:

module c64(
...
  input wire tape_button,
  output reg motor_control = 1,
...
    );


The motor control bit gets set as follows:

    always @(posedge clk_1_mhz)
    if (we & (addr == 1))
      motor_control <= ram_in[5];


To cater for reads for location 1, we need to adjust our memory read case statement as follows:

    always @*
        casex (addr_delayed)
          16'b1: combined_d_out = {ram_out[7:6], motor_control, tape_button, ram_out[3:0]};
          16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
          16'b111x_xxxx_xxxx_xxxx : combined_d_out = kernel_out;
          16'hd012: combined_d_out = line_counter;
          16'hdcxx: combined_d_out = cia_1_data_out;
          default: combined_d_out = ram_out;
        endcase

We have now defined our motor control functionality within our C64 module, but we still need to route it to our Tape module:

module tape(
  input clk,
  input clk_1_mhz,
  input restart,
  input reset,
  output [31:0] ip2bus_mst_addr,
  output [11:0] ip2bus_mst_length,
  input [31:0] ip2bus_mstrd_d,
  output [4:0] ip2bus_inputs,
  input [5:0] ip2bus_otputs,
  input motor_control,
  output pwm
    );
...
tape_pwm t_pwm(
                  .time_val(timer_val),
                  .load_timer(load_timer),
                  .pwm(pwm),
                  .motor_control(motor_control),
                  .clk(clk_1_mhz)
                    );
..

In our pwm module we basically want to pause the generation of pwm pulses when the motor control bit is high:

...
  always @(posedge clk)
  if (timer > 0 & !motor_control)
    timer <= timer - 1;
  else if (timer == 0)
    timer <= load;
..
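
To double check the pausing behaviour, here is a C model of one clock tick of this timer. The 32-bit width is an assumption on my side, since the declaration of timer is not shown here, and the function name is made up.

```c
#include <stdint.h>

/* One clock tick of the pulse timer. Returns the next timer value.
 * motor_control = 1 means the motor is off, which freezes the countdown. */
uint32_t tape_timer_step(uint32_t timer, uint32_t load, int motor_control)
{
    if (timer > 0 && !motor_control)
        return timer - 1;   /* motor on: keep counting down */
    if (timer == 0)
        return load;        /* pulse finished: reload */
    return timer;           /* motor off: hold the current value */
}
```

Note that, as in the Verilog, a timer that has already reached zero still reloads while the motor is off; it just does not count down afterwards.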

Assigning a keyboard button as play button

I mentioned earlier that I want to use the ~/` key on the USB keyboard as our play button.

You will recall from a previous post that we interface with a USB keyboard through a minimalistic USB protocol stack, where we convert USB key scan codes to C64 scan codes and then set the appropriate bit in a key array linked to our C64 module.

We start by adding a mapping for the ~ key to the big if-else statement:

u32 mapUsbToC64(int usbCode) {
 if (usbCode == 0x4) { //A
  return 0xa;
 } else if (usbCode == 0x5) { //B
  return 0x1c;
 } else if (usbCode == 0x6) { //C
  return 0x14;
...
...
 } else if (usbCode == 0x2c) { //space
  return 0x3c;
 } else if (usbCode == 53) { //play key `~
  return 100;
 }


}


The USB key scan code for the required key is 53 and we map this code to 100.

But C64 scan codes only go up to 63, so what is with the 100?

The point is that the play key is not actually a key in the key bit array that we want to set. We rather use it as a special value:

void getC64Words(u32 usbWord0, u32 usbWord1, u32 *c64Word0, u32 *c64Word1) {
  *c64Word0 = 0;
  *c64Word1 = 0;

  if (usbWord0 & 2) {
   *c64Word0 = 0x8000;
  }

  usbWord0 = usbWord0 >> 16;

  for (int i = 0; i < 2; i++) {
   int current = usbWord0 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     }

   }

   usbWord0 = usbWord0 >> 8;
  }

  for (int i = 0; i < 4; i++) {
   int current = usbWord1 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     }

   }

   usbWord1 = usbWord1 >> 8;
  }

}


So if, during the conversion process from USB to C64 scan code, we encounter a 100, we write the value zero to memory location 0x43c00008.

Remember that 0x43c00008 is a register mapped to our AXI slave interface. Just to recap from a previous post:

0x43c0_0000: key array 1
0x43c0_0004: key array 2
0x43c0_0008: tape control:
                bit 0: PWM bit (READ ONLY). For debug
                bit 1: Reset tape position to address 0x238270 in memory

For register 0x43c0_0008 we will introduce an extra function for bit 2: Tape button. Our Slave AXI block needs to be modified to accommodate this bit so it can be surfaced to the tape_button input of our C64 module.
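
Spelled out as C constants, the register layout could look like the sketch below. The macro and function names are hypothetical. I am also assuming that the tape button bit follows the same convention as bit 4 of memory location 1, where 0 means a button is pressed, which matches the zero written on a key press.

```c
#include <stdint.h>

#define TAPE_CTRL_PWM    (1u << 0) /* PWM bit, read only, for debug */
#define TAPE_CTRL_RESET  (1u << 1) /* reset tape position */
#define TAPE_CTRL_BUTTON (1u << 2) /* tape button, assumed 0 = pressed */

/* Returns 1 if the register value represents a pressed play button. */
int tape_button_pressed(uint32_t tape_ctrl)
{
    return (tape_ctrl & TAPE_CTRL_BUTTON) == 0;
}
```

Under these assumptions a value of 6 means reset asserted with no button pressed, and a value of 4 means reset released with no button pressed.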

Test Run

Time for a test run.

As usual you will open Xilinx SDK, program the FPGA and start our C program to run on the ARM processor. In our case the C program will run a mini USB stack, capturing keystrokes and forwarding to our C64 module.

With the C program running, we need to open the XSCT console, and execute a couple of commands.

First we need to pause our Tape module and reset the read address:

mwr 0x43c00008 6

Next we should load the tape image into memory:

mwr -size b -bin -file "/home/johan/Downloads/Dan Dare.tap" 0x238250 600000

Next we release the reset bit:

mwr 0x43c00008 4

You can now move to the USB keyboard and follow the conventional process for kicking off the loading of a tape on a C64. That is typing LOAD and when prompted to press play on tape, just press ~.

Currently our VIC-II doesn't support screen blanking, as you would see when loading a program from tape. However, we can see the prompts as something is loading:


In Summary

In this post we finally integrated our Tape Module into our C64 Module.

As a test we managed to load a file header from a .tap file and display the file name.

In the next post we will implement the necessary functionality for displaying the loading effects while loading the game Dan Dare.

Till next time! 

Saturday, 15 June 2019

Integrating Tape Interface with C64 Module: Part 4

Foreword

In the previous post we developed and integrated timers to our CIA module.

We also ran a simulation to verify that our timers work more or less as expected within the CIA module.

In this post we will be implementing interrupts within our CIA module.

With interrupts fully implemented within our CIA module, we are one step closer to integrating our tape module into our C64 module.

An overview of Interrupts on the CIA

The easiest way to understand how interrupts work on the CIA, is just to look at the datasheet of a CIA 6526.

A copy can easily be obtained from the archives of 6502.org: http://archive.6502.org/datasheets/mos_6526_cia_recreated.pdf

Within the datasheet, we can see that there is one register taking care of all interrupt functionality:

So, all the interrupt functionality is provided by register number 13 of the CIA, the Interrupt Control Register (ICR). Behind the scenes, however, register 13 is in actual fact two separate registers: When writing to location 13, the value will be saved in an Interrupt Mask register. When reading from this location, the value of the Interrupt Status Register will be returned.

At this point a typical question will pop up: If you cannot write to the Interrupt Status register, how can you clear the interrupts that occurred from this register?

The answer: By reading this register.

You can see that the CIA supports five interrupts. In our whole C64 project, however, we will only be using three interrupts:

  • Timer A
  • Timer B, and
  • FLG

The FLG is the interrupt we will be using to connect our Tape interface in a later post.

Defining Interrupts on the Timers


We will be using timers to test our interrupt functionality.

Our Timer module, as it stands currently, doesn't have an output defined to signal a timer interrupt.

So, let us start off by first defining an output port on our timer module for signalling an interrupt:

module timer(
  input [15:0] reload_val,
  input force_reload,
  input new_started,
  input new_runmode,
  input write_control,
  input clk,
  output [15:0] counter_out,
  output started_status,
  output runmode_status,
  output overflow
    );


This pin will obviously be set to a one when our counter reaches zero:

assign overflow = (counter == 0) ? 1 : 0;

We need to wire these pins up for Timer A and Timer B in our CIA module:

...
  wire timer_a_overflow;
  wire timer_b_overflow;
...
  timer timer_a(
    .reload_val({slave_reg_5,slave_reg_4}),
    .force_reload(write_cra & data_in[4]),
    .new_started(data_in[0]),
    .new_runmode(data_in[3]),
    .write_control(write_cra),
    .clk(clk),
    .counter_out(counter_a_val),
    .started_status(started_status_a),
    .runmode_status(runmode_status_a), 
    .overflow(timer_a_overflow)
      );

  timer timer_b(
    .reload_val({slave_reg_7,slave_reg_6}),
    .force_reload(write_crb & data_in[4]),
    .new_started(data_in[0]),
    .new_runmode(data_in[3]),
    .write_control(write_crb),
    .clk(clk),
    .counter_out(counter_b_val),
    .started_status(started_status_b),
    .runmode_status(runmode_status_b), 
    .overflow(timer_b_overflow)
      );
...

We have now defined two interrupt sources that we can use to develop the interrupt functionality for our CIA.

Developing the Interrupt functionality

Let us now develop the rest of the interrupt functionality.

As mentioned earlier, the ICR is in effect two registers. So, let us start by creating them:

  reg [4:0] int_mask = 0;
  reg [4:0] int_stat = 0;


Since the CIA only supports 5 interrupts, I am only making these registers 5 bits wide, instead of the usual 8.

Let us create the logic for setting the contents of the interrupt mask register:

  always @(posedge clk)
  if (we)
  case (addr)
    0: slave_reg_0 <= data_in;
    1: slave_reg_1 <= data_in;
    2: slave_reg_2 <= data_in;
    3: slave_reg_3 <= data_in;
    4: slave_reg_4 <= data_in;
    5: slave_reg_5 <= data_in;
    6: slave_reg_6 <= data_in;
    7: slave_reg_7 <= data_in;
    13: if (data_in[7]) 
          int_mask <= int_mask | data_in[4:0];
        else
          int_mask <= int_mask & ~data_in[4:0];
    14: slave_reg_14 <= data_in;
    15: slave_reg_15 <= data_in;
  endcase

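If this looks a bit strange, just remember that the MSB of the value written to the Interrupt Mask register has an important role. If the MSB is one, every interrupt whose bit is set in the lower five bits gets enabled. If the MSB is zero, every interrupt whose bit is set in the lower five bits gets masked off. Bits that are zero in the written value leave the corresponding mask bits unchanged.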
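To make the set/clear behaviour concrete, here is a small stand-alone sketch that applies the same rule to a sequence of writes. The module name int_mask_demo and the function next_mask are hypothetical names used only for this illustration; they are not part of the CIA module.

```verilog
// Stand-alone illustration of the mask update rule used in the case statement.
// int_mask_demo and next_mask are hypothetical names, not part of the CIA module.
module int_mask_demo;

  // Returns the new mask, given the current mask and the byte written to register 13.
  function [4:0] next_mask;
    input [4:0] cur;
    input [7:0] wr;
    begin
      if (wr[7])
        next_mask = cur | wr[4:0];    // MSB set: enable the selected sources
      else
        next_mask = cur & ~wr[4:0];   // MSB clear: mask off the selected sources
    end
  endfunction

  reg [4:0] mask;

  initial begin
    mask = 5'b00000;
    mask = next_mask(mask, 8'h81);    // enable Timer A
    $display("write 81 -> mask %b", mask);   // 00001
    mask = next_mask(mask, 8'h82);    // additionally enable Timer B
    $display("write 82 -> mask %b", mask);   // 00011
    mask = next_mask(mask, 8'h01);    // mask off Timer A only
    $display("write 01 -> mask %b", mask);   // 00010
    mask = next_mask(mask, 8'h1F);    // mask off everything
    $display("write 1F -> mask %b", mask);   // 00000
  end
endmodule
```

Note that writing $1F, as the test program later in this post does first, clears all five mask bits because the MSB is zero.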

Next, let us tackle the Interrupt Status register. Before we look into this register, it is important to realise that our CIA module in its current state has a design anomaly.

Our CIA is currently linked up to the lower 4 bits of the address bus, and for all address reads our CIA will return data for the register these four bits represent.

In general, this is not a problem for us, because we have some multiplexing logic that will just ignore these values if we didn't target a CIA with the read.

This does, however, become a problem with reads from the Interrupt Status register, since a read clears the contents of the Interrupt Status register.

We therefore need some way to indicate to the CIA when we are targeting a read towards it. We do this by defining an extra port:

module cia(
...
  input chip_select,
...
    );

From the outside, we assign this port as follows:

    cia cia_1(
...
          .chip_select(addr[15:8] == 8'hdc),
...
            );


One might be tempted to think, shouldn't we rather check addr[15:4]?

The answer is no. On a C64 the registers of CIA#1 are mirrored throughout the memory range DC00-DCFF, so within this range the lower four bits of the address are always meant for CIA#1.

We can now do the assignment to the Interrupt Status register:

  always @(posedge clk)
  if (chip_select & !we & (addr == 13))
    int_stat <= 0;
  else
    int_stat <= int_stat | {flag1, 1'b0, 1'b0, timer_b_overflow, timer_a_overflow};

Flag1 is only shown for the sake of completeness.
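One thing to be aware of with the block above: an event that arrives on the very clock edge on which the register is read gets discarded together with the old flags. The following is a minimal sketch of a variant that keeps such an event. The module name int_status_sketch and the ports read_icr and events are assumptions made for this illustration, and I am not claiming that a real 6526 behaves exactly this way.

```verilog
// Hypothetical variant of the status register logic, shown as its own module.
// read_icr is assumed to be chip_select & !we & (addr == 13);
// events is assumed to be {flag1, 1'b0, 1'b0, timer_b_overflow, timer_a_overflow}.
module int_status_sketch(
  input clk,
  input read_icr,
  input [4:0] events,
  output reg [4:0] int_stat
    );

  initial int_stat = 0;

  always @(posedge clk)
  if (read_icr)
    int_stat <= events;              // drop the old flags, keep events arriving right now
  else
    int_stat <= int_stat | events;   // accumulate until the next read
endmodule
```

For the test in this post the difference doesn't matter, so the simpler version above is kept in the CIA module.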

The case statement for assigning data_out looks as follows:

  always @(posedge clk)
  if(!we)
  case (addr)
    0: data_out <= slave_reg_0;
    1: data_out <= port_b_in;
    2: data_out <= slave_reg_2;
    3: data_out <= slave_reg_3;
    4: data_out <= counter_a_val[7:0];
    5: data_out <= counter_a_val[15:8];
    6: data_out <= counter_b_val[7:0];
    7: data_out <= counter_b_val[15:8];
    13: data_out <= {irq, 2'b00, int_stat}; // bit 7 = IRQ flag, bits 6-5 read as zero
    14: data_out <= {slave_reg_14[7:5],
                     1'b1,
                     runmode_status_a,
                     slave_reg_14[2],
                     slave_reg_14[1],
                     started_status_a
    };
    15: data_out <= {slave_reg_15[7:5],
                     1'b1,
                     runmode_status_b,
                     slave_reg_15[2],
                     slave_reg_15[1],
                     started_status_b
    };

  endcase


You will see that we make reference to an irq, which we haven't defined yet.

This is the output we will actually hook up to the irq pin of our 6502. We declare this wire within our CIA module as follows:

module cia(
...
  output irq
    );
...
  assign irq = (int_stat & int_mask) ? 1 : 0;
...

An interrupt will only resolve to an IRQ if enabled by the Interrupt Mask Register.

The Test Program

To test the interrupt functionality we will again write a 6502 test program that we will run in simulation mode:

      SEI
      LDA #$1F
      STA $DC0D
      LDA $DC0D
      CLI
      LDA #$64
      STA $DC04
      LDA #$00
      STA $DC05
      LDA #$11
      STA $DC0E
      JSR DELAY
      LDA #$81
      STA $DC0D
      JSR DELAY
      LDA #$01
      STA $DC0D
      JSR DELAY 
      NOP
      NOP
      NOP      
DELAY LDX #$14
LOOP  DEX
      BNE LOOP
      LDA $DC04
      LDX $DC0D
      RTS


With this program we let Timer A expire twice: the first time with its interrupt masked off and the second time with its interrupt enabled.

In this test program I haven't written an interrupt handler. It is sufficient for now to verify in the simulation that a jump to the interrupt vector is performed.
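Instead of hunting through the waveform for the jump, one could let the simulator report it. The 6502 fetches its IRQ vector from $FFFE/$FFFF, so a read of those two addresses shows that the interrupt has been taken. Below is a minimal sketch of such a watcher; the module name irq_vector_watch and its ports are assumptions, and it is simulation-only code.

```verilog
// Simulation-only helper that reports reads of the 6502 IRQ vector.
// irq_vector_watch is a hypothetical name; connect addr and we to the CPU signals.
module irq_vector_watch(
  input clk,
  input we,
  input [15:0] addr
    );

  always @(posedge clk)
  if (!we && (addr == 16'hFFFE || addr == 16'hFFFF))
    $display("%0t: IRQ vector byte read at %h", $time, addr);
endmodule
```

How the CPU address bus is reached from the test bench (for instance through a hierarchical reference) depends on your own top module, so that part is left out here.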

In Summary

In this post we have implemented interrupts within our CIA module.

In the next post we will verify if our FPGA can boot ok with the CIA module.

We will then wire up the Tape module to our C64 module.

Till next time!

Monday, 10 June 2019

Integrating Tape Interface with C64 Module: Part 3

Foreword

In the previous post we started to develop a CIA module for our C64 FPGA design.

At the end of the previous post we have implemented the Port A and Port B of the CIA, which we used for keyboard interfacing.

In this post we are going to implement the timers (i.e. Timer A and Timer B) within our CIA module.

Developing the Timer Module

A timer is a crucial component of a CIA chip. So, let us begin by summarising a timer's operation:

  • A timer counts down from a predefined value till it reaches zero.
  • As soon as the timer reaches zero, an underflow condition occurs. With the underflow condition the timer gets reloaded with a predefined value.
  • With the timer reloaded, one of two things can happen:
    • Continuous mode: The timer will continue counting from the predefined value towards zero
    • One-shot mode: With the timer reloaded, the timer will stop. 
As seen from the above, a crucial part of a timer's operation is deciding when the timer should start, and when it should stop. A START register bit is specifically commissioned for this purpose.

If the START bit is one, the timer will count. As soon as the START bit is set to zero, the timer will stop counting.

There are two sources that can change the START bit:

  • You can set this bit yourself
  • Underflow condition in One-shot mode: With an underflow condition in one-shot mode, the START bit will be changed from a one to a zero.
With the theory discussed, let us start to create the timer module.

The core of the timer module is the counter itself, so let us start by defining it as a register:

  reg [15:0] counter = 0;

As discussed, we also need to store the state of START and RUNMODE. So, we will define a register for each:

  reg started = 0;
  reg runmode = 0;

At this point, the question is: How do we externally set the state of these two registers?

Obviously we will start by defining input ports for setting these values:

module timer(
...
  input new_started,
  input new_runmode,
  input clk,
...
    );


Also, it would be nice to indicate when the state inputs are valid. We will therefore create a new input port:

module timer(
...
  input write_control,
...
    );

The following snippet shows the setting of the RUNMODE state:

  always @(posedge clk)
  if (write_control)
    runmode <= new_runmode;


The setting of the START state is a bit more complex, since we can either set this state manually, or it can get cleared during an underflow condition in one-shot mode:

  always @(posedge clk)
  if (counter == 0 & runmode)
    started <= 0;
  else if (write_control)
    started <= new_started;


Let us now write some code for updating the counter itself:

module timer(
  input [15:0] reload_val,
  input force_reload,
...
    );

...
  always @(posedge clk)
  if (force_reload & write_control)
      counter <= reload_val;
  else if (counter > 0 & started)
    counter <= counter - 1;
  else if (counter == 0 & started)
    counter <= reload_val;
...

You can see that I have introduced two new ports which I haven't discussed before: reload_val and force_reload.

reload_val contains the predefined starting value for our timer. I have decided not to store this value internally, so with each reload I will fetch the value externally.

After some consideration, I thought it was best to store the reload values in registers on the CIA module itself, so it is not necessary to inspect the write_control value when reloading the counter.

force_reload is another feature of timers on a CIA. At any point, whether the timer is counting down or not, you can reload the timer by asserting the force_reload input.

We are just about done. What we still need to do, is to expose the internal state of our timer to the CIA:

module timer(
...
  output [15:0] counter_out,
  output started_status,
  output runmode_status
    );
...
  assign started_status = started;
  assign runmode_status = runmode;
   
  assign counter_out = counter;
...
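For convenience, here are the fragments above put together into one module. Nothing new is added. Only the order of the ports is an assumption on my side, since the fragments don't prescribe one.

```verilog
// The timer fragments from this post assembled into a single module.
module timer(
  input [15:0] reload_val,
  input force_reload,
  input new_started,
  input new_runmode,
  input write_control,
  input clk,
  output [15:0] counter_out,
  output started_status,
  output runmode_status
    );

  reg [15:0] counter = 0;
  reg started = 0;
  reg runmode = 0;   // 1 = one-shot, 0 = continuous

  // RUNMODE only changes when the control register is written
  always @(posedge clk)
  if (write_control)
    runmode <= new_runmode;

  // START is cleared on underflow in one-shot mode, else set by a control write
  always @(posedge clk)
  if (counter == 0 & runmode)
    started <= 0;
  else if (write_control)
    started <= new_started;

  // Forced reload, count down, or reload on underflow
  always @(posedge clk)
  if (force_reload & write_control)
    counter <= reload_val;
  else if (counter > 0 & started)
    counter <= counter - 1;
  else if (counter == 0 & started)
    counter <= reload_val;

  assign started_status = started;
  assign runmode_status = runmode;
  assign counter_out = counter;
endmodule
```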

Testing the Timer

Before integrating the timer with the CIA module, we need to test the timer module a bit.

The following top module will aid in testing the timer module:

module top(

    );

reg clk = 0;
reg force_reload = 0;
reg started = 0;
reg run_mode = 1;
reg write_control = 0;
wire [15:0] counter_val;

timer timer_a(
  .reload_val(20),
  .force_reload(force_reload),
  .new_started(started),
  .new_runmode(run_mode),
  .write_control(write_control),
  .clk(clk),
  .counter_out(counter_val) 
);

initial begin
  #50
  @(negedge clk)
  force_reload = 1;
  write_control = 1;
  @(negedge clk)
  force_reload = 0;
  write_control = 0;
  #50
  @(negedge clk)
  write_control = 1;
  started = 1;
  @(negedge clk)
  write_control = 0;
  #1000
  @(negedge clk)
  write_control = 1;
  started = 1;
  run_mode = 0;
  @(negedge clk)
  write_control = 0;
//  started = 1;
  
end

always #5 clk = ~clk;
endmodule


Here we test our timer with a force reload, and starting and stopping the timer.

Integration into the CIA module

With our Timer Module tested, it is now time to integrate this module into our CIA module.

We start by adding a couple of new slave registers:

  reg [7:0] slave_reg_0 = 0;
  reg [7:0] slave_reg_1 = 0;
  reg [7:0] slave_reg_2 = 0;
  reg [7:0] slave_reg_3 = 0;
  reg [7:0] slave_reg_4 = 255;
  reg [7:0] slave_reg_5 = 255;
  reg [7:0] slave_reg_6 = 255;
  reg [7:0] slave_reg_7 = 255;  
  reg [7:0] slave_reg_14 = 255;
  reg [7:0] slave_reg_15 = 255;


We then proceed and add some new wires:

  wire write_cra;
  wire write_crb;
  
  wire [15:0] counter_a_val;
  wire started_status_a;
  wire runmode_status_a; 

  wire [15:0] counter_b_val;
  wire started_status_b;
  wire runmode_status_b; 

  
  assign write_cra = we & (addr == 14) ? 1 : 0;
  assign write_crb = we & (addr == 15) ? 1 : 0;


I have defined write_cra and write_crb to signal our timer module when we are setting state, that is a write to register 14 or 15.

Here is how we define the two timer instances (i.e. Timer A and Timer B):

  timer timer_a(
    .reload_val({slave_reg_5,slave_reg_4}),
    .force_reload(write_cra & data_in[4]),
    .new_started(data_in[0]),
    .new_runmode(data_in[3]),
    .write_control(write_cra),
    .clk(clk),
    .counter_out(counter_a_val),
    .started_status(started_status_a),
    .runmode_status(runmode_status_a) 

      );

  timer timer_b(
    .reload_val({slave_reg_7,slave_reg_6}),
    .force_reload(write_crb & data_in[4]),
    .new_started(data_in[0]),
    .new_runmode(data_in[3]),
    .write_control(write_crb),
    .clk(clk),
    .counter_out(counter_b_val),
    .started_status(started_status_b),
    .runmode_status(runmode_status_b) 

      );


What remains to be done is to modify the functionality to write/read to the new CIA registers:

  always @(posedge clk)
  if(!we)
  case (addr)
    0: data_out <= slave_reg_0;
    1: data_out <= port_b_in;
    2: data_out <= slave_reg_2;
    3: data_out <= slave_reg_3;
    4: data_out <= counter_a_val[7:0];
    5: data_out <= counter_a_val[15:8];
    6: data_out <= counter_b_val[7:0];
    7: data_out <= counter_b_val[15:8];
    
    14: data_out <= {slave_reg_14[7:5],
                     1'b1,
                     runmode_status_a,
                     slave_reg_14[2],
                     slave_reg_14[1],
                     started_status_a
    };
    15: data_out <= {slave_reg_15[7:5],
                     1'b1,
                     runmode_status_b,
                     slave_reg_15[2],
                     slave_reg_15[1],
                     started_status_b
    };

  endcase
  
  always @(posedge clk)
  if (we)
  case (addr)
    0: slave_reg_0 <= data_in;
    1: slave_reg_1 <= data_in;
    2: slave_reg_2 <= data_in;
    3: slave_reg_3 <= data_in;
    4: slave_reg_4 <= data_in;
    5: slave_reg_5 <= data_in;
    6: slave_reg_6 <= data_in;
    7: slave_reg_7 <= data_in;
    
    14: slave_reg_14 <= data_in;
    15: slave_reg_15 <= data_in;
  endcase


Testing the CIA

With the timers integrated we can now run a small 6502 assembly test program to see if the CIA functions correctly as a whole:

LDA #$20
STA $DC04
LDA #$1
STA $DC05
LDA $DC0E
ORA #$8
STA $DC0E
ORA #$1
STA $DC0E
NOP
NOP
NOP
NOP
NOP
NOP
LDA $DC0E
ORA #$10
STA $DC0E
NOP
NOP
NOP
NOP
NOP
NOP

This code will test timer A in One-Shot mode. With little adjustment, the code can be modified to also test timer B.

In Summary

In this post we have implemented timers within our CIA module.

In the next post we will implement the Interrupt functionality of the CIA.

Once we have implemented interrupts on our CIA, we will be able to interface the Tape module to our C64 module.

Till next time!


Monday, 3 June 2019

Integrating Tape Interface with C64 Module: Part 2

Foreword

In the previous post we managed to integrate our tape module into our existing C64 design and verified that our Tape module produced pulses of the correct widths given a particular .TAP file stored in SDRAM.

At this point in time, however, our Tape module is not wired to our C64 module so that it can do something useful for us. By useful I mean loading the .TAP file into the C64 so we can play the game these pulse widths represent 😃

Of course, if we had a fully developed C64 module, we could implement the functionality mentioned in the previous paragraph by simply taking the output of our Tape Module and connecting it to the FLAG input of a CIA#1 module and there you go!

But, in our case we don't have a fully developed C64 module. In this series of blog posts we are adding functionality in an incremental fashion and at this point we don't have a fully functional CIA module.

It is therefore the purpose of this post and one or two subsequent posts to develop a fully functional CIA module. With this module developed, it will just be a matter of Plug & Play with our Tape module.

In the process of developing a CIA module, I will be doing a bit of code refactoring, moving existing code  from the C64 module to the CIA module and into other new modules.

In this whole refactoring exercise, we will potentially introduce many bugs and it may be worthwhile to make use of simulation again to make it easier ironing out these bugs.

You might remember from some previous posts that simulating the booting of a C64 system within a Verilog simulator can be quite a time consuming process. You can wait between 20 and 30 minutes till you get to the point where the welcome message is written to screen memory.

This waiting can be quite a nuisance if you want to fix something small and would just like to test if the fix worked.

I will show you some techniques on how you can drastically trim down on simulation wait time, to make life less frustrating.

Directives

You might have heard about C pre-processor directives, where you make use of defines that get expanded before compile time.

In Verilog we can also make use of directives. I will be making use of directives in this post so that we can configure our C64 module to run either in simulation mode or in "real" mode.

Directives just make life easier: simulation mode code and normal code can live together, and you don't need to make so many changes each time you switch between modes.

We will start by adding the following define right in the beginning of our C64 module:

`define SIM

This define signals that we want to run in simulation mode. Should we want to disable this mode, you can just comment this line out.
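If you haven't used Verilog directives before, the following tiny stand-alone sketch shows the mechanism in isolation. The module name mode_demo and the parameter RESET_CYCLES, together with its two values, are made up for this illustration and don't appear in the C64 design.

```verilog
`define SIM

// Stand-alone illustration of `ifdef; mode_demo and RESET_CYCLES are hypothetical.
module mode_demo;

`ifdef SIM
  localparam RESET_CYCLES = 10;        // short reset, only compiled when SIM is defined
`else
  localparam RESET_CYCLES = 1000000;   // long reset, compiled when SIM is not defined
`endif

  initial $display("reset held for %0d cycles", RESET_CYCLES);
endmodule
```

With the define in place only the first branch is compiled. Comment the define out and the simulator will only ever see the second branch.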

The first place where we would use this define, would be where we create an instance of our CPU:

`ifdef SIM
    cpu mycpu ( clk_1_mhz, proc_rst, addr, combined_d_out, ram_in, we, int_occ/*0*/, 1'b0, 1'b1 );
`else
    cpu mycpu ( clk_1_mhz, c64_reset, addr, combined_d_out, ram_in, we, int_occ/*0*/, 1'b0, 1'b1 );
`endif    


Notice that the only difference between the two declarations is in the second port. In simulation mode we connect this port to proc_rst and in the other mode to c64_reset.

c64_reset is a port we have defined previously on our VIC-II module. The problem with this port is that the reset it produces lasts a very large number of clock cycles. Connecting to this port during simulation would indeed cause our simulation to take very long to complete.

For simulation we therefore connect to proc_rst, which is an input port on our C64 module. When simulating we can create a top module and connect the proc_rst port to a simulated reset with a much shorter duration.

To make life easier, we can also disable the creation of a VIC-II module during simulation:

`ifndef SIM
vic_ii vic_inst
    (
        .clk_in(clk),
        .clk_counter(clk_div_counter),
        .clk_2_mhz(clk_2_mhz),
        .blank_signal(blank_signal),
        .frame_sync(frame_sync),
        .data_in({14'd4,vic_combined_d}),
        .c64_reset(c64_reset),
        .addr(vic_addr),
        .out_rgb(out_rgb)
        );
        
    burst_block burst_tst(
            .clk(axi_clk_in),
            .reset(proc_rst),
            .write(write_pin),
            .next_frame(frame_sync),
            .write_data({pixel_16_bit_delay,pixel_16_bit}), 
            .count_in_buf(count_in_buf),
            //output src ready
            //-----------------------------------------
            .ip2bus_mst_addr(ip2bus_mst_addr),
            .ip2bus_mst_length(ip2bus_mst_length),
            .ip2bus_mstwr_d(ip2bus_mstwr_d),
            .ip2bus_inputs(ip2bus_inputs),
            .ip2bus_otputs(ip2bus_otputs),
            .read(read)
              );
`endif


Notice this time we make use of ifndef, meaning if not defined. So these two blocks will only be added if we are not in simulation mode.

You will also see that I am removing the burst_block, required for AXI memory access, when doing simulation.

Moving keyboard functionality into its own module

In our C64 module's current state, we have some entangled keyboard functionality and CIA functionality.

It makes sense to split this functionality, which will also mark the start of our new CIA module.

Let us start with a keyboard module:

`timescale 1ns / 1ps

module keyboard(
  input [31:0] key_matrix_0,
  input [31:0] key_matrix_1,
  input [7:0] keyboard_control_byte,
  output [7:0] keyboard_result_byte
    );
    
    wire [7:0] keyboard_row_0;
    wire [7:0] keyboard_row_1;
    wire [7:0] keyboard_row_2;
    wire [7:0] keyboard_row_3;
    wire [7:0] keyboard_row_4;
    wire [7:0] keyboard_row_5;
    wire [7:0] keyboard_row_6;
    wire [7:0] keyboard_row_7;     

    assign keyboard_row_0 = key_matrix_0[7:0];
    assign keyboard_row_1 = key_matrix_0[15:8];
    assign keyboard_row_2 = key_matrix_0[23:16];
    assign keyboard_row_3 = key_matrix_0[31:24];
    assign keyboard_row_4 = key_matrix_1[7:0];
    assign keyboard_row_5 = key_matrix_1[15:8];
    assign keyboard_row_6 = key_matrix_1[23:16];
    assign keyboard_row_7 = key_matrix_1[31:24];
    
    assign keyboard_result_byte = ~((~keyboard_control_byte[0] ? keyboard_row_0 : 0) |           
                                   (~keyboard_control_byte[1] ? keyboard_row_1 : 0) |
                                   (~keyboard_control_byte[2] ? keyboard_row_2 : 0) |
                                   (~keyboard_control_byte[3] ? keyboard_row_3 : 0) |
                                   (~keyboard_control_byte[4] ? keyboard_row_4 : 0) |
                                   (~keyboard_control_byte[5] ? keyboard_row_5 : 0) |
                                   (~keyboard_control_byte[6] ? keyboard_row_6 : 0) |
                                   (~keyboard_control_byte[7] ? keyboard_row_7 : 0));
    
endmodule


The code almost looks identical to the old code, except that we move the code into its own module.

Let us quickly look at the ports in more detail:

  • key_matrix_0 and key_matrix_1: These correspond to slv_reg_0 and slv_reg_1, in which our ARM processor sets the bit for a particular key within the keyboard matrix.
  • keyboard_control_byte: This byte will be provided by the Port_A output of CIA#1
  • keyboard_result_byte: This will be fed back to CIA#1 via Port_B
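To see the ports in action, here is a small test bench sketch that drives the keyboard module listed above with a single key asserted in row 0. The name keyboard_tb is an assumption, and the test bench needs to be compiled together with the keyboard module.

```verilog
`timescale 1ns / 1ps

// Test bench sketch for the keyboard module above; keyboard_tb is a hypothetical name.
module keyboard_tb;

  reg  [7:0] control;
  wire [7:0] result;

  keyboard dut(
    .key_matrix_0(32'h0000_0001),   // one key asserted: bit 0 of row 0
    .key_matrix_1(32'h0000_0000),
    .keyboard_control_byte(control),
    .keyboard_result_byte(result)
      );

  initial begin
    control = 8'hFE;   // select row 0 (active low)
    #10 $display("control=%h result=%h", control, result);
    control = 8'hFD;   // select row 1, which has no key asserted
    #10 $display("control=%h result=%h", control, result);
    control = 8'hFF;   // select no row at all
    #10 $display("control=%h result=%h", control, result);
    $finish;
  end
endmodule
```

With row 0 selected the result should be FE, since the asserted key pulls bit 0 low. In the other two cases the result should be FF.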

Creating the CIA module

Let us now move on to the creation of the CIA module. Its code looks as follows:

`timescale 1ns / 1ps

module cia(
  output [7:0] port_a_out,
  input [7:0] port_b_in,
  input [3:0] addr,
  input we,
  input clk,
  input [7:0] data_in,
  output reg [7:0] data_out
    );
    
  reg [7:0] slave_reg_0 = 0;
  reg [7:0] slave_reg_1 = 0;
  reg [7:0] slave_reg_2 = 0;
  reg [7:0] slave_reg_3 = 0;
  
  assign port_a_out = slave_reg_0;
  
  always @(posedge clk)
  if(!we)
  case (addr)
    0: data_out <= slave_reg_0;
    1: data_out <= port_b_in;
    2: data_out <= slave_reg_2;
    3: data_out <= slave_reg_3;
  endcase
  
  always @(posedge clk)
  if (we)
  case (addr)
    0: slave_reg_0 <= data_in;
    1: slave_reg_1 <= data_in;
    2: slave_reg_2 <= data_in;
    3: slave_reg_3 <= data_in;
  endcase
endmodule


Let us look into the ports:


  • port_a_out: This is the Port A output and will be fed to the keyboard module
  • port_b_in: This port will receive the keyboard result byte from the keyboard module
  • addr: Joined to the address output of the CPU. Note we are only using the lower four bits since the CIA only has 16 registers.
  • we: Write enable. Set by the CPU if it wants to write something to one of the CIA registers
  • clk: 1 MHz clock
  • data_in: Data from the CPU
  • data_out: Data from the CIA to the CPU.
You will also see that we define a set of slave registers. The CIA has sixteen, but for now we have only defined 4 of them.

We also defined some functionality for reading and writing to these registers.

We have also linked up Port_A and Port_B.

Wiring everything up

Let us wire our two new modules up in the C64 Module:

...
    keyboard key_inst(
      .key_matrix_0(slave_0_reg),
      .key_matrix_1(slave_1_reg),
      .keyboard_control_byte(keyboard_control),
      .keyboard_result_byte(keyboard_result)
        );
        
    cia cia_1(
          .port_a_out(keyboard_control),
          .port_b_in(keyboard_result),
          .addr(addr[3:0]),
          .we(we & (addr[15:8] == 8'hdc)),
          .clk(clk_1_mhz),
          .data_in(ram_in),
          .data_out(cia_1_data_out)
            );
...
    always @*
        casex (addr_delayed)
          16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
          16'b111x_xxxx_xxxx_xxxx : combined_d_out = kernel_out;
          16'hd012: combined_d_out = line_counter;
          16'hdcxx: combined_d_out = cia_1_data_out;
          default: combined_d_out = ram_out;
        endcase


The port assignment is pretty straightforward. I just want to mention that, for cia_1, we only assert we if the address starts with DC.

Similarly, for combined_d_out, the data to the CPU, we send the data output of CIA_1 if the address starts with DC.

Finally, we need to create a top module for testing our C64 module in simulation mode:

module top(

    );
    
reg clk = 0;
reg reset = 1;
    
block_test my_c64(
      .clk(clk),      
      .proc_rst(reset),
      .slave_0_reg(1),
      .slave_1_reg(0)
        );

always #5 clk = ~clk;

initial begin
#100 reset = 0;
end    
endmodule


Here we do a very brief reset.

Also, in the slave registers we only assert one key.

Testing our new modules

Time to test our new modules.

As mentioned earlier, we can test our modules by booting our normal ROMs, but this would be too time consuming in a simulation.

We speed things up by writing a simple 6502 assembly test program. We will put this code in a copy of kernel ROM.

We start off by looking at the end of our kernel.hex file:

05
E5
4C
0A
E5
4C
00
E5
52
52
42
59
43
FE
E2
FC
48
FF

The reset vector is the byte pair E2 FC, the fourth- and third-last bytes in the listing (stored low byte first), so it is currently FCE2. The idea is to put our test program towards the end of the kernel ROM, so we will need to adjust the reset vector accordingly.

Here is the test program:

LDA #$FE
STA $DC00
LDA $DC01

So, we put the value FE on port a of CIA#1.  Since we are activating the first key in the key matrix in our top module, we are expecting to read back value FE from port B.

We modify the last part of kernel ROM as follows:

A9
FE
8D
00
DC
AD
01
DC
00
00
00
00
F0
FF
48
FF

Here our program starts at address FFF0 and I have adjusted the reset vector accordingly.
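Editing a hex file by hand is error prone, so it can be worthwhile to do a quick sanity check before kicking off a full simulation. The sketch below loads the file with $readmemh and prints the reset vector and the first opcode of the test program. The module name vector_check and the relative file path are assumptions, and the sketch assumes the file holds the complete 8KB kernel image with one byte per line.

```verilog
// Sanity check sketch for the edited kernel image; vector_check is a hypothetical name.
module vector_check;

  reg [7:0] rom [0:8191];   // 8KB kernel ROM, mapped at E000-FFFF

  initial begin
    $readmemh("kernel_debug.hex", rom);   // path is an assumption, adjust as needed
    // FFFC/FFFD correspond to offsets 1FFC/1FFD within the ROM (low byte first)
    $display("reset vector = %h%h", rom[13'h1FFD], rom[13'h1FFC]);
    // FFF0 corresponds to offset 1FF0
    $display("first opcode = %h", rom[13'h1FF0]);
  end
endmodule
```

For the file shown above one would expect a reset vector of FFF0 and a first opcode of A9 (LDA immediate).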

In our c64 module we can again make use of directives to make the switching between simulation and normal mode easier:

...
    `define SIM
    `ifdef SIM
      `define kernel_file "/home/johan/roms/kernel_debug.hex"
    `else
      `define kernel_file "/home/johan/Documents/roms/kernel.hex"
    `endif
...
    rom #(
         .ROM_FILE(`kernel_file)
        ) kernel(
          .clk(clk_1_mhz),
          .addr(addr[12:0]),
          .rom_out(kernel_out)
            );
...

So, for simulation, we use the file kernel_debug.hex, which contains our test program. This avoids copying and pasting ROMs around every time we want to switch to simulation mode.

A simulation run

When we run a simulation the result is the following:


The addr field is the addresses the CPU issues. Below the addr field is the data the CPU receives for the relevant addresses.

From the addresses we can see that our program eventually gets executed at FFF0.

Marked by the arrows we see at one stage we issue a read for address DC01 and we get the value FE. This is what we expect.

In Summary

In this post we started to develop a full blown CIA block, so that we will, in a later post, be able to load a .TAP file and execute it within our C64 module.

In this post we implemented enough functionality for the CIA for simulating keyboard access.

In the next post we will implement timers within our CIA.

Till next time!