C64 on an FPGA: June 2019

Wednesday, 19 June 2019

Integrating Tape Interface with C64 Module: Part 5

Foreword

In the previous post we implemented interrupts within our CIA module.

At last we are ready to integrate our Tape module with our C64 module, which is what we will cover in this post.

We will be using the tape image for the game Dan Dare.

The ultimate goal for this post doesn't sound very exciting: to find the header on this tape image and display the message Found Dan Dare.

We will continue using the tape image of Dan Dare in coming posts, working our way from getting the flashing borders and the splash loading screen to display, to the point where we can actually play the game.

FLAG1 as an edge triggered interrupt

If we were to take the output of our Tape module and connect it directly to the FLAG1 input of our CIA module, our FLAG1 would basically act as a level-triggered interrupt.

On a conventional CIA chip, however, all interrupts are edge-triggered. We therefore need to make a small modification so that our FLAG1 input behaves as an edge-triggered interrupt:

...
reg flag1_delayed;
...
    cia cia_1(
          .port_a_out(keyboard_control),
          .port_b_in(keyboard_result),
          .addr(addr[3:0]),
          .we(we & (addr[15:8] == 8'hdc)),
          .clk(clk_1_mhz),
          .chip_select(addr[15:8] == 8'hdc),
          .data_in(ram_in),
          .data_out(cia_1_data_out),
          .flag1(flag1 & !flag1_delayed),
          .irq(irq)
            );

    always @(posedge clk_1_mhz)
      flag1_delayed <= flag1;
...
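
To see why the delayed register turns a level into a single pulse, it can help to model the logic in software. The following is a minimal C sketch of the same idea, not code from the FPGA project: the type edge_detector and the function edge_detector_clock are names invented for this illustration, and each call stands for one rising edge of clk_1_mhz.

```c
#include <stdbool.h>

/* Software model of the one-cycle edge detector (illustrative names). */
typedef struct {
    bool flag1_delayed;   /* flag1 as sampled on the previous clock edge */
} edge_detector;

/* Call once per simulated clock edge.
 * Returns the value the CIA's flag1 input would see at that edge. */
bool edge_detector_clock(edge_detector *d, bool flag1)
{
    bool pulse = flag1 && !d->flag1_delayed;  /* high only on a 0 -> 1 change */
    d->flag1_delayed = flag1;                 /* models flag1_delayed <= flag1 */
    return pulse;
}
```

Holding flag1 high for several calls yields true only on the first call, which is the behaviour we want the CIA to see.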

Tape Button and motor control

If you have ever loaded a game from tape on a real C64, you will know that the C64 has control over the cassette motor: the tape briefly pauses when a file header is found, as well as when the file has finished loading.

How can the C64 control the tape motor in this way? The answer is via bits 4 and 5 of memory location 1. Quoting the documentation for this location:

Bit #4: Datasette button status; 0 = One or more of PLAY, RECORD, F.FWD or REW pressed; 1 = No button is pressed.
Bit #5: Datasette motor control; 0 = On; 1 = Off.

In order for our C64 module to emulate tape loading, we also need to emulate the above-mentioned bits.

We will use the ~/` button on the USB keyboard to represent the play button for Bit 4. We will hook up this key to our C64 in a coming section.

To serve the above-mentioned bits, we need to define an input port and an output port on our c64 module:

module c64(
...
  input wire tape_button,
  output reg motor_control = 1,
...
    );

The motor control bit gets set as follows:

    always @(posedge clk_1_mhz)
    if (we & (addr == 1))
      motor_control <= ram_in[5];

To cater for reads for location 1, we need to adjust our memory read case statement as follows:

    always @*
        casex (addr_delayed)
          16'b1: combined_d_out = {ram_out[7:6], motor_control, tape_button, ram_out[3:0]};
          16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
          16'b111x_xxxx_xxxx_xxxx : combined_d_out = kernel_out;
          16'hd012: combined_d_out = line_counter;
          16'hdcxx: combined_d_out = cia_1_data_out;
          default: combined_d_out = ram_out;
        endcase
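
The concatenation in the first case item can be sketched in C as plain bit masking. This is only an illustration of the bit layout; read_location_1 is a hypothetical helper and not part of the project.

```c
#include <stdint.h>
#include <stdbool.h>

/* Compose the byte the CPU reads from location 1.
 * Bits 7-6 and 3-0 come from RAM, bit 5 is the motor control latch
 * (0 = on, 1 = off), bit 4 is the tape button (0 = pressed). */
uint8_t read_location_1(uint8_t ram_out, bool motor_control, bool tape_button)
{
    return (uint8_t)((ram_out & 0xC0)
                   | (motor_control ? 0x20 : 0x00)
                   | (tape_button   ? 0x10 : 0x00)
                   | (ram_out & 0x0F));
}
```

Whatever the CPU previously wrote to bits 4 and 5 of RAM is ignored on a read; those two bits always reflect the live tape signals.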

We have now defined our motor control functionality within our C64 module, but we still need to route it to our Tape module:

module tape(
  input clk,
  input clk_1_mhz,
  input restart,
  input reset,
  output [31:0] ip2bus_mst_addr,
  output [11:0] ip2bus_mst_length,
  input [31:0] ip2bus_mstrd_d,
  output [4:0] ip2bus_inputs,
  input [5:0] ip2bus_otputs,
  input motor_control,
  output pwm
    );
...
tape_pwm t_pwm(
                  .time_val(timer_val),
                  .load_timer(load_timer),
                  .pwm(pwm),
                  .motor_control(motor_control),
                  .clk(clk_1_mhz)
                    );
..

In our pwm module we basically want to pause the generation of PWM pulses while the motor control bit is high (motor off):

...
  always @(posedge clk)
  if (timer > 0 & !motor_control)
    timer <= timer - 1;
  else if (timer == 0)
    timer <= load;
..
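
The pausing behaviour can be checked with a small software model. The sketch below mirrors the two branches of the always block in C; pwm_timer_next is a name made up for this illustration, and the 32-bit width of the timer is an assumption, since the width is not shown in the snippet above.

```c
#include <stdint.h>
#include <stdbool.h>

/* Returns the value the pulse timer takes after one clock edge.
 * motor_control follows the C64 convention: false = motor on, true = off. */
uint32_t pwm_timer_next(uint32_t timer, uint32_t load, bool motor_control)
{
    if (timer > 0 && !motor_control)
        return timer - 1;   /* motor on: keep counting down */
    else if (timer == 0)
        return load;        /* pulse finished: fetch the next pulse length */
    return timer;           /* motor off: hold the current value */
}
```

Note that, as in the Verilog, a timer that has already reached zero is still reloaded while the motor is off; it is only the countdown that is frozen.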

Assigning a keyboard button as play button

I mentioned earlier that I want to use the ~/` key on the USB keyboard as our play button.

You will recall from a previous post that we interface with a USB keyboard via a minimalistic USB protocol stack, in which we convert USB key scan codes to C64 scan codes and then set the appropriate bit in a key array linked to our C64 module.

We start by adding a mapping for the ~ key to the big if-else statement:

u32 mapUsbToC64(int usbCode) {
 if (usbCode == 0x4) { //A
  return 0xa;
 } else if (usbCode == 0x5) { //B
  return 0x1c;
 } else if (usbCode == 0x6) { //C
  return 0x14;
...
...
 } else if (usbCode == 0x2c) { //space
  return 0x3c;
 } else if (usbCode == 53) { //play key `~
  return 100;
 }


}

The USB scan code for the required key is 53, and we map this code to 100.

But C64 scan codes only go up to 63, so what is with the 100?

The point is that the play key is not actually a key in the key bit array that we want to set. Instead, we use 100 as a special value:

void getC64Words(u32 usbWord0, u32 usbWord1, u32 *c64Word0, u32 *c64Word1) {
  *c64Word0 = 0;
  *c64Word1 = 0;

  if (usbWord0 & 2) {
   *c64Word0 = 0x8000;
  }

  usbWord0 = usbWord0 >> 16;

  for (int i = 0; i < 2; i++) {
   int current = usbWord0 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     }

   }

   usbWord0 = usbWord0 >> 8;
  }

  for (int i = 0; i < 4; i++) {
   int current = usbWord1 & 0xff;
   if (current != 0) {
     int scanCode = mapUsbToC64(current);
     if (scanCode == 100) {
      Xil_Out32(0x43C00008, 0);
     } else if (scanCode < 32) {
     *c64Word0 = *c64Word0 | (1 << scanCode);
     } else {
     *c64Word1 = *c64Word1 | (1 << (scanCode - 32));
     }

   }

   usbWord1 = usbWord1 >> 8;
  }

}
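
The per-key handling is repeated in both loops, so it could be factored into a helper. The sketch below is one possible way to do that, under the assumption that the caller performs the actual register write when the play key is seen; apply_scan_code and PLAY_KEY_CODE are names made up here and do not appear in the original program. It also shifts an unsigned 1, which avoids shifting into the sign bit for scan code 31.

```c
#include <stdint.h>
#include <stdbool.h>

#define PLAY_KEY_CODE 100u   /* special value returned for the `~ key */

/* Apply one mapped scan code to the two key matrix words.
 * Returns true if the code was the play key, in which case the key
 * matrix is left untouched and the caller handles the tape button. */
bool apply_scan_code(uint32_t scan_code, uint32_t *c64_word0, uint32_t *c64_word1)
{
    if (scan_code == PLAY_KEY_CODE)
        return true;
    if (scan_code < 32)
        *c64_word0 |= (uint32_t)1 << scan_code;          /* keys 0..31 */
    else if (scan_code < 64)
        *c64_word1 |= (uint32_t)1 << (scan_code - 32);   /* keys 32..63 */
    return false;
}
```

With such a helper, each loop body would reduce to a single call followed by the register write when the function returns true.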

So if, during the conversion from USB to C64 scan codes, we encounter a 100, we write the value zero to memory location 0x43c00008.

Remember that 0x43c00008 is a register mapped to our AXI slave interface. Just to recap from a previous post:

0x43c0_0000: key array 1
0x43c0_0004: key array 2
0x43c0_0008: tape control:
                bit 0: PWM bit (READ ONLY). For debug
                bit 1: Reset tape position to address 0x238270 in memory

For register 0x43c0_0008 we will introduce an extra function for bit 2: the tape button. Our AXI slave block needs to be modified to accommodate this bit, so that it can be surfaced to the tape_button input of our C64 module.
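
As a rough sketch, the register layout could be captured in C along the following lines. The macro and function names are invented for this illustration. The polarity of the button bit (1 = no button pressed) is an assumption, inferred from the meaning of bit 4 of location 1 and from the fact that the key handler writes 0 when the play key is detected.

```c
#include <stdint.h>

#define TAPE_CTRL_ADDR    0x43C00008u
#define TAPE_CTRL_PWM     (1u << 0)   /* read only, for debug */
#define TAPE_CTRL_RESET   (1u << 1)   /* reset the tape read position */
#define TAPE_CTRL_BUTTON  (1u << 2)   /* assumed: 1 = released, 0 = pressed */

/* Build a value to write to the tape control register. */
uint32_t tape_ctrl_value(int reset, int button_released)
{
    uint32_t value = 0;
    if (reset)
        value |= TAPE_CTRL_RESET;
    if (button_released)
        value |= TAPE_CTRL_BUTTON;
    return value;
}
```

Under these assumptions, the values 6, 4 and 0 that are written to this register elsewhere in this post correspond to reset with the button released, running with the button released, and play pressed, respectively.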

Test Run

Time for a test run.

As usual, open Xilinx SDK, program the FPGA and start our C program on the ARM processor. In our case the C program runs a mini USB stack, capturing keystrokes and forwarding them to our C64 module.

With the C program running, we need to open the XSCT console, and execute a couple of commands.

First we need to pause our Tape module and reset the read address:

mwr 0x43c00008 6

Next we should load the tape image into memory:

mwr -size b -bin -file "/home/johan/Downloads/Dan Dare.tap" 0x238250 600000

Next we release the reset bit:

mwr 0x43c00008 4

You can now move to the USB keyboard and follow the conventional process for loading a tape on a C64: type LOAD, and when prompted to press play on tape, press the ~ key.

Currently our VIC-II doesn't support screen blanking, which you would normally see when loading a program from tape. However, this means that we can still see the prompts while something is loading.

In Summary

In this post we finally integrated our Tape Module into our C64 Module.

As a test we managed to load a file header from a .tap file and display the file name.

In the next post we will implement the necessary functionality for displaying the loading effects while loading the game Dan Dare.

Till next time!

Saturday, 15 June 2019

Integrating Tape Interface with C64 Module: Part 4

Foreword

In the previous post we developed and integrated timers to our CIA module.

We also ran a simulation to verify that our timers work more or less as expected within the CIA module.

In this post we will be implementing interrupts within our CIA module.

With interrupts fully implemented within our CIA module, we are one step closer to integrating our tape module with our C64 module.

An overview of Interrupts on the CIA

The easiest way to understand how interrupts work on the CIA is simply to look at the datasheet of the CIA 6526.

A copy can easily be obtained from the archives of 6502.org: http://archive.6502.org/datasheets/mos_6526_cia_recreated.pdf

Within the datasheet, we can see that there is one register taking care of all interrupt functionality:

So, all the interrupt functionality is provided by register number 13 of the CIA (the Interrupt Control Register, or ICR, at $DC0D on CIA#1). Behind the scenes, however, register 13 is in actual fact two separate registers: when writing to location 13, the value is saved in an Interrupt Mask register. When reading from this location, the value of the Interrupt Status register is returned.

At this point a typical question will pop up: if you cannot write to the Interrupt Status register, how can you clear interrupts that have occurred?

The answer: By reading this register.

You can see that the CIA supports five interrupts. In our whole C64 project, however, we will only be using three interrupts:

Timer A
Timer B, and
FLG

The FLG is the interrupt we will be using to connect our Tape interface in a later post.

Defining Interrupts on the Timers

We will be using timers to test our interrupt functionality.

Our Timer module, as it currently stands, doesn't have an output defined to signal a timer interrupt.

So, let us start off by defining an output port on our timer module for signalling an interrupt:

module timer(
  input [15:0] reload_val,
  input force_reload,
  input new_started,
  input new_runmode,
  input write_control,
  input clk,
  output [15:0] counter_out,
  output started_status,
  output runmode_status,
  output overflow
    );

This pin will obviously be set to one when our counter reaches zero:

assign overflow = (counter == 0) ? 1 : 0;

We need to wire these pins up for Timer A and Timer B in our CIA module:

...
  wire timer_a_overflow;
  wire timer_b_overflow;
...
  timer timer_a(
    .reload_val({slave_reg_5,slave_reg_4}),
    .force_reload(write_cra & data_in[4]),
    .new_started(data_in[0]),
    .new_runmode(data_in[3]),
    .write_control(write_cra),
    .clk(clk),
    .counter_out(counter_a_val),
    .started_status(started_status_a),
    .runmode_status(runmode_status_a), 
    .overflow(timer_a_overflow)
      );

  timer timer_b(
    .reload_val({slave_reg_7,slave_reg_6}),
    .force_reload(write_crb & data_in[4]),
    .new_started(data_in[0]),
    .new_runmode(data_in[3]),
    .write_control(write_crb),
    .clk(clk),
    .counter_out(counter_b_val),
    .started_status(started_status_b),
    .runmode_status(runmode_status_b), 
    .overflow(timer_b_overflow)
      );
...

We have now defined two interrupt sources that we can use to develop the interrupt functionality for our CIA.

Developing the Interrupt functionality

Let us now develop the rest of the interrupt functionality.

As mentioned earlier, the ICR is in effect two registers. So, let us start by creating them:

  reg [4:0] int_mask = 0;
  reg [4:0] int_stat = 0;

Since the CIA only supports 5 interrupts, I am only making these registers 5 bits wide, instead of the usual 8.

Let us create the logic for setting the contents of the interrupt mask register:

  always @(posedge clk)
  if (we)
  case (addr)
    0: slave_reg_0 <= data_in;
    1: slave_reg_1 <= data_in;
    2: slave_reg_2 <= data_in;
    3: slave_reg_3 <= data_in;
    4: slave_reg_4 <= data_in;
    5: slave_reg_5 <= data_in;
    6: slave_reg_6 <= data_in;
    7: slave_reg_7 <= data_in;
    13: if (data_in[7]) 
          int_mask <= int_mask | data_in[4:0];
        else
          int_mask <= int_mask & ~data_in[4:0];
    14: slave_reg_14 <= data_in;
    15: slave_reg_15 <= data_in;
  endcase

If this looks a bit strange, just remember that the MSB of the value written to the Interrupt Mask register has an important role. If the MSB is one, the interrupts whose bits are set in the written value are enabled. If the MSB is zero, those interrupts are masked off.
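
The set/clear behaviour of the mask write can be illustrated with a few lines of C. This is just a model of case 13 in the block above; int_mask_write is a hypothetical name.

```c
#include <stdint.h>

/* Returns the new 5-bit interrupt mask after the CPU writes data_in
 * to register 13. Bit 7 of data_in selects set (1) or clear (0);
 * bits 4-0 select which mask bits are affected. */
uint8_t int_mask_write(uint8_t int_mask, uint8_t data_in)
{
    uint8_t bits = data_in & 0x1F;
    if (data_in & 0x80)
        return (uint8_t)(int_mask | bits);          /* enable selected sources */
    return (uint8_t)(int_mask & (uint8_t)~bits);    /* disable selected sources */
}
```

For example, writing $81 enables the Timer A interrupt and leaves the other mask bits alone, while writing $01 disables it again.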

Next, let us tackle the Interrupt Status register. Before we look into this register, it is important to realise that our CIA module in its current state has a design anomaly.

Our CIA is currently linked to the lower 4 bits of the address bus, and for every read, whatever the address, our CIA will return data for the register these four bits represent.

In general this is not a problem for us, because we have some multiplexing logic that simply ignores these values if the read did not target a CIA.

It does, however, become a problem with reads from the Interrupt Status register, since a read clears the contents of this register.

We therefore need some way to indicate to the CIA when we are targeting a read towards it. We do this by defining an extra port:

module cia(
...
  input chip_select,
...
    );

From the outside, we assign this port as follows:

    cia cia_1(
...
          .chip_select(addr[15:8] == 8'hdc),
...
            );

One might be tempted to think: shouldn't we rather check addr[15:4]?

The answer is no. On a C64, CIA#1 responds to the whole memory range DC00-DCFF, with only the lower four address bits selecting the register, so the registers are simply mirrored throughout this range.
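
The address decoding described here can be sketched in C as follows. Both function names are made up for this illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* True for any address in the range $DC00-$DCFF. */
bool cia1_selected(uint16_t addr)
{
    return (addr >> 8) == 0xDC;
}

/* Register number (0-15) addressed within the CIA. */
uint8_t cia_register(uint16_t addr)
{
    return (uint8_t)(addr & 0x0F);
}
```

With this decoding, $DC0D and $DCFD both select CIA#1 and both address register 13.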

We can now do the assignment to the Interrupt Status register:

  always @(posedge clk)
  if (chip_select & !we & (addr == 13))
    int_stat <= 0;
  else
    int_stat <= int_stat | {flag1, 1'b0, 1'b0, timer_b_overflow, timer_a_overflow};

Flag1 is only shown for sake of completeness.
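
To make the latch-and-clear behaviour concrete, here is a small C model of the same always block. It is only a sketch; int_stat_next is an invented name, and icr_read stands for the condition chip_select & !we & (addr == 13).

```c
#include <stdint.h>
#include <stdbool.h>

/* Returns the 5-bit interrupt status after one clock edge. */
uint8_t int_stat_next(uint8_t int_stat, bool icr_read,
                      bool flag1, bool timer_b_overflow, bool timer_a_overflow)
{
    if (icr_read)
        return 0;   /* reading register 13 clears every pending bit */

    uint8_t sources = (uint8_t)((flag1            ? 0x10 : 0)
                              | (timer_b_overflow ? 0x02 : 0)
                              | (timer_a_overflow ? 0x01 : 0));
    return (uint8_t)(int_stat | sources);   /* bits stay set until read */
}
```

One consequence of this structure, visible in the model, is that an event arriving in the same cycle as the read is dropped rather than latched.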

The case statement for assigning data_out looks as follows:

  always @(posedge clk)
  if(!we)
  case (addr)
    0: data_out <= slave_reg_0;
    1: data_out <= port_b_in;
    2: data_out <= slave_reg_2;
    3: data_out <= slave_reg_3;
    4: data_out <= counter_a_val[7:0];
    5: data_out <= counter_a_val[15:8];
    6: data_out <= counter_b_val[7:0];
    7: data_out <= counter_b_val[15:8];
    13: data_out <= {irq, 2'b00, int_stat}; // bit 7 = IR, bits 6-5 read as zero
    14: data_out <= {slave_reg_14[7:5],
                     1'b1,
                     runmode_status_a,
                     slave_reg_14[2],
                     slave_reg_14[1],
                     started_status_a
    };
    15: data_out <= {slave_reg_15[7:5],
                     1'b1,
                     runmode_status_b,
                     slave_reg_15[2],
                     slave_reg_15[1],
                     started_status_b
    };

  endcase

You will see that we make reference to an irq, which we haven't defined yet.

This is the output we will actually hook up to the irq pin of our 6502. We declare this wire within our CIA module as follows:

module cia(
...
  output irq
    );
...
  assign irq = (int_stat & int_mask) ? 1 : 0;
...

An interrupt will only resolve to an IRQ if it is enabled in the Interrupt Mask register.
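
In C terms the irq assignment boils down to a masked test for any non-zero bit. The helper below is a hypothetical illustration, not project code.

```c
#include <stdint.h>
#include <stdbool.h>

/* True when at least one pending interrupt is also enabled. */
bool cia_irq(uint8_t int_stat, uint8_t int_mask)
{
    return (int_stat & int_mask & 0x1F) != 0;
}
```

A pending Timer A interrupt (status $01) therefore only raises irq once bit 0 of the mask has been set.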

The Test Program

To test the interrupt functionality we will again write a 6502 test program that we will run in simulation mode:

      SEI
      LDA #$1F
      STA $DC0D
      LDA $DC0D
      CLI
      LDA #$64
      STA $DC04
      LDA #$00
      STA $DC05
      LDA #$11
      STA $DC0E
      JSR DELAY
      LDA #$81
      STA $DC0D
      JSR DELAY
      LDA #$01
      STA $DC0D
      JSR DELAY 
      NOP
      NOP
      NOP      
DELAY LDX #$14
LOOP DEX
     BNE LOOP
     LDA $DC04
     LDX $DC0D
RTS

With this program we let Timer A expire twice, with interrupts disabled on the first expiry and enabled on the second.

I haven't written an interrupt handler for this test program. For now it is sufficient to verify in the simulation that a jump to the interrupt vector is performed.

In Summary

In this post we have implemented interrupts within our CIA module.

In the next post we will verify if our FPGA can boot ok with the CIA module.

We will then wire up the Tape module to our C64 module.

Till next time!

Monday, 10 June 2019

Integrating Tape Interface with C64 Module: Part 3

Foreword

In the previous post we started to develop a CIA module for our C64 FPGA design.

By the end of the previous post we had implemented Port A and Port B of the CIA, which we used for keyboard interfacing.

In this post we are going to implement the timers (i.e. Timer A and Timer B) within our CIA module.

Developing the Timer Module

A timer is a crucial component of a CIA chip. So, let us begin by summarising a timer's operation:

A timer counts down from a predefined value till it reaches zero.
As soon as the timer reaches zero, an underflow condition occurs. With the underflow condition the timer gets reloaded with a predefined value.
With the timer reloaded, one of two things can happen:

Continuous mode: The timer will continue counting from the predefined value towards zero
One-shot mode: With the timer reloaded, the timer will stop.

As seen from the above, a crucial part of a timer's operation is deciding when the timer should start and when it should stop. A START register bit is specifically commissioned for this purpose.

If the START bit is one, the timer will count. As soon as the START bit is set to zero, the timer will stop counting.

There are two sources that can change the START bit:

You can set this bit yourself
Underflow condition in one-shot mode: With an underflow condition in one-shot mode, the START bit will be changed from a one to a zero.

With the theory discussed, let us start to create the timer module.

The core of the timer module is the counter itself, so let us start by defining it as a register:

  reg [15:0] counter = 0;

As discussed, we also need to store the state of START and RUNMODE. So, we will define a register for both:

  reg started = 0;
  reg runmode = 0;

At this point, the question is: How do we externally set the state of these two registers?

Obviously we will start by defining input ports for setting these values:

module timer(
...
  input new_started,
  input new_runmode,
  input clk,
...
    );

Also, it would be nice to indicate when the state inputs are valid. We will therefore create a new input port:

module timer(
...
  input write_control,
...
    );

The following snippet shows the setting of the RUNMODE state:

  always @(posedge clk)
  if (write_control)
    runmode <= new_runmode;

The setting of the START state is a bit more complex, since we can either set this state manually, or it can get cleared during an underflow condition in one-shot mode:

  always @(posedge clk)
  if (counter == 0 & runmode)
    started <= 0;
  else if (write_control)
    started <= new_started;

Let us now write some code for updating the counter itself:

module timer(
  input [15:0] reload_val,
  input force_reload,
...
    );

...
  always @(posedge clk)
  if (force_reload & write_control)
      counter <= reload_val;
  else if (counter > 0 & started)
    counter <= counter - 1;
  else if (counter == 0 & started)
    counter <= reload_val;
...

You can see that I have introduced two new ports which I haven't discussed before: reload_val and force_reload.

reload_val contains the predefined starting value for our timer. I have decided not to store this value internally, so with each reload I will fetch the value externally.

After some consideration, I thought it was best to store the reload values in registers on the CIA module itself, so it is not necessary to inspect the write_control value when reloading the counter.

force_reload is another feature of timers on a CIA. At any point, whether the timer is counting down or not, you can reload the timer by asserting the force_reload input.

We are just about done. What we still need to do, is to expose the internal state of our timer to the CIA:

module timer(
...
  output [15:0] counter_out,
  output started_status,
  output runmode_status
    );
...
  assign started_status = started;
  assign runmode_status = runmode;
   
  assign counter_out = counter;
...
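
Before moving on to a Verilog testbench, the timer logic can also be checked with a quick software model. The following C sketch mirrors the three always blocks above, computing every register from the state before the clock edge, just as non-blocking assignments do. The type and function names are invented for this illustration.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint16_t counter;
    bool started;
    bool runmode;        /* true = one-shot, false = continuous */
} timer_state;

typedef struct {
    uint16_t reload_val;
    bool force_reload;
    bool new_started;
    bool new_runmode;
    bool write_control;
} timer_inputs;

/* Advance the model by one clock edge. */
timer_state timer_clock(timer_state s, timer_inputs in)
{
    timer_state next = s;

    if (in.write_control)
        next.runmode = in.new_runmode;

    if (s.counter == 0 && s.runmode)
        next.started = false;            /* one-shot: stop on underflow */
    else if (in.write_control)
        next.started = in.new_started;

    if (in.force_reload && in.write_control)
        next.counter = in.reload_val;
    else if (s.counter > 0 && s.started)
        next.counter = (uint16_t)(s.counter - 1);
    else if (s.counter == 0 && s.started)
        next.counter = in.reload_val;    /* underflow: reload */

    return next;
}
```

Stepping this model with a reload value of 3 in continuous mode gives the sequence 3, 2, 1, 0, 3, and so on, while in one-shot mode the timer reloads once and then stays stopped.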

Testing the Timer

Before integrating the timer with the CIA module, we need to test the timer module a bit.

The following top module will aid in testing the timer module:

module top(

    );

reg clk = 0;
reg force_reload = 0;
reg started = 0;
reg run_mode = 1;
reg write_control = 0;
wire [15:0] counter_val;

timer timer_a(
  .reload_val(20),
  .force_reload(force_reload),
  .new_started(started),
  .new_runmode(run_mode),
  .write_control(write_control),
  .clk(clk),
  .counter_out(counter_val) 
);

initial begin
  #50
  @(negedge clk)
  force_reload = 1;
  write_control = 1;
  @(negedge clk)
  force_reload = 0;
  write_control = 0;
  #50
  @(negedge clk)
  write_control = 1;
  started = 1;
  @(negedge clk)
  write_control = 0;
  #1000
  @(negedge clk)
  write_control = 1;
  started = 1;
  run_mode = 0;
  @(negedge clk)
  write_control = 0;
//  started = 1;
  
end

always #5 clk = ~clk;
endmodule

Here we test our timer with a force reload, and starting and stopping the timer.

Integration into the CIA module

With our Timer Module tested, it is now time to integrate this module into our CIA module.

We start by adding a couple of new slave registers:

  reg [7:0] slave_reg_0 = 0;
  reg [7:0] slave_reg_1 = 0;
  reg [7:0] slave_reg_2 = 0;
  reg [7:0] slave_reg_3 = 0;
  reg [7:0] slave_reg_4 = 255;
  reg [7:0] slave_reg_5 = 255;
  reg [7:0] slave_reg_6 = 255;
  reg [7:0] slave_reg_7 = 255;  
  reg [7:0] slave_reg_14 = 255;
  reg [7:0] slave_reg_15 = 255;

We then proceed and add some new wires:

  wire write_cra;
  wire write_crb;
  
  wire [15:0] counter_a_val;
  wire started_status_a;
  wire runmode_status_a; 

  wire [15:0] counter_b_val;
  wire started_status_b;
  wire runmode_status_b; 

  
  assign write_cra = we & (addr == 14) ? 1 : 0;
  assign write_crb = we & (addr == 15) ? 1 : 0;

I have defined write_cra and write_crb to signal our timer module when we are setting state, that is a write to register 14 or 15.

Here is how we define the two timer instances (i.e. Timer A and Timer B):

  timer timer_a(
    .reload_val({slave_reg_5,slave_reg_4}),
    .force_reload(write_cra & data_in[4]),
    .new_started(data_in[0]),
    .new_runmode(data_in[3]),
    .write_control(write_cra),
    .clk(clk),
    .counter_out(counter_a_val),
    .started_status(started_status_a),
    .runmode_status(runmode_status_a) 

      );

  timer timer_b(
    .reload_val({slave_reg_7,slave_reg_6}),
    .force_reload(write_crb & data_in[4]),
    .new_started(data_in[0]),
    .new_runmode(data_in[3]),
    .write_control(write_crb),
    .clk(clk),
    .counter_out(counter_b_val),
    .started_status(started_status_b),
    .runmode_status(runmode_status_b) 

      );
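
The bit positions used in these port connections can be summarised in a short C sketch. It shows which bits of a value written to a control register the timer looks at, and how the 16-bit reload value is assembled from the two 8-bit latch registers. The names timer_control, decode_timer_control and timer_latch are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool start;        /* bit 0: 1 = start the timer, 0 = stop it */
    bool one_shot;     /* bit 3: 1 = one-shot, 0 = continuous */
    bool force_load;   /* bit 4: 1 = reload the counter immediately */
} timer_control;

/* Decode a value written to control register 14 or 15. */
timer_control decode_timer_control(uint8_t data_in)
{
    timer_control c;
    c.start      = (data_in & 0x01) != 0;
    c.one_shot   = (data_in & 0x08) != 0;
    c.force_load = (data_in & 0x10) != 0;
    return c;
}

/* Combine the high and low latch bytes, as in {slave_reg_5, slave_reg_4}. */
uint16_t timer_latch(uint8_t hi, uint8_t lo)
{
    return (uint16_t)((hi << 8) | lo);
}
```

For example, a control value of $11 starts the timer in continuous mode with a forced reload, and latch bytes $01 (high) and $20 (low) give a reload value of $0120.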

What remains to be done is to modify the functionality to write/read to the new CIA registers:

  always @(posedge clk)
  if(!we)
  case (addr)
    0: data_out <= slave_reg_0;
    1: data_out <= port_b_in;
    2: data_out <= slave_reg_2;
    3: data_out <= slave_reg_3;
    4: data_out <= counter_a_val[7:0];
    5: data_out <= counter_a_val[15:8];
    6: data_out <= counter_b_val[7:0];
    7: data_out <= counter_b_val[15:8];
    
    14: data_out <= {slave_reg_14[7:5],
                     1'b1,
                     runmode_status_a,
                     slave_reg_14[2],
                     slave_reg_14[1],
                     started_status_a
    };
    15: data_out <= {slave_reg_15[7:5],
                     1'b1,
                     runmode_status_b,
                     slave_reg_15[2],
                     slave_reg_15[1],
                     started_status_b
    };

  endcase
  
  always @(posedge clk)
  if (we)
  case (addr)
    0: slave_reg_0 <= data_in;
    1: slave_reg_1 <= data_in;
    2: slave_reg_2 <= data_in;
    3: slave_reg_3 <= data_in;
    4: slave_reg_4 <= data_in;
    5: slave_reg_5 <= data_in;
    6: slave_reg_6 <= data_in;
    7: slave_reg_7 <= data_in;
    
    14: slave_reg_14 <= data_in;
    15: slave_reg_15 <= data_in;
  endcase

Testing the CIA

With the timers integrated, we can now run a small 6502 assembly test program to see if the CIA functions correctly as a whole:

LDA #$20
STA $DC04
LDA #$1
STA $DC05
LDA $DC0E
ORA #$8
STA $DC0E
ORA #$1
STA $DC0E
NOP
NOP
NOP
NOP
NOP
NOP
LDA $DC0E
ORA #$10
STA $DC0E
NOP
NOP
NOP
NOP
NOP
NOP

This code will test Timer A in one-shot mode. With a little adjustment, the code can be modified to also test Timer B.

In Summary

In this post we have implemented timers within our CIA module.

In the next post we will implement the Interrupt functionality of the CIA.

Once we have implemented interrupts on our CIA, we will be able to interface the Tape module with our C64 module.

Till next time!

Monday, 3 June 2019

Integrating Tape Interface with C64 Module: Part 2

Foreword

In the previous post we managed to integrate our tape module into our existing C64 design and verified that our Tape module produced pulses of the correct widths given a particular .TAP file stored in SDRAM.

At this point in time, however, our Tape module is not wired to our C64 module so that it can do something useful for us. By useful I mean loading the .TAP file into the C64 so we can play the game these pulse widths represent 😃

Of course, if we had a fully developed C64 module, we could implement the functionality mentioned in the previous paragraph by simply taking the output of our Tape Module and connecting it to the FLAG input of a CIA#1 module and there you go!

But, in our case we don't have a fully developed C64 module. In this series of blog posts we are adding functionality in an incremental fashion and at this point we don't have a fully functional CIA module.

It is therefore the purpose of this post and one or two subsequent posts to develop a fully functional CIA module. With this module developed, it will just be a matter of taking our Tape module and Plug & Play.

In the process of developing a CIA module, I will be doing a bit of code refactoring, moving existing code from the C64 module to the CIA module and into other new modules.

In this whole refactoring exercise we will potentially introduce many bugs, and it may be worthwhile to make use of simulation again to make ironing out these bugs easier.

You might remember from some previous posts that simulating the booting of a C64 system within a Verilog simulator can be quite time-consuming. You can wait between 20 and 30 minutes until you get to the point where the welcome message is written to screen memory.

This waiting can be quite a nuisance if you want to fix something small and would just like to test whether the fix worked.

I will show you some techniques on how you can drastically trim down on simulation wait time, to make life less frustrating.

Directives

You might have heard about C pre-processor directives, where you make use of defines that get expanded before compile time.

In Verilog we can also make use of directives. I will be making use of directives in this post so we can configure our C64 module to run either in simulation mode or in "real" mode.

Directives just make life easier: simulation-mode code and normal code can live together, and you don't need to make so many changes each time you switch between modes.

We will start by adding the following define right in the beginning of our C64 module:

`define SIM

This define signals that we want to run in simulation mode. Should we want to disable this mode, you can just comment this line out.

The first place where we would use this define, would be where we create an instance of our CPU:

`ifdef SIM
    cpu mycpu ( clk_1_mhz, proc_rst, addr, combined_d_out, ram_in, we, int_occ/*0*/, 1'b0, 1'b1 );
`else
    cpu mycpu ( clk_1_mhz, c64_reset, addr, combined_d_out, ram_in, we, int_occ/*0*/, 1'b0, 1'b1 );
`endif

Notice that the only difference between the two declarations is in the second port. In simulation mode we connect this port to proc_rst, and in the other mode to c64_reset.

c64_reset is a port we have defined previously on our VIC-II module. The problem with this port is that the reset it generates takes a very large number of clock cycles. Connecting to this port during simulation would therefore cause our simulation to take very long to complete.

For simulation we therefore connect to proc_rst, which is an input port on our C64 module. When simulating we can create a top module and connect the proc_rst port to a simulated reset with a much shorter duration.
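
For readers who know the C pre-processor better than Verilog directives, the same pattern looks roughly like this in C. It is only an analogy: RESET_CYCLES and reset_cycles are made-up names, and the two cycle counts are placeholder values rather than figures from the actual design.

```c
#define SIM   /* comment this line out for a "real" build */

#ifdef SIM
#define RESET_CYCLES 8u          /* placeholder: short reset for simulation */
#else
#define RESET_CYCLES 1000000u    /* placeholder: long reset on real hardware */
#endif

/* Number of clock cycles to hold the CPU in reset for this build. */
unsigned reset_cycles(void)
{
    return RESET_CYCLES;
}
```

As with the Verilog version, only one of the two branches ever reaches the compiler, so both variants can live side by side in the same source file.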

To make life easier, we can also disable the creation of the VIC-II module during simulation:

`ifndef SIM
vic_ii vic_inst
    (
        .clk_in(clk),
        .clk_counter(clk_div_counter),
        .clk_2_mhz(clk_2_mhz),
        .blank_signal(blank_signal),
        .frame_sync(frame_sync),
        .data_in({14'd4,vic_combined_d}),
        .c64_reset(c64_reset),
        .addr(vic_addr),
        .out_rgb(out_rgb)
        );
        
    burst_block burst_tst(
            .clk(axi_clk_in),
            .reset(proc_rst),
            .write(write_pin),
            .next_frame(frame_sync),
            .write_data({pixel_16_bit_delay,pixel_16_bit}), 
            .count_in_buf(count_in_buf),
            //output src ready
            //-----------------------------------------
            .ip2bus_mst_addr(ip2bus_mst_addr),
            .ip2bus_mst_length(ip2bus_mst_length),
            .ip2bus_mstwr_d(ip2bus_mstwr_d),
            .ip2bus_inputs(ip2bus_inputs),
            .ip2bus_otputs(ip2bus_otputs),
            .read(read)
              );
`endif

Notice this time we make use of ifndef, meaning if not defined. So these two blocks will only be added if we are not in simulation mode.

You will also see that I am removing the burst_block, required for AXI memory access, when doing simulation.

Moving keyboard functionality into its own module

In its current state, our C64 module has keyboard functionality and CIA functionality entangled with each other.

It makes sense to split this functionality, which will also mark the start of our new CIA module.

Let us start with a keyboard module:

`timescale 1ns / 1ps

module keyboard(
  input [31:0] key_matrix_0,
  input [31:0] key_matrix_1,
  input [7:0] keyboard_control_byte,
  output [7:0] keyboard_result_byte
    );
    
    wire [7:0] keyboard_row_0;
    wire [7:0] keyboard_row_1;
    wire [7:0] keyboard_row_2;
    wire [7:0] keyboard_row_3;
    wire [7:0] keyboard_row_4;
    wire [7:0] keyboard_row_5;
    wire [7:0] keyboard_row_6;
    wire [7:0] keyboard_row_7;     

    assign keyboard_row_0 = key_matrix_0[7:0];
    assign keyboard_row_1 = key_matrix_0[15:8];
    assign keyboard_row_2 = key_matrix_0[23:16];
    assign keyboard_row_3 = key_matrix_0[31:24];
    assign keyboard_row_4 = key_matrix_1[7:0];
    assign keyboard_row_5 = key_matrix_1[15:8];
    assign keyboard_row_6 = key_matrix_1[23:16];
    assign keyboard_row_7 = key_matrix_1[31:24];
    
    assign keyboard_result_byte = ~((~keyboard_control_byte[0] ? keyboard_row_0 : 0) |           
                                   (~keyboard_control_byte[1] ? keyboard_row_1 : 0) |
                                   (~keyboard_control_byte[2] ? keyboard_row_2 : 0) |
                                   (~keyboard_control_byte[3] ? keyboard_row_3 : 0) |
                                   (~keyboard_control_byte[4] ? keyboard_row_4 : 0) |
                                   (~keyboard_control_byte[5] ? keyboard_row_5 : 0) |
                                   (~keyboard_control_byte[6] ? keyboard_row_6 : 0) |
                                   (~keyboard_control_byte[7] ? keyboard_row_7 : 0));
    
endmodule

The code looks almost identical to the old code, except that we have moved it into its own module.

Let us quickly look at the ports in more detail:

key_matrix_0 and key_matrix_1: These correspond to slv_reg_0 and slv_reg_1, in which our ARM processor sets the bit for a particular key within the keyboard matrix.
keyboard_control_byte: This byte will be provided by the Port_A output of CIA#1
keyboard_result_byte: This will be fed back to CIA#1 via Port_B
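
The row selection can be checked in isolation with a small self-checking testbench. This is only a sketch: the module name keyboard_tb and the chosen test values are my own assumptions and not part of the original project. It needs to be compiled together with the keyboard module listed above.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench (not part of the original project).
// Compile together with the keyboard module shown above.
module keyboard_tb;
  reg  [31:0] key_matrix_0 = 32'h0000_0001; // bit 0 of row 0 asserted
  reg  [31:0] key_matrix_1 = 32'h0000_0000; // rows 4 to 7: nothing pressed
  reg  [7:0]  control      = 8'hFF;         // no row selected
  wire [7:0]  result;

  keyboard dut(
    .key_matrix_0(key_matrix_0),
    .key_matrix_1(key_matrix_1),
    .keyboard_control_byte(control),
    .keyboard_result_byte(result)
  );

  initial begin
    #10;
    // No row is pulled low, so every result bit stays high.
    if (result !== 8'hFF) $display("FAIL: idle result = %h", result);

    control = 8'hFE; // pull row 0 low
    #10;
    // Row 0 holds 01, which is inverted on the way out.
    if (result !== 8'hFE) $display("FAIL: row 0 result = %h", result);

    control = 8'hFD; // pull row 1 low, which has no key asserted
    #10;
    if (result !== 8'hFF) $display("FAIL: row 1 result = %h", result);

    $display("keyboard_tb finished");
    $finish;
  end
endmodule
```

The module is purely combinational, so the testbench needs no clock; the delays are only there to let the outputs settle before each check.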

Creating the CIA module

Let us now move on to the creation of the CIA module. Its code looks as follows:

`timescale 1ns / 1ps

module cia(
  output [7:0] port_a_out,
  input [7:0] port_b_in,
  input [3:0] addr,
  input we,
  input clk,
  input [7:0] data_in,
  output reg [7:0] data_out
    );
    
  reg [7:0] slave_reg_0 = 0;
  reg [7:0] slave_reg_1 = 0;
  reg [7:0] slave_reg_2 = 0;
  reg [7:0] slave_reg_3 = 0;
  
  assign port_a_out = slave_reg_0;
  
  always @(posedge clk)
  if(!we)
  case (addr)
    0: data_out <= slave_reg_0;
    1: data_out <= port_b_in;
    2: data_out <= slave_reg_2;
    3: data_out <= slave_reg_3;
  endcase
  
  always @(posedge clk)
  if (we)
  case (addr)
    0: slave_reg_0 <= data_in;
    1: slave_reg_1 <= data_in;
    2: slave_reg_2 <= data_in;
    3: slave_reg_3 <= data_in;
  endcase
endmodule

Let us look into the ports:

port_a_out: This is the Port A output and will be fed to the keyboard module
port_b_in: This port will receive the keyboard result byte from the keyboard module
addr: Joined to the address output of the CPU. Note we are only using the lower four bits, since the CIA only has 16 registers.
we: Write enable. Set by the CPU if it wants to write something to one of the CIA registers
clk: 1 MHz clock
data_in: Data from the CPU to the CIA.
data_out: Data from the CIA to the CPU.

You will also see that we define a set of slave registers. The CIA has sixteen, but for now we have only defined four of them.

We also defined some functionality for reading and writing to these registers.

We have also linked up Port_A and Port_B.
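
To see the register behaviour in isolation, a minimal testbench along the following lines could be used. It is a sketch under assumptions: the name cia_tb, the test value A5 and the choice to change inputs on the falling clock edge are mine, not the original project's. It needs to be compiled together with the cia module listed above.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench (not part of the original project).
// Compile together with the cia module shown above.
module cia_tb;
  reg        clk       = 0;
  reg        we        = 0;
  reg  [3:0] addr      = 4'h0;
  reg  [7:0] data_in   = 8'h00;
  reg  [7:0] port_b_in = 8'hA5; // stand-in for the keyboard result byte
  wire [7:0] port_a_out;
  wire [7:0] data_out;

  cia dut(
    .port_a_out(port_a_out),
    .port_b_in(port_b_in),
    .addr(addr),
    .we(we),
    .clk(clk),
    .data_in(data_in),
    .data_out(data_out)
  );

  // 1 MHz clock: 500 ns high, 500 ns low.
  always #500 clk = ~clk;

  initial begin
    // Inputs change on the falling edge so that they are stable
    // when the CIA samples them on the rising edge.
    @(negedge clk);
    we = 1; addr = 4'h0; data_in = 8'hFE;   // write FE to register 0

    @(negedge clk);
    // Register 0 drives Port A directly.
    if (port_a_out !== 8'hFE) $display("FAIL: port_a_out = %h", port_a_out);
    we = 0; addr = 4'h1;                    // read register 1 (Port B)

    @(negedge clk);
    // A read of register 1 returns the Port B input, not slave_reg_1.
    if (data_out !== 8'hA5) $display("FAIL: data_out = %h", data_out);

    $display("cia_tb finished");
    $finish;
  end
endmodule
```

Note that a read takes one clock cycle to appear on data_out, because data_out is a registered output.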

Wiring everything up

Let us wire our two new modules up in the C64 Module:

...
    keyboard key_inst(
      .key_matrix_0(slave_0_reg),
      .key_matrix_1(slave_1_reg),
      .keyboard_control_byte(keyboard_control),
      .keyboard_result_byte(keyboard_result)
        );
        
    cia cia_1(
          .port_a_out(keyboard_control),
          .port_b_in(keyboard_result),
          .addr(addr[3:0]),
          .we(we & (addr[15:8] == 8'hdc)),
          .clk(clk_1_mhz),
          .data_in(ram_in),
          .data_out(cia_1_data_out)
            );
...
    always @*
        casex (addr_delayed)
          16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
          16'b111x_xxxx_xxxx_xxxx : combined_d_out = kernel_out;
          16'hd012: combined_d_out = line_counter;
          16'hdcxx: combined_d_out = cia_1_data_out;
          default: combined_d_out = ram_out;
        endcase

The port assignment is pretty straightforward. I just want to mention that we only assert we on cia_1 if the high byte of the address is DC.

Similarly, for combined_d_out, the data going to the CPU, we select the data output of cia_1 if the high byte of the address is DC.

Finally, we need to create a top module for testing our C64 module in simulation mode:

module top(

    );
    
reg clk = 0;
reg reset = 1;
    
block_test my_c64(
      .clk(clk),      
      .proc_rst(reset),
      .slave_0_reg(1),
      .slave_1_reg(0)
        );

always #5 clk = ~clk;

initial begin
#100 reset = 0;
end    
endmodule

Here we do a very brief reset.

Also, in the slave registers we only assert one key.

Testing our new modules

Time to test our new modules.

As mentioned earlier, we can test our modules by booting our normal ROMs, but this would be too time-consuming in a simulation.

We speed things up by writing a simple 6502 assembly test program. We will put this code in a copy of the kernel ROM.

We start off by looking at the end of our kernel.hex file:

05
E5
4C
0A
E5
4C
00
E5
52
52
42
59
43
FE
E2
FC
48
FF

The reset vector occupies the third- and fourth-to-last bytes, E2 and FC (stored low byte first), so it currently points to FCE2. The idea is to put our test program towards the end of the kernel ROM, so we will need to adjust the reset vector accordingly.

Here is the test program:

LDA #$FE
STA $DC00
LDA $DC01

So, we put the value FE on Port A of CIA#1, which pulls only row 0 of the keyboard matrix low. Since we are activating the first key in that row in our top module, and the keyboard module inverts the selected row byte, we expect to read back the value FE from Port B.

We modify the last part of the kernel ROM as follows:

A9
FE
8D
00
DC
AD
01
DC
00
00
00
00
F0
FF
48
FF

Here our program starts at address FFF0 and I have adjusted the reset vector accordingly.

In our C64 module we can again make use of compiler directives to make switching between simulation and normal mode easier:

...
    `define SIM
    `ifdef SIM
      `define kernel_file "/home/johan/roms/kernel_debug.hex"
    `else
      `define kernel_file "/home/johan/Documents/roms/kernel.hex"
    `endif
...
    rom #(
         .ROM_FILE(`kernel_file)
        ) kernel(
          .clk(clk_1_mhz),
          .addr(addr[12:0]),
          .rom_out(kernel_out)
            );
...

So, for simulation, we use the file kernel_debug.hex, which contains our test program. This avoids copying and pasting ROMs around every time we want to switch to simulation mode.

A simulation run

When we run a simulation the result is the following:

The addr field shows the addresses the CPU issues. Below the addr field is the data the CPU receives for the relevant addresses.

From the addresses we can see that our program eventually gets executed at FFF0.

As marked by the arrows, at one stage we issue a read for address DC01 and get back the value FE. This is what we expect.

In Summary

In this post we started to develop a full-blown CIA block, so that in a later post we will be able to load a .TAP file and execute it within our C64 module.

In this post we implemented enough functionality for the CIA for simulating keyboard access.

In the next post we will implement timers within our CIA.

Till next time!