C64 on an FPGA: Booting the C64 System

Foreword

In the previous post we managed to successfully run Klaus Dormann's Test Suite on the Zybo Board.

In this post we will extend our implementation to boot the C64 system.

At the end of this post we will run the resulting implementation in a simulator, and in the next post we will get it running on the ZYBO board.

Adding the C64 ROMs

In order to boot the C64 system we need to add the two ROMs, namely BASIC and KERNEL, to our design.

The process is more or less the same as when we added the Test Suite binary in previous posts.

Since we are working with ROMs, however, we will only be adding logic to read data from the Block RAM, and no write logic.

Since we are dealing with two ROMs now, and three in later posts once we add the Chargen ROM, it makes sense to extract the common logic into a module of its own. The signature of this module looks as follows:

module rom #(
  parameter ADDR_WIDTH = 13,
  parameter ROM_FILE = ""
)
(
  input clk,
  input wire [ADDR_WIDTH-1:0] addr,
  output reg [7:0] rom_out
);

You will notice our signature contains an extra section preceded by a hash, which is a style we haven't used before.

The hash section is a parameter section, declaring parameters with default values. The nice thing about these parameters is that you can override them with suitable values when you create a module instance.

In the parameter section of our rom module we have the parameter ADDR_WIDTH with a default value of 13. This means that if you create a rom instance without overriding ADDR_WIDTH, the resulting instance accepts addresses of at most 13 bits.

13 bits gives us 8KB of addressable space. This default is sufficient for both the BASIC ROM and the KERNEL.

In later posts, however, where we will be adding the Chargen ROM, which is only 4KB, we will need to override ADDR_WIDTH with a value of 12.
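As a rough sketch of what such an override could look like, here is a small wrapper around the rom module. The wrapper name chargen_rom_example, the file name chargen.hex and the signal chargen_out are placeholders made up for illustration; the actual Chargen hookup only comes in a later post and may look different.

```verilog
// Hypothetical wrapper showing a parameter override on the rom module.
// "chargen.hex" and chargen_out are placeholder names, not from the design.
module chargen_rom_example(
  input wire clk,
  input wire [15:0] addr,
  output wire [7:0] chargen_out
);

  rom #(
    .ADDR_WIDTH(12),            // 2**12 = 4096 bytes = 4KB
    .ROM_FILE("chargen.hex")    // placeholder path to the hex file
  ) chargen (
    .clk(clk),
    .addr(addr[11:0]),          // only the lower 12 address bits are needed
    .rom_out(chargen_out)
  );

endmodule
```

Any parameter that is not listed in the hash section of the instance simply keeps its default value.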

Let us now look at the meat of our rom module:

reg [7:0] rom[2**ADDR_WIDTH-1:0];

 always @ (posedge clk)
    begin
      rom_out <= rom[addr];
    end 

    
initial begin
      $readmemh(ROM_FILE, rom) ;
    end

We begin by defining an array that will hold the contents of the applicable ROM. In defining the size of the array we make use of the ADDR_WIDTH parameter defined previously.

We populate the contents of this array with an initial block, much as we did in a previous post.

We define an always block that pushes the contents for the given address to an output register on the positive edge of the clock.

With our rom module defined, we can now create some instances of it in our main module:

rom #(
 .ROM_FILE("/home/johan/Documents/roms/kernel.hex")
) kernel(
  .clk(clk),
  .addr(addr[12:0]),
  .rom_out(kernel_out)
    );

rom #(
 .ROM_FILE("/home/johan/Documents/roms/basic.hex")
) basic(
  .clk(clk),
  .addr(addr[12:0]),
  .rom_out(basic_out)
    );

For both instances we pass as a parameter the location of a hex-formatted file containing the contents of the applicable ROM.

For the address we pass through the least significant 13 bits of the address bus.

We are still missing some arbitration logic that decides, depending on the given address, whether we return the contents of the BASIC ROM, the KERNEL or our 64KB RAM.

Adding Arbitration Logic

The logic for performing arbitration is as follows:

...
reg [7:0] combined_d_out;
...
always @*
  casex (addr)
    16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
    16'b111x_xxxx_xxxx_xxxx : combined_d_out = kernel_out;
    default: combined_d_out = ram_out;
  endcase
...

The function of this logic can be represented in a diagram as follows:

All our storage elements, BASIC, KERNEL and our RAM, get fed to a multiplexer, and we use the address as the selector to decide which one gets sent to the DI input of the 6502 CPU.

Let us now look at our piece of Verilog code in more detail. It will look familiar to programmers as a case/switch statement.

This case statement, however, starts with casex instead of case. This is a special kind of Verilog statement in which the case items may contain don't care values.

You specify a don't care value with an x, meaning that this bit position can be any value.

Strictly speaking, if you look at our case statement, we could have connected only the three most significant bits to it, since the lower thirteen bits don't serve any purpose here. But, as you will see later, we will need the full address for a scenario where we check for a specific address.
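For comparison, here is a minimal sketch of that three-bit alternative. The module name decode_top3_example is made up for illustration; the signal names are the ones used in the snippet above. With only addr[15:13] as selector, a plain case is enough and no don't care bits are needed:

```verilog
// Illustrative only: same ROM/RAM selection using just the top three
// address bits. The module name is hypothetical.
module decode_top3_example(
  input wire [15:0] addr,
  input wire [7:0] basic_out,
  input wire [7:0] kernel_out,
  input wire [7:0] ram_out,
  output reg [7:0] combined_d_out
);

  always @*
    case (addr[15:13])
      3'b101:  combined_d_out = basic_out;   // $A000-$BFFF
      3'b111:  combined_d_out = kernel_out;  // $E000-$FFFF
      default: combined_d_out = ram_out;     // everything else
    endcase

endmodule
```

The drawback is that this form cannot single out one specific 16-bit address, which is why we keep the full address as selector.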

One thing we haven't considered in our design is the way Block RAMs work. A Block RAM only presents its output one clock cycle after the address is asserted. In our design, however, we are multiplexing one clock cycle too early, meaning that by the time the data is ready, the next address might already have switched that ROM out of view.

The solution is to delay the address used by the multiplexer by one clock cycle as well. This results in the following changes:

...
reg [15:0] addr_delayed;
...
 always @ (posedge clk)
    addr_delayed <= addr;
...
always @*
  casex (addr_delayed)
    16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
    16'b111x_xxxx_xxxx_xxxx : combined_d_out = kernel_out;
    default: combined_d_out = ram_out;
  endcase
...
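To see this one-cycle read latency for yourself, the following stand-alone simulation sketch may help. The module name, the four-entry memory and its contents are made up for illustration and are not part of c64_core; only the clocked read mirrors what our rom module does.

```verilog
// Stand-alone sketch: a tiny memory with a clocked (synchronous) read.
// Memory size and contents are arbitrary example values.
module sync_read_latency_tb;
  reg clk = 0;
  reg [1:0] addr = 0;
  reg [7:0] mem [0:3];
  reg [7:0] data_out;

  // Same clock as in our simulation: toggle every 10 time units
  always #10 clk = ~clk;

  // Clocked read, as in the rom module: data_out only changes on a
  // rising edge, using the address present at that edge
  always @(posedge clk)
    data_out <= mem[addr];

  initial begin
    mem[0] = 8'hA0; mem[1] = 8'hA1; mem[2] = 8'hA2; mem[3] = 8'hA3;
    $monitor("t=%0t addr=%0d data_out=%h", $time, addr, data_out);
    // Change the address on falling edges so it is stable at each rising edge
    @(negedge clk) addr = 1;
    @(negedge clk) addr = 2;
    @(negedge clk) addr = 3;
    @(negedge clk) $finish;
  end
endmodule
```

In the printed trace, data_out keeps showing the data for the old address until the next rising clock edge. That lag is exactly what addr_delayed compensates for in the multiplexer.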

Preparing for Simulation

All the logic for our 6502 system is currently wrapped in a module called c64_core that is contained in Design Sources, used for performing synthesis.

We also have a similar module within our Simulation Sources, containing extra code for assisting a simulation.

With this setup you would develop in the copy contained in Simulation Sources, making it easy to run a simulation now and again to check that you are on the right track.

Once finished with your development, though, you would need to copy your changes to c64_core in Design Sources.

This copying and pasting can be quite error prone. A better approach is to let both the design and simulation sources share the same c64_core module. Then, within the simulation sources, you create a top module surrounding the c64_core module. This top module then contains all the simulation-specific code.

Let us start with this top module. First, let us look again at the signature of c64_core module:

module c64_core(
  input wire clk_in,
  input wire reset,
  input wire debug_clk,
  input wire debug_mode,
  output wire [15:0] addr_out
    );

The resulting top module is quite simple:

reg clk = 0;
reg reset = 1;
wire [15:0] addr_out;

c64_core my_core(
    .clk_in(clk),
    .reset(reset),
    .debug_clk(1'b0),
    .debug_mode(1'b0),
    .addr_out(addr_out)
        );

always #10
clk <= ~clk;        

initial begin
  #100 reset <= 0;
  #100000000 $finish;
end

First Simulation Attempt

With our first simulation our Wave output looks as follows:

If you go through the address requests of addr_out, you will see that the last couple of address requests range between ff5e and ff63. If you look at a disassembly listing of the kernel, you will see these addresses correspond to the following:

FF5E   AD 12 D0   LDA $D012
FF61   D0 FB      BNE $FF5E

This loop rings a clear bell from my previous blogs, where I wrote emulators for other platforms. When writing a C64 implementation from scratch, you will most probably get stuck at this loop the first time.

This is good news, since it means we are on the right track.

What we need to do next is imitate values for register D012 (which is a VIC-II register), so we can get past the above loop and see if screen memory gets populated with the C64 startup message.

Getting past the $FF5E loop

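To get past the $FF5E loop we can simply map address $D012 to a binary counter that counts up on each clock cycle. The loop exits as soon as a read of $D012 returns zero, and an 8-bit counter wraps around to zero every 256 clock cycles.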

The implementation of the binary counter is as simple as follows:

...
reg [7:0] line_counter;
...
always @(posedge clk)
  if (rst)
    line_counter <= 0;
  else
    line_counter <= line_counter + 1;
...

And finally we change our arbitration block:

always @*
  casex (addr_delayed)
    16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
    16'b111x_xxxx_xxxx_xxxx : combined_d_out = kernel_out;
    16'hd012: combined_d_out = line_counter;
    default: combined_d_out = ram_out;
  endcase
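If you want to convince yourself that the selection behaves as intended, a small self-checking simulation along the following lines could be used. This is only a sketch: the module name, the check task and the constant values driven onto basic_out, kernel_out, ram_out and line_counter are made up for illustration, and the arbitration block is copied in so the example stands on its own.

```verilog
// Stand-alone check of the arbitration logic with made-up data values.
module arbitration_tb;
  reg [15:0] addr_delayed;
  reg [7:0] basic_out    = 8'hBA;  // arbitrary marker values so we can
  reg [7:0] kernel_out   = 8'hEE;  // tell the four sources apart
  reg [7:0] ram_out      = 8'h55;
  reg [7:0] line_counter = 8'h12;
  reg [7:0] combined_d_out;

  always @*
    casex (addr_delayed)
      16'b101x_xxxx_xxxx_xxxx : combined_d_out = basic_out;
      16'b111x_xxxx_xxxx_xxxx : combined_d_out = kernel_out;
      16'hd012: combined_d_out = line_counter;
      default: combined_d_out = ram_out;
    endcase

  // Apply an address, let the combinational logic settle, compare
  task check;
    input [15:0] a;
    input [7:0] expected;
    begin
      addr_delayed = a;
      #1;
      if (combined_d_out !== expected)
        $display("FAIL: addr=%h got=%h expected=%h", a, combined_d_out, expected);
      else
        $display("ok:   addr=%h -> %h", a, combined_d_out);
    end
  endtask

  initial begin
    #1;                       // make sure the always block is waiting first
    check(16'hA000, 8'hBA);   // start of BASIC ROM
    check(16'hBFFF, 8'hBA);   // end of BASIC ROM
    check(16'hD012, 8'h12);   // raster register stand-in
    check(16'hFF5E, 8'hEE);   // inside KERNEL ROM
    check(16'h0400, 8'h55);   // screen memory, plain RAM
    check(16'hC000, 8'h55);   // RAM between the two ROMs
    $finish;
  end
endmodule
```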

When we run the simulation again with the above changes, our wave output looks as follows:

If you now compare these addresses to a disassembly listing again, you will get to the following section:

; wait for return for keyboard
E5CA   20 16 E7   JSR $E716
E5CD   A5 C6      LDA $C6
E5CF   85 CC      STA $CC
E5D1   8D 92 02   STA $0292
E5D4   F0 F7      BEQ $E5CD
E5D6   78         SEI
E5D7   A5 CF      LDA $CF
E5D9   F0 0C      BEQ $E5E7
E5DB   A5 CE      LDA $CE
E5DD   AE 87 02   LDX $0287

I got this disassembly listing from ffd2.com

From this we can gather that our simulation got to the point where it is waiting for keyboard input, which happens just after C64 boot-up.

Ok, I am pretty convinced the C64 boot process went fine, but I am itching to check one more thing: whether screen memory at memory location 1024 is populated with the welcome message.

Checking Screen memory for welcome message

As our FPGA implementation is at the moment, we don't really have a way to inspect the contents of our 64KB RAM. We therefore need to modify our debug mode functionality to return the information we want.

Firstly, let us start to modify the header of our c64_core module for returning the relevant information:

module c64_core(
  input wire clk_in,
  input wire reset,
  input wire debug_clk,
  input wire debug_mode,
  input wire [15:0] addr_in,
  output wire [7:0] data_out
    );

We have changed our addr_out to addr_in and added an output wire returning the data for the requested address.

Next thing we should do, is to disconnect our cpu from any clock once our core turns into debug mode. We do this by introducing an extra clocking wire for our CPU:

...
wire cpu_clk;
...
assign cpu_clk = debug_mode ? 1'b0 : clk_in;
...
cpu mycpu ( cpu_clk, rst, addr, combined_d_out, ram_in, WE, 1'b0, 1'b0, 1'b1 );
...

Next up, it is important to give our RAM logic the ability to get an address from two sources, depending on whether debug mode is selected:

...
wire [15:0] addr_ram_in;
...
assign addr_ram_in = debug_mode ? addr_in : addr;
...
assign data_out = ram_out;
...
 always @ (posedge clk)
    begin
     if (WE) 
     begin
      ram[addr] <= ram_in;
      ram_out <= ram_in;
     end
     else 
     begin
      ram_out <= ram[addr_ram_in];
     end 
    end 
...

We are done with our changes within c64_core. Next we make some modifications to the top module for our simulation.

First some declaration changes:

...
reg [15:0] index;
reg debug_mode = 0;
wire [7:0] d_out;
...
c64_core my_core(
    .clk_in(clk),
    .reset(reset),
    .debug_clk(clk),
    .debug_mode(debug_mode),
    .addr_in(index),
    .data_out(d_out)
        );

The index register I have defined will be updated by a loop, which I will discuss shortly.

We end off by modifying our initial block for our simulation:

initial begin
  #100 reset <= 0;
  #100000000 
  #20 debug_mode <= 1;
  for (index=1024; index<1500; index = index +1 ) 
  begin
    #20 $display("%d",d_out);    
  end  
  $finish;
end

We have added a for-loop. For-loops are handy in Verilog simulations. A for-loop can also be synthesised, provided its bounds are known at compile time, but the synthesis tool unrolls it into parallel hardware rather than executing it step by step, so the end result is not necessarily what you want. A loop containing delays, like ours, is simulation-only. So, a good rule of thumb while learning: only use for-loops in simulations.

In our for-loop we keep incrementing the register index from 1024 until it reaches 1500. On each iteration we wait 20 simulation time units (specified by #20), which is one full period of our simulated clock. This has the effect of executing the loop body once every clock cycle.

Within our for-loop we have also introduced a new system task called $display. It works much like printf in C. In our case we output the value of d_out on each iteration. This loop will in effect output roughly the first half of screen memory to the console.

When running the simulation with our changes, the output of the Tcl console will look as follows:

The output starts with a train of 32's, which is a space if you look at the screencode table. This looks promising. Scrolling down we do eventually see some signs of a message:

Converting these screencodes to the actual characters yields the following:

42 = *
42 = *
42 = *
42 = *
32 = SPACE
3  = C
15 = O
13 = M
13 = M
15 = O
4  = D
15 = O
18 = R
5  = E

This is exactly the first part of the C64 welcome message.
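Doing this conversion by hand gets tedious, so here is a small sketch of how the simulation could do it for us. The function screencode_to_ascii and the module screencode_demo are my own illustrative names, and the mapping only covers screen codes 0 to 63 of the default character set (0 is @, 1 to 26 are A to Z, and 32 to 63 coincide with ASCII); everything else is printed as a dot.

```verilog
// Illustrative helper: convert C64 screen codes to printable ASCII.
// Only codes 0-63 are mapped; all other codes are shown as '.'.
module screencode_demo;

  function [7:0] screencode_to_ascii;
    input [7:0] code;
    begin
      if (code == 8'd0)
        screencode_to_ascii = "@";
      else if (code <= 8'd26)
        screencode_to_ascii = code + 8'd64;   // 1..26 -> 'A'..'Z'
      else if (code >= 8'd32 && code <= 8'd63)
        screencode_to_ascii = code;           // same value as ASCII
      else
        screencode_to_ascii = ".";
    end
  endfunction

  // The screen codes listed in the table above
  reg [7:0] sample [0:13];
  integer i;

  initial begin
    sample[0]  = 8'd42; sample[1]  = 8'd42; sample[2]  = 8'd42;
    sample[3]  = 8'd42; sample[4]  = 8'd32; sample[5]  = 8'd3;
    sample[6]  = 8'd15; sample[7]  = 8'd13; sample[8]  = 8'd13;
    sample[9]  = 8'd15; sample[10] = 8'd4;  sample[11] = 8'd15;
    sample[12] = 8'd18; sample[13] = 8'd5;

    // $write is like $display but does not append a newline
    for (i = 0; i < 14; i = i + 1)
      $write("%c", screencode_to_ascii(sample[i]));
    $write("\n");
    $finish;
  end
endmodule
```

Run on its own, this should print **** COMMODORE. In our top module, the same function could be used inside the for-loop by swapping the $display call for $write("%c", screencode_to_ascii(d_out)).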

We can conclude that our simulation went fine up to the point of showing the welcome message.

In Summary

In this post we managed to successfully run a simulation for booting the C64 system and populating screen memory with the welcome message.

In the next post we will attempt to run the C64 boot process on the ZYBO board itself.

Till next time!

10 comments:

Johan Steenkamp, 10 March 2019 at 09:38
Glad you enjoyed the article!
Eric, 23 October 2019 at 17:22
Hi, I was hoping to get a little help on this step. I'm at the point in this blog post called "First Simulation Attempt". Unfortunately I can't include a picture of my waveform in this comment. But during execution of the Kernal Rom, The reset vector takes my execution from FFFC, FFFD to FCE2, FCE3, FCE4, FCE5, FCE6, FCE7, FCE8, 01FF, 01FE, FCE9, FD02, FD03, FD04, FD05, FD06, FD14, FD07, FD08, FD09, 8008, FD0A, FD0B, FD0C, FD0D and just stays here endlessly FD0D. I don't get stuck at the FF5E like you do, which makes me believe that the line counter won't work for me.

I thought maybe that there were different Kernal roms you were using, but I tried 3 different versions all with the same result. I got the binary roms from www.zimmers.net/anonftp/pub/cbm/firmware/computers/c64/ and I tried the 3 most common ones: kernal.901227-03.bin, kernal.901227-02.bin, and kernal.901227-01.bin I took the bin file, opened in HxD hexeditor and copied and pasted the hex values replacing the 'spaces' with /n just like we did with the Klaus Test in previous posts.

Thanks for the help!
Eric, 24 October 2019 at 00:04
I found the disassembly listing for the kernel at http://www.ffd2.com/fridge/docs/c64-diss.html
the site is a bit old and I couldn't find direct links from ffd2.com but I found it indirectly through google

It looks like I am stuck in the execution at FD0D of:
FD02 A2 05 LDX #$05
FD04 BD 0F FD LDA $FD0F,X
FD07 DD 03 80 CMP $8003,X
FD0A D0 03 BNE $FD0F
FD0C CA DEX
FD0D D0 F5 BNE $FD04
FD0F 60 RTS

I also double checked that indeed my rom had the correct data at these address, it definitely does and matches the kernel listing. So I must have another issue. I'll keep plugging away at it =)
Eric, 24 October 2019 at 16:17
D0 is available on the data_in (DI) of the processor. This is where I'm stumped. The correct data is present, and my cpu was verified with the klaus suite. Here's a link to my waveform https://imgur.com/9TGaMIY

I can't find anything in the log file either
Eric, 24 October 2019 at 18:27
Ok, I made some more progress. It turns out that the xilinx simulation/cpu/everything had a heart attack because the ram was initialized with XX (don't cares). I added an initial begin block in c64_core.v to run through a for loop and populate the ram with 0x00s as follows:

integer i;
initial begin
for(i = 0; i < 65536; i = i + 1)
begin
ram[i] <= 0;
// $readmemh("C:/WS_Vivado/6502_funct_test.dat", ram) ;
end
end

This allowed me to eventually end up in the

; get character from keyboard buffer section

E5B4 AC 77 02 LDY $0277
E5B7 A2 00 LDX #$00
E5B9 BD 78 02 LDA $0278,X
E5BC 9D 77 02 STA $0277,X
E5BF E8 INX
E5C0 E4 C6 CPX $C6
E5C2 D0 F5 BNE $E5B9
E5C4 C6 C6 DEC $C6
E5C6 98 TYA
E5C7 58 CLI
E5C8 18 CLC
E5C9 60 RTS

without having to implement the "Getting past the $FF5E loop" you have mentioned in this post.

I keep running into:

00cc
e5d1, e5d2, e5d3, 0292, e5d4, e5d5, e5d6, e5cd, e5ce, 00c6 e5cf, e5d0
00cc
e5d1, e5d2, e5d3, 0292, e5d4, e5d5, e5d6, e5cd, e5ce

that being said, I'm not 100% certain that everything is working well as I did a search in my waveform for address FF53 and the search failed to find this address. I guess for now, I'll just try to implement the rest and see if I can finally get to the welcome message as you have. Wish me luck as I think I'll need it. These posts have definitely re-kindled my xilinx/fpga love/hate relationship =) hahaha.

Thank You for still checking up on these comments, I really appreciate it!
Eric, 24 October 2019 at 23:21
Got it Working!!! Thanks for the help!!
Documenting all your work is such a huge undertaking.
I got my first output:
3-C,15-O,13-M,13-M,15-O,4-D,15-O,18-R,5-E,32-space,54-6,52-4,32-space,2-B,1-A,19-S,9-I,3-C,32-space,22-V,50-2
Eric, 24 October 2019 at 23:25
I just saw your previous comment after I posted mine, thanks for the kind words =)

Wednesday, 13 December 2017
