C64 on an FPGA: November 2019

Foreword

In the previous post we gave some thought to the idea of adding SID sound to our C64 module.

This turned out not to be such a daunting task, since we found an existing SID implementation on GitHub, written in SystemVerilog by Thomas Kindler.

We tested this SID implementation by capturing a couple of seconds' worth of SID register writes from a JavaScript emulator I wrote a couple of years ago. I then fed these register writes to Kindler's SID core and listened to the output.

The result was very pleasing. Initially I spotted a bit of clipping, but subsequently fixed this by reducing the volume of each voice.

In this post we will add Kindler's core to our C64 module and see whether we can play SID sound in realtime.

The importance of clock locking

I would like to start off this post by talking about an issue of a different kind I had to solve with the C64 module.

As I kept adding more functionality to the C64 module, I ended up once again with a case where this core didn't want to boot up anymore.

After checking the Verilog code of the C64 module time and again, I couldn't find anything wrong.

At one point I started wondering: In the clock wizard generating the 16MHz signal, I am not using the lock signal at all. Can this be a source of issues?

The following post on the Xilinx Community forums sheds a bit of light on this issue: https://www.xilinx.com/support/answers/52806.html. As quoted from this post:

Until the LOCKED signal is asserted High, the DCM/DLL output clocks are not valid and can exhibit glitches, spikes, or other spurious movement.

So, it is a very good idea to honour the LOCKED signal.

Of course we need to use this signal when generating the reset signal for the 6502:

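...
// Pulse c64_reset high while the counter is between 8000001 and 8000019
assign c64_reset = (reset_counter > 8000000) & (reset_counter < 8000020) ? 1 : 0;
...
    // Only start counting once the clock wizard reports a stable clock
    always @(posedge clk_in)
      if ((reset_counter < 9000000) & locked)
        reset_counter <= reset_counter + 1;
...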

You might remember that we generate this signal within the VIC-II module, which is clocked at 8MHz. So, in this code we wait about a second (8,000,000 cycles) after the 8MHz clock generator has locked, after which we assert the reset signal for 19 clock cycles.
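To see the pieces of the fragment above in one place, here is a minimal self-contained sketch. The module name reset_gen and the 24-bit counter width are my own assumptions for illustration; the signal names and constants are taken from the snippet above.

```verilog
// Hypothetical wrapper around the reset logic shown above.
// Assumes clk_in is the 8MHz clock and locked comes from the clock wizard.
module reset_gen(
    input  wire clk_in,
    input  wire locked,
    output wire c64_reset
);
    // 24 bits can hold values up to 16,777,215, enough for 9,000,000
    reg [23:0] reset_counter = 0;

    // The counter only advances while the clock is reported as locked,
    // and saturates at 9,000,000 so the reset pulse happens only once.
    always @(posedge clk_in)
        if ((reset_counter < 9000000) && locked)
            reset_counter <= reset_counter + 1;

    // High for counter values 8000001 to 8000019, i.e. 19 clock cycles
    assign c64_reset = (reset_counter > 8000000) && (reset_counter < 8000020);
endmodule
```

Note that, like the original fragment, this sketch does not restart the counter if locked drops again later; it only holds off the count until the first lock.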

Mapping SID into memory

For now, we will only worry about performing writes to the SID, and not any reading. This will result in the following port assignments of the SID module:

MOS6581 sid(
...
    .addr(addr[4:0]),
    .data(ram_in),       
    .n_cs(!(we & io_enabled & (addr[15:8] == 8'hd4))),
    .rw(0),
    .clk(clk_1_mhz), .clk_en(1), .n_reset(!c64_reset)
);

Since the SID only has 29 registers, we only connect the lower 5 bits of the address to the SID module.

We permanently wire this module to write mode (i.e. rw is tied to zero).

Also, we only select the SID when there is a write within the IO region (i.e. addresses D000 to DFFF) and the upper eight bits of the address are equal to 0xD4.
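The chip-select expression can also be pulled out into a small block of its own, which makes the three conditions easier to read. This is only a sketch: the module name sid_select and the intermediate wire sid_page are hypothetical, while we, io_enabled and addr are the signals used in the port assignment above.

```verilog
// Hypothetical helper: active-low chip select for writes to the SID.
module sid_select(
    input  wire        we,          // high when the CPU performs a write
    input  wire        io_enabled,  // high when the IO region is visible
    input  wire [15:0] addr,        // CPU address bus
    output wire        n_cs         // low selects the SID
);
    // True for any address in the page D400 to D4FF
    wire sid_page = (addr[15:8] == 8'hd4);

    // Select the SID only for a write, with IO enabled, inside page D4
    assign n_cs = !(we & io_enabled & sid_page);
endmodule
```

Because only addr[4:0] reaches the SID, every address in page D4 selects one of 32 register slots, of which the SID uses 29.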

Outputting samples to the Sound System

Some time ago we played with sound on the Zybo board. The Zybo board can create high quality sound with the help of the Analog Devices SSM2603 Audio Codec.

This codec receives samples in a serial fashion with the I2S protocol.

We implemented a block that generates a monotone and converts the samples to the I2S protocol so the audio codec can receive them.

In this section we extend this block so that it can receive audio samples from the SID block.

Let us start by having a look at the port definitions of the I2S block:

module i2s(
  input clk,
  input clk_1_mhz,
  input [15:0] audio_in,
  output clk_1_5_mhz,
  output channel_enable,
  output out_data,
  output mute_en
    );

The output ports are basically the I2S ports that we will connect to the audio codec.

The audio_in port carries the audio samples from the SID module.

Once again we have a cross clock domain issue that we need to solve. The SID generates samples at a rate of 1MHz, whereas the audio codec needs to receive the samples at a rate of 48KHz.

So, apart from solving the clock domain issue, we also need to discard a number of samples from the SID module to get to the 48KHz sample rate.
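As a rough check on how many samples get discarded, the ratio of the two nominal rates quoted above works out as follows (assuming exactly 1MHz and 48KHz):

```latex
\frac{f_{\mathrm{SID}}}{f_{\mathrm{codec}}}
  = \frac{1\,000\,000\ \mathrm{Hz}}{48\,000\ \mathrm{Hz}}
  \approx 20.83
```

So, on average, only about one in every 21 SID samples reaches the codec. Since the ratio is not a whole number, the number of 1MHz cycles between two kept samples will vary slightly from one sample to the next.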

Let us start by having a look at the critical point at which we need to inject a sample from the 1MHz clock domain:

    always @(posedge clk)
    if (channel_enable_counter == 15 & neg_edge)
    begin
      shift_reg <= {data_val, data_val};      
    end
    else if (neg_edge)
      shift_reg <= {shift_reg[30:0] , 1'b0};

This is the logic for the shift register that shifts the data out to the audio codec. Basically, we want data_val to hold a fresh sample from the 1MHz domain by the time the shift register gets reloaded.

We wrote the above snippet quite some time ago, so let us familiarise ourselves again with what is going on in it.

Within the world of our audio codec, there are three clock frequencies:

The master clock: 12.288MHz
The serial data clock: 1.536MHz
The sample clock: 48KHz

To avoid multiple cross clock domain issues, we try to clock all our always blocks (except the 1MHz bits) at 12.288MHz.

We want our shift register to clock at 1.536MHz, so we introduce a signal neg_edge that gets asserted when we are at the negative edge of the 1.536MHz signal:

    reg [1:0] clk_div_counter = 0;
    
    assign neg_edge = (clk_div_counter == 3) & (bclk_int == 1) ? 1 : 0;

    always @(posedge clk)
      clk_div_counter <= clk_div_counter + 1; 

    always @(posedge clk)
    if (clk_div_counter == 3)
      bclk_int <= ~bclk_int;

So, bclk_int is our 1.536MHz clock, which is generated by toggling it every four clock cycles.
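The frequencies in the list above can be checked against this divider with a little arithmetic. Toggling every four cycles of the 12.288MHz clock gives a full period of eight cycles, and dividing the result by the 48KHz sample rate gives the number of bit clocks per sample period:

```latex
f_{\mathrm{bclk}} = \frac{12.288\ \mathrm{MHz}}{2 \times 4} = 1.536\ \mathrm{MHz},
\qquad
\frac{1.536\ \mathrm{MHz}}{48\ \mathrm{kHz}} = 32
```

In other words, each 48KHz sample period spans 32 periods of bclk_int.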

Although bclk_int is a clock signal, we don't use it to clock any always blocks, so there is no need to worry about any cross clock domain issues here.

Let us similarly bring the 48KHz sample signal into the picture:

    always @(posedge clk)
    if (neg_edge & channel_enable_counter == 15)
      prclk_int <= ~prclk_int;

This looks very similar to the signal that we use to load data into the shift register. The correct instant to load a value from the SID into data_val is a cycle or two after the reload of the shift register has occurred.

The following snippet accomplishes just that for us:

    (* ASYNC_REG = "TRUE" *) reg sig_48_khz_0, sig_48_khz_1, sig_48_khz_2;

    always @(posedge clk_1_mhz)
    begin
       sig_48_khz_0 <= prclk_int;
       sig_48_khz_1 <= sig_48_khz_0;
       sig_48_khz_2 <= sig_48_khz_1; 
    end

    always @(posedge clk_1_mhz)
    if (!sig_48_khz_1 & sig_48_khz_2)
      data_val <= audio_in;

Here we have a multi-flop synchroniser again to bring the 48KHz signal to the 1MHz domain.

This multi-flop synchroniser has an additional function: delaying the assignment of a new value until the current value has been handed over to the shift register.

So, with the above setup we are effectively removing the overlap between fetching a sample from the 1MHz domain and reading it in the 48KHz domain.

Implementing these changes will give us a fully functional SID implementation within our C64 module.

In Summary

In this post we finished off our SID implementation within our C64 FPGA implementation.

Special thanks to Thomas Kindler for sharing the source code of his SID implementation on GitHub.

I had originally intended this post to be the last one.

However, an extra idea popped up. The C64 FPGA implementation in its current state only fills a small area on the screen, so in the next post we will see if we can scale this image up so it can fill most of the screen.

I want to end off this post with an interesting thought. When doing FPGA programming, every now and again one is faced with cross clock domain issues. That makes one realise that although we are working with digital electronics, where we always have a discrete state of a zero or a one, the circuits in a chip still exhibit behaviour similar to that of an analogue circuit.

Till next time!

C64 on an FPGA

Friday, 29 November 2019

Implementing Sound: Part 2
