C64 on an FPGA: Scaling up the display: Part 1

Foreword

In the previous two posts we implemented SID sound within our C64 FPGA module.

If we look back to the Introduction post of this blog series, the purpose of the series was to create a complete C64 system on an FPGA.

I think we got pretty close to this goal. We have implemented the following:

Integrated Arlet Otten's 6502 core into our design.
Implemented C64 memory banking.
Booting the whole C64 system.
Loading a game from a .TAP image
Implementing a VIC-II module capable of displaying sprites, together with a couple of its graphics modes, such as multicolor bitmap and multicolor text mode.
SID sound.

Granted, an important item missing from the list is a C64 disk drive like the 1541. I am, however, not entirely sure I want to go down that road. We have already used the majority of the Block RAM resources of the ZYNQ FPGA, so I doubt there would be sufficient resources left for a 1541 module (the core of a 1541 disk drive is also a 6502 CPU, which needs its own RAM and ROM to operate).

There are, however, some other items currently missing from our C64 module that I thought would be nice to implement, and I will be writing some blog posts on how to implement them.

The first item is to scale up the frames produced by the VIC-II module. Currently these frames have a resolution of 404x284. With most monitors available on the market today, these frames will just fill a tiny portion of the screen.

So, we will dedicate a post or two to scaling up the VIC-II generated frames, so that they fill most of the screen.

Another issue that is worth looking into is the fact that currently our C64 module cannot operate on its own on a Zybo board. The Zybo board always needs to be connected to a PC to upload a Bitstream image and for kicking off a standalone program in the Xilinx SDK for providing USB keyboard functionality.

I will also write some blog posts on a solution for the above-mentioned issue, which would involve booting Linux from an SD card fitted to the Zybo board and loading a bitstream image from the same SD card into the FPGA.

This is more or less what I have planned for future posts in this Blog series.

Let us start and see if we can upscale the frames produced by the VIC-II!

David Kronstein's Video Scaler Core

As the old saying goes: Don't re-invent the wheel. In this series I tried to apply this bit of advice numerous times:

Using Arlet's 6502 core.
Making use of an asynchronous FIFO buffer as suggested on a Xilinx's community forum.
Using Thomas Kindler's SID implementation.

So, is there a Verilog module available that can scale up an image? Indeed there is, on the OpenCores website: https://opencores.org/projects/video_stream_scaler

The SVN browser on the website allows us to get hold of the source code. Two files are of importance:

Video+Stream+Scaler+Specifications.pdf
scalar.v

The PDF explains very nicely how the scaler works.

The file scalar.v contains the main module, as well as all sub modules, within one file.

Let us start by having a look at the ports of the Video Scaler module:

//---------------------------Module IO-----------------------------------------
//Clock and reset
input wire    clk,
input wire    rst,
 
//User interface
//Input
input wire [DATA_WIDTH*CHANNELS-1:0]dIn,
input wire       dInValid,
output wire       nextDin,
input wire       start,
 
//Output
output reg [DATA_WIDTH*CHANNELS-1:0]  dOut,
output reg         dOutValid,
input wire         nextDout,
 
//Control
input wire [DISCARD_CNT_WIDTH-1:0] inputDiscardCnt, 
input wire [INPUT_X_RES_WIDTH-1:0] inputXRes,
input wire [INPUT_Y_RES_WIDTH-1:0] inputYRes,
input wire [OUTPUT_X_RES_WIDTH-1:0] outputXRes,
input wire [OUTPUT_Y_RES_WIDTH-1:0] outputYRes,
input wire [SCALE_BITS-1:0]   xScale,
input wire [SCALE_BITS-1:0]   yScale,
 
input wire [OUTPUT_X_RES_WIDTH-1+SCALE_FRAC_BITS:0] leftOffset,
input wire [SCALE_FRAC_BITS-1:0] topFracOffset, 
input wire    nearestNeighbor
);

The clk and rst ports are obvious, so let us skip to the input port section.

The dIn port is the pixel data input. For our emulator we will have three channels (i.e. RGB) and each channel will be 8 bits wide. This may sound confusing at first, since the Zybo board works with 16-bit pixels in the format RGB565. However, this scaler assumes the same data width for all channels, so for this reason we will just stick with 8 bits per channel.

The next two signals are handshake signals between the pixel data originator and the video scaler. When the pixel data originator has made data available for a new pixel, it will assert the dInValid signal. In return, the video scaler will assert the nextDin signal when it has accepted the data.
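To make the handshake a bit more concrete, here is a small Python model of it. This is only a sketch under my own assumptions: I assume a pixel is transferred on a clock cycle where both dInValid and nextDin are high, and that the producer always has a pixel ready. The function name simulate_handshake and the ready_pattern argument are made up for this illustration and are not part of the scaler.

```python
def simulate_handshake(pixels, ready_pattern):
    """Model a valid/next handshake, one loop iteration per clock cycle.

    pixels        -- pixel values the producer wants to send, in order
    ready_pattern -- per-cycle booleans standing in for nextDin
    Returns a list of (cycle, pixel) pairs, one per accepted pixel.
    """
    transfers = []
    index = 0
    for cycle, next_din in enumerate(ready_pattern):
        if index >= len(pixels):
            break  # nothing left to send
        d_in_valid = True  # assumption: producer always has data waiting
        if d_in_valid and next_din:
            # Both sides agree, so the pixel is consumed on this cycle
            transfers.append((cycle, pixels[index]))
            index += 1
        # Otherwise the producer holds the same pixel for the next cycle
    return transfers


if __name__ == "__main__":
    # The scaler stalls on cycle 1, so pixel p1 is only taken on cycle 2
    print(simulate_handshake(["p0", "p1", "p2"], [True, False, True, True]))
```

The stalled cycle is the interesting one: the producer has to hold its pixel until the scaler is ready again.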

Something to keep in mind with our VIC-II module is that it outputs pixels at a constant rate and cannot be told to pause for a couple of clock cycles. The video scaler, in turn, can time and again end up in a situation where it is not able to accept data on a particular clock pulse. We will, however, cross this bridge when we get there.

The last port of the input section, start, is used to signal to the video scaler that we are about to transmit data for a new frame.

Next we get to the output port section. With this set of ports our video scaler behaves like a pixel data producer, the pixel data being that of the actual upscaled image. As on the input side, dOutValid and nextDout are handshake signals.

Let us move on to the final section, the control section. There are a couple of ports in this section that I am not going to worry about, and I am just going to connect them to the value zero. These ports are the following:

inputDiscardCnt,
leftOffset,
topFracOffset and
nearestNeighbor

The other ports are for specifying the input resolution, output resolution and the ratio by which the input frames should be resized by.

We will calculate the values of these ports in the next section.

Calculating the values of the control ports

Let us start by determining the values for inputXRes and inputYRes.

The spec for the video scaler states that for each resolution port we should supply a value that is the actual value, minus one.

We know that the frames produced by our VIC-II have a resolution of 404x284, so we should specify a value of 403 for inputXRes and a value of 283 for inputYRes.

Next, we should decide on the output resolution. For this one would be tempted to use the physical resolution of the monitor you are going to use for the display of the output frames.

However, these days chances are good that the monitor you will be using is a wide screen, whereas the output of a VIC-II is closer to a square aspect ratio. So, one would end up with a stretched image if using the physical screen resolution as the output resolution for the video scaler.

So, we need to proportionally scale up the input image till it just fills the height of the screen.

I am going to use my screen as an example, which has a resolution of 1366x768. I am a bit hesitant to use the full height of the screen, since I want to leave a bit of 'buffering' space to account for possible lag by the video scaler before producing output for the next frame.

So, I will be using a height of 758 for our output frame. At this point we need to calculate the ratio by which we will be resizing our input image.

(Output Resolution Y) / (Input Resolution Y)

= 758 / 284

= 2.6690

This is a very important factor, and we will be using it later again for the ports xScale and yScale.

To get the horizontal output resolution, we should multiply the horizontal input resolution by this factor:

(Input Resolution X) * 2.6690

= 404 * 2.6690

= 1078.276 ≈ 1078

Thus, our output image should have a resolution of 1078x758, which gives a value of 1077 for outputXRes and a value of 757 for outputYRes, using the minus one constraint specified in the spec of the video scaler.
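The resolution arithmetic above can also be written as a small Python helper, which is handy if your monitor has a different height. This is only a sketch: the function name scaler_resolution_ports and the target_height parameter are my own, and the only rule taken from the spec is the "actual value minus one" convention.

```python
def scaler_resolution_ports(in_x, in_y, target_height):
    """Compute the resolution port values for a proportional upscale.

    The output height is fixed to target_height and the width follows
    from the same scale factor, rounded to the nearest whole pixel.
    """
    factor = target_height / in_y      # e.g. 758 / 284
    out_x = round(in_x * factor)       # keep the aspect ratio
    out_y = target_height
    return {
        "factor": factor,
        "inputXRes": in_x - 1,         # each port takes the real value minus one
        "inputYRes": in_y - 1,
        "outputXRes": out_x - 1,
        "outputYRes": out_y - 1,
    }


if __name__ == "__main__":
    ports = scaler_resolution_ports(404, 284, 758)
    for name, value in ports.items():
        print(name, value)
```

For the 404x284 frame and a target height of 758 this yields 403, 283, 1077 and 757 for the four resolution ports.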

Next, we should calculate the values for xScale and yScale, which in our case will be the same.

According to the spec, xScale is calculated as (Input size / Output size). This is different from the way we calculated our scale factor, which is (Output size / Input size). So, to get a valid value for xScale and yScale, we should use the reciprocal of our factor, which yields around 0.374672.

We now need to represent this fraction in binary. It sounds like a daunting task, but fear not!

The scale value uses 4 bits for the integer value and 14 bits for the fraction, totalling 18 bits. This can be visually represented as follows:

Fraction bit weights, starting directly after the binary point: 0.5, 0.25, 0.125, 0.0625, ...

For clarity, I have shown only the first couple of fraction bits.

Through some trial and error, I found that 0.25 and 0.125 give a good enough estimate for us: 0.375. Let us convert this fraction into hexadecimal:

0000.01100000000000

= 000001100000000000

= 00 0001 1000 0000 0000

=  0   1    8   0    0

So, the value to use for both xScale and yScale is 18'h1800.
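If you would rather not do the conversion by hand, the same fixed-point encoding can be sketched in Python. The 4 integer bits and 14 fraction bits come from the text above; the helper names to_fixed and from_fixed are my own. The sketch also shows what the nearest 18-bit value to 284/758 would be, purely for comparison. I have not tested the scaler with that value.

```python
INT_BITS = 4     # integer bits of the scale value
FRAC_BITS = 14   # fraction bits of the scale value


def to_fixed(value, int_bits=INT_BITS, frac_bits=FRAC_BITS):
    """Encode a non-negative fraction as an unsigned fixed-point integer."""
    raw = round(value * (1 << frac_bits))
    if not 0 <= raw < (1 << (int_bits + frac_bits)):
        raise ValueError("value does not fit in the fixed-point format")
    return raw


def from_fixed(raw, frac_bits=FRAC_BITS):
    """Decode a fixed-point integer back into a float."""
    return raw / (1 << frac_bits)


if __name__ == "__main__":
    hand_picked = to_fixed(0.375)      # the 0.25 + 0.125 estimate
    nearest = to_fixed(284 / 758)      # closest encodable value to the true ratio
    print(f"18'h{hand_picked:X} decodes to {from_fixed(hand_picked)}")
    print(f"18'h{nearest:X} decodes to {from_fixed(nearest):.6f}")
```

The hand-picked estimate of 0.375 encodes to the same 18'h1800 we derived above.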

Creating a Test Bench

Let us create a test bench to test the Video Scaler.

The first thing we should do is get hold of source image data. An easy way to get this is to run our C64 FPGA design on the Zybo board, and then do an mrd (i.e. memory read command) on the XSCT console on the memory area where the frame is stored, writing the result to a file.

However, be mindful of the fact that information is stored in memory as 32-bit words in little endian format. With the pixels being in the format RGB565, this means that every pair of pixels will have their order reversed.

So, it might be necessary to write a program for reversing the order of the pixels for this exercise.

Next, let us write some code to read data from this file and supply it to the video scaler on demand:
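Such a program could look something like the following Python sketch. It assumes that the dump is a raw binary file in which the two 16-bit pixels of every 32-bit word have ended up swapped; if your dump looks different (for instance a text dump of hex words), it will need adjusting. The function name swap_pixel_pairs is my own.

```python
def swap_pixel_pairs(data: bytes) -> bytes:
    """Swap the two 16-bit pixels inside every 32-bit word.

    The byte order within each pixel is left untouched; only the
    position of the two pixels in the word is exchanged.
    """
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    out = bytearray(len(data))
    for i in range(0, len(data), 4):
        out[i:i + 2] = data[i + 2:i + 4]       # second pixel moves to the front
        out[i + 2:i + 4] = data[i:i + 2]       # first pixel moves to the back
    return bytes(out)


if __name__ == "__main__":
    sample = bytes([0x11, 0x22, 0x33, 0x44])
    print(swap_pixel_pairs(sample).hex())
```

To convert a whole dump, read the file in binary mode, pass its contents through swap_pixel_pairs and write the result to a new file.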

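reg [15:0] pixel_in_data;   // current RGB565 pixel read from the file

initial begin
  // Open the raw frame dump for binary reading
  f = $fopen("<file.data>","rb");
  #300
  // Pulse start for one clock cycle to mark the beginning of a frame
  @(negedge clk)
  start = 1;
  @(negedge clk)
  start = 0;
  #300;
  
  // data_count is assumed to be incremented elsewhere in the test bench
  while (data_count < 500000)
  begin
    @(negedge clk)
    // Fetch the next pixel only when the scaler asks for it
    if(nextDIn)
    begin
      data_valid_in = 1;
      $fread(pixel_in_data,f);    
    end 
  end
end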

We start by triggering the start flag to inform the video scaler that we are at the beginning of the frame and about to send data.

We keep reading pixel data for as long as the video scaler asserts its nextDin port. With the first pixel that we read, we also assert data_valid_in.

Data in the source file is in the format RGB565, and the video scaler expects it as 24-bit color, so we convert it like this:

streamScaler #(
.DATA_WIDTH(8),
.CHANNELS(3)
)
  myscaler
(
...
.dIn({pixel_in_data[15:11],3'b0,pixel_in_data[10:5],2'b0,pixel_in_data[4:0],3'b0}),
...
);
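To double check the bit shuffling, the same expansion can be modelled in Python. This is just a sketch mirroring the concatenation above, with each channel padded with zeros on the right; the function name rgb565_to_din is my own. Note that with zero padding a full-intensity 5-bit channel becomes 248 rather than 255.

```python
def rgb565_to_din(pixel):
    """Expand a 16-bit RGB565 pixel into the 24-bit word fed to dIn."""
    r5 = (pixel >> 11) & 0x1F   # bits 15:11
    g6 = (pixel >> 5) & 0x3F    # bits 10:5
    b5 = pixel & 0x1F           # bits 4:0
    r8 = r5 << 3                # append 3'b0
    g8 = g6 << 2                # append 2'b0
    b8 = b5 << 3                # append 3'b0
    return (r8 << 16) | (g8 << 8) | b8


if __name__ == "__main__":
    for pixel in (0xF800, 0x07E0, 0x001F, 0xFFFF):
        print(f"{pixel:04X} -> {rgb565_to_din(pixel):06X}")
```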

Next, we should capture all the pixels generated by the video scaler and save them as an image file, so we can view it in an image viewer.

I would like the resulting image file to be in RGB565 format again, so I will do another conversion:

...
wire [15:0] data_out_concat;
...
assign data_out_concat = {data_out[23:19], data_out[15:10],data_out[7:3]};
...
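As a sanity check of the packing, here is the reverse conversion sketched in Python. It assumes the usual RGB565 layout of 5 red bits, 6 green bits and 5 blue bits, so green keeps its top six bits (15:10 of the 24-bit word) while red and blue keep their top five. The function name din_to_rgb565 is my own.

```python
def din_to_rgb565(word24):
    """Pack a 24-bit RGB word (8 bits per channel) into RGB565."""
    r8 = (word24 >> 16) & 0xFF
    g8 = (word24 >> 8) & 0xFF
    b8 = word24 & 0xFF
    r5 = r8 >> 3    # keep bits 23:19
    g6 = g8 >> 2    # keep bits 15:10
    b5 = b8 >> 3    # keep bits 7:3
    return (r5 << 11) | (g6 << 5) | b5


if __name__ == "__main__":
    for word in (0xF80000, 0x00FC00, 0x0000F8, 0xFFFFFF):
        print(f"{word:06X} -> {din_to_rgb565(word):04X}")
```

The three fields add up to 5 + 6 + 5 = 16 bits, which is a quick way to check that the concatenation fills the 16-bit wire exactly.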

We write the pixel data as follows:

initial begin
fw = $fopen("<outputfile.data>","wb");
while (data_count < 5000000)
begin
  @(posedge clk)
  if (data_valid_out)
  begin
    $fwrite(fw,"%c",data_out_concat[7:0]);
    $fwrite(fw,"%c",data_out_concat[15:8]);
  end
end
end

So, while data_valid_out is asserted we write the pixel data to the resulting image file. Also, we write each pixel in little endian order, which is the format required by the image viewer.
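Before opening the file in an image viewer, it can be useful to check that it has the size we expect and that the pixels decode sensibly. The following Python sketch does this for a buffer of little endian RGB565 data. The function name decode_rgb565_le is my own, and the size check assumes exactly two bytes per pixel with no header, so a full 1078x758 frame should come to 1,634,248 bytes.

```python
import struct


def decode_rgb565_le(raw, width, height):
    """Decode little endian RGB565 bytes into rows of 16-bit pixel values."""
    expected = width * height * 2          # two bytes per pixel, no header
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    values = struct.unpack(f"<{width * height}H", raw)
    # Split the flat tuple of pixels into one tuple per scan line
    return [values[y * width:(y + 1) * width] for y in range(height)]


if __name__ == "__main__":
    # Two pixels: pure red, then pure blue, each stored low byte first
    raw = bytes([0x00, 0xF8, 0x1F, 0x00])
    for row in decode_rgb565_le(raw, 2, 1):
        print([f"{p:04X}" for p in row])
```

A file that is shorter than expected is a hint that the simulation was stopped before the scaler finished the frame.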

The Result

Let us have a look at the resulting image file.

We are going to use GIMP to view the image file. GIMP can read raw image data, provided we use the file extension .data.

On opening this file, we specify the format RGB565 and the resolution 1078x758:

The resulting rescaled image is as follows:

Admittedly, one cannot clearly see the difference between the scaled-up output of the video scaler and the original frame.

We will, however, better see the effect of the upscaling when displayed on a monitor, which we will cover in the next post.

In Summary

In this post we started to investigate upscaling for use in our C64 emulator, so we can fill the whole screen instead of a tiny portion.

We identified David Kronstein's Video scaler core as the candidate for use in this task.

We created a test bench for testing David Kronstein's Video Scaler and successfully managed to upscale a test image.

In the next post we will be integrating this core into our C64 emulator.

Till next time!

Sunday, 15 December 2019