Friday 20 December 2019

Scaling up the display: Part 2

Foreword

In the previous post we started to investigate the possibility of scaling up the images produced by the VIC-II module, so that they fill the whole display.

For this purpose we used David Kronstein's Video Scaler core. In the previous post we ran this core in a test bench to see what the image it produces looks like.

I was quite satisfied with the results produced by Kronstein's core, so in this post we will integrate it into our C64 FPGA design.

Overview

The following diagram gives an overview of what we want to accomplish in this post:


The flow of the diagram starts off more or less the same as our current design which displays the VIC-II frames on a VGA screen.

We retrieve pixel data from SDRAM via AXI and buffer it. As this data arrives as 32-bit words, each containing two pixels, we need to split each word into individual pixels.

We also buffer these individual pixels in a FIFO buffer. This FIFO has the additional function of moving data from the AXI clock domain (100MHz) to the VGA clock domain (84MHz).

In our previous design we directly output pixels from this FIFO to the VGA display.

In this post, however, we introduce two new blocks: the Video Scaler and a FIFO that absorbs any potential lag from the Video Scaler.

The trickiest scenario in this design is when we reset all the components in preparation for the next frame. Once the Video Scaler has been reset for the next frame, it will immediately start requesting data from the asynchronous FIFO as soon as data becomes available. This can potentially lead to some empty conditions in our asynchronous FIFO.

In practice, however, I found that our Asynchronous FIFO doesn't handle these intermittent empty states very well.

It is far better on frame reset to give the asynchronous FIFO time to fill up a bit before starting to read from it. In this way we avoid the asynchronous buffer running empty. We will cover this in a bit more detail in a coming section.

Supplying input data to the Video Scaler

Let us connect the necessary ports so that we can supply input data to the Video Scaler.

First, let us cater for the scenario where we need to reset all the blocks upon a new frame.

The trigger_restart_state register indicates when we are about to start with a new frame. However, this register is clocked within the AXI clock domain, while we need it within the VGA domain. So, let us create a flip-flop synchroniser chain (five stages in this case) to take care of this scenario:

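// Synchroniser chain for the restart flag (AXI domain -> VGA domain)
(* ASYNC_REG = "TRUE" *) reg state_1, state_2, state_3, state_4, state_5;

always @(posedge clk)
begin
  // Sample the AXI-domain flag, then shift it through the chain
  state_1 <= trigger_restart_state == RESTART_STATE_RESTART;
  state_2 <= state_1;
  state_3 <= state_2;
  state_4 <= state_3;
  state_5 <= state_4;
end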

streamScaler #(
//---------------------------Parameters----------------------------------------
.DATA_WIDTH(8),  //Width of input/output data
.CHANNELS(3)  //Number of channels of DATA_WIDTH, for color images
//---------------------Non-user-definable parameters----------------------------
)
  myscaler
(
...
.start(state_5),
...
);
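
For reference, the same idea can be packaged as a small reusable module. This is only a minimal sketch and not code from the project: the module name flag_sync, its ports and the STAGES parameter are made up for illustration. It assumes the flag is a single-bit level that stays asserted for longer than one period of the destination clock.

```verilog
// Hypothetical helper (not part of the original design): passes a
// single-bit flag from another clock domain through a chain of flip-flops.
module flag_sync #(
  parameter STAGES = 2            // number of flip-flops in the chain (>= 2)
) (
  input  wire clk,                // destination clock (here: the VGA clock)
  input  wire flag_in,            // flag generated in the source clock domain
  output wire flag_out            // flag_in, delayed by STAGES clock edges
);
  // ASYNC_REG is a Xilinx attribute that marks these registers as
  // receiving asynchronous data, so the tools keep them close together.
  (* ASYNC_REG = "TRUE" *) reg [STAGES-1:0] chain = {STAGES{1'b0}};

  always @(posedge clk)
    chain <= {chain[STAGES-2:0], flag_in};   // shift the flag one stage further

  assign flag_out = chain[STAGES-1];
endmodule
```

With STAGES set to 5 and flag_in driven by the comparison on trigger_restart_state, flag_out would play the role of state_5. A common refinement, which I have not applied here, is to register that comparison in the AXI domain first, so that only a clean flip-flop output crosses into the VGA domain.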


We receive pixel data from the asynchronous FIFO relaying 16-bit pixel values from the AXI domain to the VGA domain. As mentioned in the previous post, the Video scaler expects 24-bit samples, so let us do a conversion:

streamScaler #(
//---------------------------Parameters----------------------------------------
.DATA_WIDTH(8),  //Width of input/output data
.CHANNELS(3)  //Number of channels of DATA_WIDTH, for color images
//---------------------Non-user-definable parameters----------------------------
)
  myscaler
(
...
.dIn({out_pixel_buffer[15:11],3'b0,out_pixel_buffer[10:5],2'b0,out_pixel_buffer[4:0],3'b0}),
...
);
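
To make the conversion a bit more tangible, here is a small simulation-only sketch that expands a 16-bit pixel with the same concatenation as the dIn connection above, and then truncates it back to 16 bits. The module name and the signal names pixel565, pixel888 and back565 are hypothetical and exist only for this example.

```verilog
// Sketch only: module and signal names are hypothetical.
module rgb565_roundtrip_tb;
  reg  [15:0] pixel565;

  // Expand RGB565 to 24 bits by padding the low bits of each channel
  // with zeros, using the same concatenation as the dIn connection.
  wire [23:0] pixel888 = {pixel565[15:11], 3'b0,
                          pixel565[10:5],  2'b0,
                          pixel565[4:0],   3'b0};

  // Truncate back to RGB565 by keeping the top 5/6/5 bits of each channel.
  wire [15:0] back565  = {pixel888[23:19], pixel888[15:10], pixel888[7:3]};

  initial begin
    pixel565 = 16'hFFFF;              // white in RGB565
    #1;
    $display("%h -> %h -> %h", pixel565, pixel888, back565);
    pixel565 = 16'h07E0;              // pure green in RGB565
    #1;
    $display("%h -> %h -> %h", pixel565, pixel888, back565);
    $finish;
  end
endmodule
```

Because the padding bits are zero, white ends up as f8fcf8 rather than ffffff in the 24-bit domain. For a pixel that passes through unmodified this does not matter, since truncating to the top 5/6/5 bits recovers the original 16-bit value.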


The next port to focus on is the one on the Video Scaler that signals that the input data is valid. For this, let's start off simple and say that the data is valid if the asynchronous buffer is not empty and it is not the start of the frame:

assign data_valid_in = !state_5 && !async_empty;

You might remember that in the previous section I mentioned that it is preferable to give the asynchronous buffer some time to fill up before reading from it. It is indeed the data_valid_in signal that we need to adjust for this:

assign data_valid_in = !state_5 && !async_empty && scalar_init;

always @(posedge clk)
begin
  if (state_5)
    scalar_init <= 0;    
  else if (!async_empty && (count_till_read == 60))
    scalar_init <= 1;

  if (state_5)
    count_till_read <= 0;
  else if ((count_till_read < 60) && !async_empty)
    count_till_read <= count_till_read + 1;
end

So, we hold back asserting data_valid_in until our async buffer has been non-empty for about 60 clock cycles.
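
The pre-fill logic can also be written as a standalone module, which makes it easier to simulate on its own. The following is a sketch under assumptions: the module name prefill_gate, the port names and the WAIT_CYCLES parameter are my own inventions for this example, while the behaviour is meant to mirror the scalar_init and count_till_read logic above.

```verilog
// Hypothetical wrapper around the pre-fill logic (names are illustrative).
module prefill_gate #(
  parameter WAIT_CYCLES = 60      // non-empty cycles to wait before reading
) (
  input  wire clk,                // VGA-domain clock
  input  wire frame_start,        // synchronised restart flag (state_5 in the post)
  input  wire fifo_empty,         // empty flag of the asynchronous FIFO
  output wire data_valid          // drives the scaler's input-valid signal
);
  reg [6:0] count = 0;            // wide enough for the default WAIT_CYCLES
  reg       ready = 0;

  always @(posedge clk) begin
    if (frame_start) begin
      // New frame: start waiting all over again
      count <= 0;
      ready <= 0;
    end else if (!fifo_empty) begin
      if (count < WAIT_CYCLES)
        count <= count + 1;       // count cycles in which the FIFO holds data
      else
        ready <= 1;               // enough data buffered, allow reads
    end
  end

  assign data_valid = !frame_start && !fifo_empty && ready;
endmodule
```

Note that, just like the original logic, the counter accumulates non-empty cycles and does not restart when the FIFO briefly runs empty. So strictly speaking the 60 cycles need not be consecutive.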

Next, we need to connect the read port on the async fifo:

aFifo
  #(.DATA_WIDTH(16))
  my_fifo
     
    (
...
           .ReadEn_in(nextDIn & data_valid_in),
...
     );


You might recall from a number of earlier posts that we enabled reading from this port whenever the VGA raster was within the visible range. This port is now controlled by the Video Scaler (nextDIn). We hold the read back by means of data_valid_in, giving the aFifo a chance to fill up.

Buffering the output of the Video Scaler

As mentioned in the Overview section, we need to buffer the output of the Video scaler.

So, let us start by defining another FIFO instance:

fifo #(
  .DATA_WIDTH(16),
  .ADDRESS_WIDTH(4)
)

   data_buf_vga (
            .clk(clk), 
            .reset(state_5),
...
        );


This buffer has a capacity of 16 elements (2^4, from an ADDRESS_WIDTH of 4) of 16 bits each. Since the Video Scaler outputs 24-bit samples, we keep only the top 5, 6 and 5 bits of the red, green and blue channels when connecting the write_data port of the FIFO:

fifo #(
  .DATA_WIDTH(16),
  .ADDRESS_WIDTH(4)
)

   data_buf_vga (
...
            .write_data({data_out[23:19],data_out[15:10],data_out[7:3]}),
...
        );


Now, the nextDout port of the Video Scaler needs to be in sync with the write port of the FIFO:

...
fifo #(
  .DATA_WIDTH(16),
  .ADDRESS_WIDTH(4)
)

   data_buf_vga (
...
            .write((vert_pos > 10)  & (vert_pos < 760) & data_valid_out & !full_vga_fifo),
            .full(full_vga_fifo),
...
        );
...
streamScaler #(
//---------------------------Parameters----------------------------------------
.DATA_WIDTH(8),  //Width of input/output data
.CHANNELS(3)  //Number of channels of DATA_WIDTH, for color images
//---------------------Non-user-definable parameters----------------------------
)
  myscaler
(
...
.dOutValid(data_valid_out_debug),
.nextDout((vert_pos > 10)  & (vert_pos < 760) & !full_vga_fifo),
...
);
...


The idea is to start streaming data out to the screen at line 20, so we start pre-filling the buffer at line 10.

Streaming the data out to the screen

In the previous section we buffered data from the Video Scaler. In this section we will output the buffered data to the VGA port.

As the first step, let us connect all the read ports:

fifo #(
  .DATA_WIDTH(16),
  .ADDRESS_WIDTH(4)
)

   data_buf_vga (
...
            .read((vert_pos > 20)  & (vert_pos < 760) &
                                            (horiz_pos > 100) & (horiz_pos < 1175)),
            .read_data(fifo_data_read)
...
        );


As seen here, the visible portion of the screen is between lines 20 and 760. On each line the visible portion is between pixels 100 and 1175.

The invisible portions of the screen we want to fill with a black border. To do this we need to block out the read_data when we are within the invisible regions:

 assign out_pixel_buffer_final = (vert_pos > 20)  & (vert_pos < 760) &
                                (horiz_pos > 100) & (horiz_pos < 1175)
                                ? fifo_data_read : 0;
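
Since the same visible-region test drives both the FIFO read enable and the blanking, one option is to compute it once and share it. The sketch below shows that idea as a small module. The module name, the in_visible_region signal and the 11-bit widths of the position counters are assumptions for this example, not taken from the actual design.

```verilog
// Sketch: module name, in_visible_region and the port widths are hypothetical.
module visible_blanking (
  input  wire [10:0] horiz_pos,              // current pixel within the line
  input  wire [10:0] vert_pos,               // current line within the frame
  input  wire [15:0] fifo_data_read,         // pixel read from data_buf_vga
  output wire        in_visible_region,      // can also drive the FIFO .read port
  output wire [15:0] out_pixel_buffer_final  // pixel, or black outside the image
);
  // High only while the raster is inside the scaled image area
  assign in_visible_region = (vert_pos  > 20)  & (vert_pos  < 760) &
                             (horiz_pos > 100) & (horiz_pos < 1175);

  // Force the pixel to black (all zeros) outside the visible region
  assign out_pixel_buffer_final = in_visible_region ? fifo_data_read : 16'd0;
endmodule
```

Keeping the comparison in one place means the read enable and the blanking cannot drift apart if the borders are adjusted later.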


This out_pixel_buffer_final signal we need to split into the individual red, green and blue signals that go to the VGA port:

assign red = out_pixel_buffer_final[15:11];
assign green = out_pixel_buffer_final[10:5];
assign blue = out_pixel_buffer_final[4:0];


Results

I created the following video to demonstrate how the C64 module renders on the VGA screen with the help of video upscaling:


For this demo I loaded the game Blue Max from a tape image. The video starts with the last few seconds of the loader music, followed by the intro tune of Blue Max. I then play the game for a few seconds.

In Summary

In this post we integrated David Kronstein's core within our C64 module.

Up to this point we always fired up the Zybo board attached to a PC. It would actually be nice if we could fire up the Zybo board on its own, with an external power supply.

So in the next post we will start investigating how to boot the Zybo from an SD card. To kick off this investigation, we will see if we can boot Linux on the Zybo board.

Till next time!



