Tuesday 12 June 2018

Displaying Frames from SDRAM to VGA

Foreword

In the previous post we managed to read a long stream of data sequentially from SDRAM via the AXI subsystem.

In the post before that, we experimented with passing data across clock domains (e.g. generating data in a 100MHz domain and passing it to the 85MHz pixel clock domain). For this we used the asynchronous FIFO implementation by Alex Claros F, published on the AsicWorld website.

In this post we will try to bring together the contents of the previous posts, that is, reading a video frame from SDRAM and passing it to the VGA pixel clock domain in order to display it on a VGA LCD monitor.

Also, in this post I am trying out the idea of including a YouTube video outlining the contents covered in the blog post.

I am going to be frank and admit that I am a total rookie at making YouTube videos, so please forgive me for any blunders made in the video 😊

To view a Video of this Blog...

This video gives an overview of this Blog Post, as well as a practical session in Vivado on the changes needed to the existing design.




If you would rather read the written version, together with a discussion of the actual changes to the Verilog code, please continue reading...

Overview

The following diagram shows an overview of what we want to achieve in this post:



We will be receiving the frame data from SDRAM via the AXI subsystem.

We will buffer the data coming from the AXI subsystem in a FIFO. This is the same FIFO we used in the previous post.

I might not have put emphasis on this previously, but a special feature of this FIFO is that it provides a half full indicator in addition to the Full and Empty indicators you will find in most FIFO implementations.

I could probably have built the half full functionality into our existing asynchronous pixel FIFO and thereby eliminated the need for a second FIFO.

However, this would mean more functionality living in the clock domain crossing, which would potentially lead to more frustration debugging setup and hold violations.
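
To make the half full idea a bit more concrete, here is a minimal sketch of how such a flag could be derived in a single-clock FIFO by tracking the fill level. This is not the code of the FIFO used in this design; the module name fifo_level_flags, its ports and the ADDR_WIDTH parameter are all made up for illustration.

```verilog
// Hypothetical sketch: fill-level tracking for a single-clock FIFO.
// Names and depth are illustrative and not taken from the actual design.
module fifo_level_flags
  #(parameter ADDR_WIDTH = 4)            // FIFO depth = 2**ADDR_WIDTH entries
  (
    input  wire clk,
    input  wire reset,
    input  wire write,                   // request to push one entry
    input  wire read,                    // request to pop one entry
    output wire empty,
    output wire half_full,
    output wire full
  );

  // One extra bit so the count can reach 2**ADDR_WIDTH (completely full).
  reg [ADDR_WIDTH:0] count;

  wire do_write = write & !full;         // pushes are ignored when full
  wire do_read  = read  & !empty;        // pops are ignored when empty

  always @(posedge clk)
    if (reset)
      count <= 0;
    else if (do_write & !do_read)
      count <= count + 1'b1;
    else if (do_read & !do_write)
      count <= count - 1'b1;
    // a simultaneous push and pop leaves the count unchanged

  assign empty     = (count == 0);
  assign full      = count[ADDR_WIDTH];                         // count == depth
  assign half_full = count[ADDR_WIDTH] | count[ADDR_WIDTH - 1]; // count >= depth / 2
endmodule
```

Because everything in this sketch runs off one clock, a plain counter is enough. An asynchronous FIFO has its read and write sides in different clock domains, so a fill level cannot be computed this simply there, which is the kind of extra complexity mentioned above.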

You will also see a kind of shift register living between the two FIFOs. The reason for this is that data comes in as 32-bit words, whereas every pixel is only 16 bits in size. So in effect we have two pixels per 32-bit word.

It is the sole responsibility of this shift register to split the 32-bit words into individual pixels.

The shift register shifts 16 bits to the left at a time. The upper 16 bits provide our pixel data, which is buffered in our asynchronous FIFO.
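
As a small illustration of the splitting, the following simulation-only sketch loads one 32-bit word and takes the two pixels from the upper 16 bits, one after the other. The module name pixel_split_demo and the test value are made up. The assumption that the lower half of the word holds the first pixel is based on the way the real shift register is loaded further down in this post.

```verilog
// Hypothetical, simulation-only sketch of splitting a 32-bit word into two pixels.
module pixel_split_demo;
  reg [31:0] axi_read_data;
  reg [31:0] shift_reg_16_bit;

  initial begin
    axi_read_data = 32'h2222_1111;

    // Load: swap the halves so the lower half-word lands in the upper 16 bits.
    shift_reg_16_bit = {axi_read_data[15:0], axi_read_data[31:16]};
    $display("first pixel  = %h", shift_reg_16_bit[31:16]);  // prints 1111

    // Shift left by 16 bits so the second pixel moves into the upper 16 bits.
    shift_reg_16_bit = {shift_reg_16_bit[15:0], 16'b0};
    $display("second pixel = %h", shift_reg_16_bit[31:16]);  // prints 2222

    $finish;
  end
endmodule
```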

To Read/Write or not to Read/Write

Reading from and writing to our two FIFOs at the right moment is crucial to avoid buffer overflows and underflows.

In addition, these reading and writing patterns also affect the operation of the shift register between these FIFOs.

All in all we should attempt to always keep our AXI buffer at least half full and the asynchronous FIFO completely full.

Let us have a quick look at an outline of the Verilog code that controls the reading from and writing to these two FIFO buffers:

assign read_from_axi = !axi_buffer_empty & !buffer_full & ... ? 1 : 0;

aFifo
  #(.DATA_WIDTH(16))
  my_fifo
     //Reading port
    (
...
     .Full_out(buffer_full),
     .WriteEn_in(!buffer_full),
... 
     );
...
burst_read_block my_read_block(
...
          .empty(axi_buffer_empty),
          .read(read_from_axi)
...
            );

...

The read_from_axi wire controls reads from our AXI FIFO. Obviously we only read from this FIFO if it is not empty and our asynchronous FIFO is not full.

As mentioned, the shift register also plays an important role in reading and writing to the FIFO buffers, so let us have a look at the implementation of this shift register:

reg [31:0] shift_reg_16_bit;

parameter STATE_16_SHIFT_IDLE = 2'd0;
parameter STATE_16_SHIFT_STORED = 2'd1;
parameter STATE_16_SHIFT_SHIFTED = 2'd2;

always @(posedge clk_axi)
  if (read_from_axi)
    shift_reg_16_bit <= {axi_read_data[15:0], axi_read_data[31:16]};
  else if (state_shift_reg == STATE_16_SHIFT_STORED & !buffer_full)
    shift_reg_16_bit <= {shift_reg_16_bit[15:0], 16'b0};

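always @(posedge clk_axi)
  case (state_shift_reg)
    // Idle: wait until the AXI FIFO has a word and the pixel FIFO has room
    STATE_16_SHIFT_IDLE:
      state_shift_reg <= !axi_buffer_empty & !buffer_full ? STATE_16_SHIFT_STORED : STATE_16_SHIFT_IDLE;
    // Word loaded: hold while the pixel FIFO is full, otherwise move on to the shift
    STATE_16_SHIFT_STORED:
      state_shift_reg <= buffer_full ? STATE_16_SHIFT_STORED : STATE_16_SHIFT_SHIFTED;
    // Word shifted: hold while full, go idle if no new word is available, else load the next word
    STATE_16_SHIFT_SHIFTED:
      if (buffer_full)
        state_shift_reg <= STATE_16_SHIFT_SHIFTED;
      else if (axi_buffer_empty)
        state_shift_reg <= STATE_16_SHIFT_IDLE;
      else
        state_shift_reg <= STATE_16_SHIFT_STORED;
    // Unused encoding 2'd3: fall back to idle
    default: state_shift_reg <= STATE_16_SHIFT_IDLE;
  endcase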


As you can see, the state machine plays an important role in the operation of the shift register. We start off in the IDLE state. Once our AXI FIFO has data and our asynchronous FIFO is not full, we transition to the STORED state. This instructs the shift register to store a value from the AXI FIFO rather than shifting.

The shift register performs a shift operation when we transition to the SHIFTED state.

While our two buffers are in the right state, our shift register will keep repeating the cycle of loading a value from the AXI buffer followed by a shift operation.

This state machine is also crucial for the operation of other parts of the system, as highlighted below:

assign read_from_axi = !axi_buffer_empty & !buffer_full & (state_shift_reg == STATE_16_SHIFT_IDLE | state_shift_reg == STATE_16_SHIFT_SHIFTED) ? 1 : 0;

aFifo
  #(.DATA_WIDTH(16))
  my_fifo
     //Reading port
    (
...
     .WriteEn_in((state_shift_reg == STATE_16_SHIFT_STORED | state_shift_reg == STATE_16_SHIFT_SHIFTED) & !buffer_full),
...  

     );


Preparing for the next Frame at VSYNC

Once we have finished drawing a frame on the screen, it is time to prepare our system for the next frame.

This preparation consists of a couple of things:

  • Clearing all FIFO buffers
  • Allowing time for all AXI transactions currently in progress to finish
  • Resetting the pointer of the next address to be read from SDRAM to the beginning of the frame
  • Refilling the FIFOs with pixel data

We should allow sufficient time for this preparation. An ideal event to trigger this preparation is the VSYNC signal. This will allow more than enough time to finish all the prep for the next frame.


This whole prep process will also be driven by a state machine:

always @(posedge clk_axi)
  if (trigger_restart_state == RESTART_STATE_WAIT)
    restart_counter <= 400;
  else if (trigger_restart_state == RESTART_STATE_RESTART)
    restart_counter <= restart_counter == 0 ? 0 : restart_counter - 1;
  
always @(posedge clk_axi)
  case (trigger_restart_state)
    RESTART_STATE_WAIT : trigger_restart_state <= vert_sync_delayed_5 ? RESTART_STATE_RESTART : RESTART_STATE_WAIT;
    RESTART_STATE_RESTART : trigger_restart_state <= restart_counter == 0 ? RESTART_STATE_END : RESTART_STATE_RESTART;   
    RESTART_STATE_END : trigger_restart_state <= vert_sync_delayed_5 ? RESTART_STATE_END : RESTART_STATE_WAIT;   
  endcase   


When the VSYNC pulse is encountered, the prep process will last for 400 AXI clock cycles. This is enough time for initialisation for the next frame, as well as for any AXI transactions that are in progress to complete.
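
The signal vert_sync_delayed_5 is not shown in the snippets above. Going by its name, it is VSYNC delayed by five clock cycles. Since VSYNC originates in the pixel clock domain, one plausible way to produce it is a chain of flip-flops clocked by clk_axi, which would also act as a synchroniser. Here is a sketch under those assumptions; the module name vsync_delay, the input vert_sync and the register vsync_pipe are hypothetical:

```verilog
// Hypothetical sketch: bring VSYNC into the clk_axi domain through five flip-flops.
module vsync_delay
  (
    input  wire clk_axi,
    input  wire vert_sync,             // assumed: raw VSYNC from the pixel clock domain
    output wire vert_sync_delayed_5
  );

  reg [4:0] vsync_pipe = 5'b0;         // five flip-flops in series

  // Each clk_axi edge moves the sampled VSYNC one stage further along the chain.
  always @(posedge clk_axi)
    vsync_pipe <= {vsync_pipe[3:0], vert_sync};

  assign vert_sync_delayed_5 = vsync_pipe[4];
endmodule
```

The state machine above only looks at the level of this signal, so the exact number of stages should not be critical, as long as at least two flip-flops sit between the two clock domains.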

Let us have a quick look at what is affected by the prep for the next frame:

aFifo
  #(.DATA_WIDTH(16))
  my_fifo
     //Reading port
    (  
...
     .Clear_in(trigger_restart_state == RESTART_STATE_RESTART),
...
     );


burst_read_block my_read_block(
...
          .restart(trigger_restart_state == RESTART_STATE_RESTART),
...
            );


The End Results

A simple test for our design is just to start up the Zybo board without any picture frame loaded in SDRAM.

When you power up SDRAM without initialising its contents, it will contain a random sequence of bytes, which will result in a noisy pattern on screen.

The big test here is that the noise pattern should be static and not alternating. An alternating noise pattern would indicate some buffer underflow conditions happening.

Here is a frame I have captured from the video:


There is a random pattern indeed. However, it is alternating!

So, we need to spend some time troubleshooting.

We will leave this investigation for the next post.

In Summary

In this post we have attempted to display a frame stored in SDRAM to VGA.

Our ultimate test was to display a static noise pattern on screen. In the end we managed to get a noise pattern displayed on screen, but in an alternating fashion.

In the next post we are going to investigate what causes the noise pattern to alternate and attempt to fix it.

Till next time!