Tuesday, 10 July 2018

Fixing the Non static frame

Foreword

In the previous post we attempted to view random data in SDRAM as a static frame on the VGA screen.

We, however, ended with a random alternating pattern displayed instead of a random static pattern.

In this post we will attempt to fix this anomaly.

To view a Video of this Blog...

This video explains with the help of a Xilinx community post that the cause of the non static frame displayed was likely caused by the asynchronous FIFO implementation used. I also show how I apply the suggestions from the Community post to my existing post in order to fix the problem...


If you rather prefer the written version together with a discussion on the actual changes to the Verilog code, please continue reading...

Some help from a old community post

The anomaly encountered in the previous post really baffled me, and I didn't know where to actually start looking for the cause of the problem. So, I consulted the Internet...

In my searching I came across the following  post on a Xilinx Community forum:


Interesting thing here is that the community member  that posted the query used exactly the same implementation I used from the Asic-World website:


The member was experiencing some serious timing violations when trying  to synthesise the design.

The key to the solution was provided by the community member with the nickname Avrumw. He pointed out that in the comments of the mentioned design it was suggested that the design follows some recommendations from a Xilinx article.  Avrumw, however had some serious doubts whether Xilinx would make some of these suggestions at all because of the following (I am quoting from Awrumw's answer):
  - It uses a latch
  - It uses the asynchronous preset/clear inputs of flip-flops for part of its functionality
The suggestions Awrum gave to avoid these practices was the following:

  - infer the RAM
  - use Gray counts for bringing addresses between domains
  - use standard "two back to back flip-flop" synchronizers (with the ASYNC_REG property set) to move the Gray coded read pointer into the write domain (for generating full) and the Gray coded write pointer into the read domain (for generating empty)
Admittedly, the design on Asic World did indeed made use of Gray Counters.

Let us know proceed and see if we can apply these suggestions to our design

Applying the suggestions to our design

From the suggestions, the first thing I am going to do, is make use of back-back flip synchronizers.

We will need a set of two of these back-back synchronizers. One for passing pNextWordToWrite to the read side and another one for passing pNextWordToRead to the write side.

These synchronizers will be defined as follows:

   (* ASYNC_REG = "TRUE" *)  reg [3:0] synchro_write_side_0, synchro_write_side_1; 
   (* ASYNC_REG = "TRUE" *) reg [3:0] synchro_read_side_0, synchro_read_side_1;

The ASYNC_REG annotation will ensure that the flip-flops for each synchroniser set will be placed closed to each other when synthesising the design.

These flip-flops will be assigned as follows:

//write synchroniser
//--------------------------------------------------------------------------------------
     always @(posedge WClk) 
     begin
       synchro_write_side_0 <= pNextWordToRead;
       synchro_write_side_1 <= synchro_write_side_0;
     end
//---------------------------------------------------------------------------------------     

//read synchroniser
//--------------------------------------------------------------------------------------
     always @(posedge RClk) 
     begin
       synchro_read_side_0 <= pNextWordToWrite;
       synchro_read_side_1 <= synchro_read_side_0;
     end
//---------------------------------------------------------------------------------------     


Please take note that each synchroniser gets clocked by a different clock.

Let us now see where these synchronisers will get used. Before we continue, I would just like to mention that I had to deuplicate the code for tboth the write side and the read side. So, let us first  start with the code on the write side:

//Empty/Full Handling on Write Side
//----------------------------------------------------------------------------------------------------
    //'EqualAddresses' logic:
    assign EqualAddresses_write_side = (pNextWordToWrite == synchro_write_side_1);

    //'Quadrant selectors' logic:
    assign Set_Status_write_side = (pNextWordToWrite[ADDRESS_WIDTH-2] ~^ synchro_write_side_1[ADDRESS_WIDTH-1]) &
                         (pNextWordToWrite[ADDRESS_WIDTH-1] ^  synchro_write_side_1[ADDRESS_WIDTH-2]);
                            
    assign Rst_Status_write_side = (pNextWordToWrite[ADDRESS_WIDTH-2] ^  synchro_write_side_1[ADDRESS_WIDTH-1]) &
                         (pNextWordToWrite[ADDRESS_WIDTH-1] ~^ synchro_write_side_1[ADDRESS_WIDTH-2]);


Here we have replaced all instances of pNextWordToRead with synchro_write_side_1.

Next, let us get rid of the transparent latch. First, let us have a look at the original code that inferred a transparent latch:

    //'Status' latch logic:
    always @ (Set_Status, Rst_Status, Clear_in) //D Latch w/ Asynchronous Clear & Preset.
        if (Rst_Status | Clear_in)
            Status = 0;  //Going 'Empty'.
        else if (Set_Status)
            Status = 1;  //Going 'Full'.

If you look closely at the code, you will identify many scenarios where there will be no assignment. In those scenarios we need to revert to one or other previous stored state. For this reason the above code will be inferred as a transparent latch.

To eliminate the need for a transparent latch we need to split the above into pieces that will infer into a pure computational logic block and a storage element. The result is as follows:

    always @*            
        if (Rst_Status_write_side | Clear_in)
          Status_write_side = 0;  //Going 'Empty'.
        else if (Set_Status_write_side)
          Status_write_side = 1;  //Going 'Full'.
        else
          Status_write_side = Status_write_prev_side; 
          
    always @(posedge WClk)
         Status_write_prev_side <= Status_write_side; 


So, we have a pure storage element Status_write_prev_side that store the contents of the computational block  Status_write_side at each clock cycle. So, in the case where there is no assignment happening for Status_write_side, we can just output the value of Status_write_prev_side.

Next, let us see what we can do to eliminate the need for a flip flop with an asynchronous preset. First, let us look again at the original code that will infer a flip-flop with an asynchronous preset:

    //'Full_out' logic for the writing port:
    assign PresetFull = Status & EqualAddresses;  //'Full' Fifo.
    
    always @ (posedge WClk, posedge PresetFull) //D Flip-Flop w/ Asynchronous Preset.
        if (PresetFull)
            Full_out <= 1;
        else
            Full_out <= 0;
            

Looking at this piece of code, one can immediately see why they needed to use an asynchronous flip-flop. In deriving PresetFull we had to use some values that gets assigned in the read clock domain. So, it would make sense to trigger the assignment the moment PresetFull transitions from a zero to a one rather than waiting for the Wclk to transition. In this way we can avoid a setup and hold violation.

However, with Xilinx FPGA's we still try and avoid these asynchronous presets. Since we safely moved over pNextWordToRead from the read domain to the write domain, we don't need such manoeuvres. So, the assignment of Full_out, just simplifies to:

   assign PresetFull_write_side = Status_write_side & EqualAddresses_write_side;  //'Full' Fifo.
            
   assign Full_out = PresetFull_write_side;             


This takes care of the Full indicator on the write side. For the empty indicator that is used on the read side, we have a similar set of code:

//----------------------------------------------------------------------------------------------------            
//Empty/Full Handling on Read Side
//----------------------------------------------------------------------------------------------------
    //'EqualAddresses' logic:
assign EqualAddresses_read_side = (synchro_read_side_1 == pNextWordToRead);

//'Quadrant selectors' logic:
assign Set_Status_read_side = (synchro_read_side_1[ADDRESS_WIDTH-2] ~^ pNextWordToRead[ADDRESS_WIDTH-1]) &
                     (synchro_read_side_1[ADDRESS_WIDTH-1] ^  pNextWordToRead[ADDRESS_WIDTH-2]);
                        
assign Rst_Status_read_side = (synchro_read_side_1[ADDRESS_WIDTH-2] ^  pNextWordToRead[ADDRESS_WIDTH-1]) &
                     (synchro_read_side_1[ADDRESS_WIDTH-1] ~^ pNextWordToRead[ADDRESS_WIDTH-2]);
                     
                     //reg                                 Status_write_side, Status_write_prev_side;
//'Status' latch logic:
        
always @*            
    if (Rst_Status_read_side | Clear_in)
      Status_read_side = 0;  //Going 'Empty'.
    else if (Set_Status_read_side)
      Status_read_side = 1;  //Going 'Full'.
    else
      Status_read_side = Status_read_prev_side; 
      
always @(posedge RClk)
     Status_read_prev_side <= Status_read_side; 
         
//'Full_out' logic for the writing port:
assign PresetEmpty_read_side = ~Status_read_side & EqualAddresses_read_side;  //'Full' Fifo.

assign Empty_out = PresetEmpty_read_side;

        
//----------------------------------------------------------------------------------------------------------

This is all the changes required to our design

The Results

I can confirm that the mentioned changes did in fact solve my issue and a static random pattern was displayed on screen.

I wanted to show a picture in this post on how the screen looks like with these changes, but the photo is not very clear. I better exercise would be to display a meaningful photo on the VGA screen.

To do this exercise we will make use of the XSCT console to write the contents of a image file  to the SDRAM of the ZYBO board.

Needless to say, this image file will need to contain raw pixel data in the format RGB565. The file format that comes close this is Microsoft's BMP format. Interesting enough, GIMP allows us to create a BMP file in the RGB565 format.

To do this open up the image you want to convert in GIMP and then select File/Export As.

Give a filename, suffix it with a .bmp extension and hit export. Specify the options in the option window as follows and hit the export button again:



We will then use this file and write its contents to the SDRAM of the Zybo board. You should remember though that the image file doesn't start with raw image straight away, but rather from byte offset 0x46 as deduced via the following article on Wikipedia:


So, because our image frame starts at address 0x200000 in Zybo SDRAM we should write our file at the address starting at 0x200000 - 0x46 to account for the header. Thus, we should write our file to SDRAM starting at address  0x1fffba.

With our Zybo board programmed and a program been kicked off via the Xilinx SDK, we should issue the following command via the XSCT console:

mwr -size b -bin -file /home/johan/Downloads/bm1360.bmp 0x1fffba 3000000

Obviously you need to specify your own file name.

With the image data written our VGA display looks as follows:


Static image indeed.

In Summary

In this post we fixed the issue where a non static image was shown onscreen.

This issue was caused by the following unsafe practices :

  • Using transparent latches
  • Using flip-flops with asynchronous presets.
In the next post we will attempt to display the output from our VIC-II to the VGA screen.

Till next time!