Thursday, 26 May 2022

Starting with a memory tester on the Arty

Foreword

In the previous post we managed to write a value to a memory location, and read the same value back.

In this post we will create a very simple memory tester, just to stress test the memory a tiny bit and see if there are any obvious setup and hold timing violations resulting in data corruption.

This is a habit I have grown into, since your design might operate fine for a few clock cycles, but then show a weird glitch after a couple of thousand clock cycles because of a setup and hold violation. So, it is always good to stress test bits of your design as soon as possible, to avoid a lot of rework.

With the memory tester I will be covering in this post, I will write test data to a couple of rows in DDR RAM, wait for about 20 seconds, and see if I can read back the correct value from a particular memory location.

Obviously, for this test I will also need to implement some refresh logic, so that the data doesn't leak away from the tiny capacitors in the DDR RAM during the 20 seconds of waiting. 

Abstracting the memory tester from Technical details

The Memory tester we will be developing in this post issues a series of write and read commands. In the future this memory tester will be replaced by the Amiga core, which will then be issuing these commands.

The Amiga core doesn't understand the technical details of DDR memory, like splitting an address into a separate row address and column address. The Amiga core also doesn't know that for any memory read/write command you first need to activate a row and afterwards precharge it.

For all these reasons, we need to abstract the technical details of DDR memory from our memory tester.

The abstracted interface for our Memory tester looks as follows:

    
module mem_tester (
    output reg select,
    output reg write,
    output [15:0] address_out
    );

endmodule
The address_out port is a linear address. Outside this module it will be converted to row and column addresses. For now, I am only going to make the width of this output 16 bits.

The Memory tester will use the select output to assert a command and indicate if it is a read/write command with the write output.

We need to add some more ports to our memory tester:

module mem_tester(
...
    input clk,
    input [2:0] cmd_status,
    output reg refresh = 0,
    output wire [127:0] data_out,
    input [127:0] data_in
...
    );
	
endmodule
For our input clock, I want to use a frequency of 20MHz, which is close to the frequency used by the Amiga core.

I can hear a couple of screams at this point: "Cross clock domains!". Indeed, crossing clock domains is always a pain to work with 😀.

However, in the past couple of months I have discovered that, when working with a Mixed-Mode Clock Manager (MMCM) in Xilinx FPGAs, working with cross domain clocks is not so bad. With an MMCM you can align the rising edges of different output clocks. Provided that the frequency of the faster clock is an integer multiple of the frequency of the slower clock, every rising edge of the slower clock will line up with a rising edge of the faster clock.

The next port, cmd_status, tells our memory tester when the memory is ready to accept the next command.

You will note that I also have a refresh output port, indicating that our Memory tester is also responsible for performing memory refresh. This goes a bit against our goal of abstracting away the technical details of DDR RAM, but I found it difficult to orchestrate a refresh from the outside with the different clock domain.

Finally, we are sending and receiving data 128 bits at a time. This is because the DDR RAM works with bursts of eight 16-bit words. Again, this is a bit of a mismatch with the Amiga core, which works with only 16 bits at a time, but we will handle it in future when we get there.

Coding the Memory Tester

With the ports defined for our memory tester, let us start with some code. We start with a state machine:

    always @(posedge clk)
    begin
        case (state)
              0: begin
                      if (cmd_status == 1)
                      begin
                          if (refresh_underflow)
                          begin
                            refresh <= 1;
                            select <= 1;
                            state <= 2;
                          end else if (address[15:14] != 2'b11)
                          begin
                            write <= 1;
                            select <= 1;
                            state <= 2;
                          end else if (wait_for_read == 0)
                          begin
                            write <= 0;
                            select <= 1;
                            state <= 2;
                          end
                      end 
                 end
              2: begin
                     select <= 0;
                     refresh <= 0;
                     state <= wait_for_read == 0 ? 3 : 0;
                 end
        endcase
    end
State 0 is an idle state, where we wait for the memory to become ready. When the memory controller is ready, we first need to check if the memory is due for a refresh.

The address register keeps track of which address we need to write test data to. Once the two most significant bits of the address register are both 1, we are finished with writing. From this point onwards we wait for wait_for_read to count down to zero, which signals that we need to read data from a particular location.

We need state 2 to immediately deassert the command we issued in the previous state. We also want to stop issuing commands once we have issued a read command, which we do by assigning 3 to state.

Next, let us have a look at other snippets of code on which our state machine depends. First, the refresh logic:

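    always @(posedge clk)
    begin
        // Latch a pending refresh request when the counter reaches zero,
        // and clear it once the refresh command has been issued.
        if (refresh_counter == 0)
        begin
            refresh_underflow <= 1;
        end else if (refresh)
        begin
            refresh_underflow <= 0;
        end
    end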
    
    always @(posedge clk)
    begin
        if (refresh_counter > 0)
        begin
            refresh_counter <= refresh_counter - 1;
        end else
        begin
            refresh_counter <= 120;
        end        
    end
The refresh counter continuously counts down from 120 to zero. With our module clocked at 20MHz, this means the counter reaches zero about every 6 microseconds, which is in line with the specs of our DDR RAM stating that a refresh command should be issued roughly every 7 microseconds.

The refresh_underflow register remembers that a refresh needs to happen, and gets cleared as soon as the refresh command has been issued.
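The arithmetic behind the reload value can be spelled out in a small simulation-only sketch. This is not part of the design: the module name refresh_interval_sketch and the parameter names are made up for illustration, and the 50 ns period simply follows from the 20MHz clock mentioned earlier.

```verilog
// Simulation-only sketch; module and parameter names are hypothetical.
module refresh_interval_sketch;
    localparam integer CLK_PERIOD_NS  = 50;   // one cycle of the 20 MHz clock
    localparam integer COUNTER_RELOAD = 120;  // value loaded into refresh_counter

    // The counter visits 120, 119, ..., 1, 0 before it is reloaded,
    // so one full round takes COUNTER_RELOAD + 1 clock cycles.
    localparam integer REFRESH_INTERVAL_NS = (COUNTER_RELOAD + 1) * CLK_PERIOD_NS;

    initial begin
        $display("refresh requested every %0d ns", REFRESH_INTERVAL_NS);
        $finish;
    end
endmodule
```

With these numbers the interval works out to 121 x 50 ns = 6050 ns, which is the "about 6 microseconds" quoted above.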

Next, let us look at the code that keeps track of the address we need to write to:

    always @(posedge clk)
    begin
        if (state == 2 && !refresh)
        begin
            address <= address + 8;
        end
    end
We advance to the next address once we are finished with a write command. We advance by 8 instead of 1 because of the bursty nature of DDR RAM: every command transfers eight words. We also don't want to advance the address if the previous command was a refresh.

Let us have a look at data generation for the writes:

...
    assign data_out = {data_counter, 3'b000,
                       data_counter, 3'b001,
                       data_counter, 3'b010,
                       data_counter, 3'b011,
                       data_counter, 3'b100,
                       data_counter, 3'b101,
                       data_counter, 3'b110,
                       data_counter, 3'b111};
...					   
    always @(posedge clk)
    begin
        if (state == 2 && !refresh)
        begin
            data_counter <= data_counter + 1;
        end
    end
...
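Here we create the data for a full burst of eight words at a time. Every word consists of the current data_counter value followed by a 3-bit index giving the position of the word within the burst.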
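Because the pattern is predictable, a read-back burst could in principle be checked against it. The following is only a sketch of how such a check might look. The module burst_pattern_check and its ports are hypothetical and not part of the design in this post, and it assumes data_counter is 13 bits wide, so that eight words add up to 128 bits.

```verilog
// Hypothetical checker, not part of the design in this post.
// Assumes data_counter is 13 bits wide (13 + 3 = 16 bits per word).
module burst_pattern_check (
    input  wire [12:0]  expected_counter, // counter value used when the burst was written
    input  wire [127:0] data_in,          // burst read back from memory
    output wire         match             // high when all eight words agree
    );

    // Rebuild the burst exactly the way data_out is built in the tester.
    wire [127:0] expected = {expected_counter, 3'b000,
                             expected_counter, 3'b001,
                             expected_counter, 3'b010,
                             expected_counter, 3'b011,
                             expected_counter, 3'b100,
                             expected_counter, 3'b101,
                             expected_counter, 3'b110,
                             expected_counter, 3'b111};

    assign match = (data_in == expected);
endmodule
```

A single swapped or corrupted word within the burst would show up as match going low, since the 3-bit index makes every word in a burst unique.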

Let us next create the logic where we wait for the read:

...
    reg [31:0] wait_for_read = 400000000;
...	
    always @(posedge clk)
    begin
        if (wait_for_read > 0)
        begin
            wait_for_read <= wait_for_read - 1;
        end
    end
...
This snippet will wait for about 20 seconds (400 million cycles at 20MHz) before doing a read. The state machine will finish writing all the test data long before then, and will keep refreshing the DDR RAM until it is time to do the read.

Adding the Memory Tester to the existing design

Let us now add our Memory Tester to our existing design. From my previous post, you will remember that I have implemented our logic as another state machine in the module mcontr_sequencer. Within this module we will also place an instance of the memory tester, as follows:

mem_tester m2(
    .clk(memtest_out),
    .cmd_status(cmd_status),
    .select(cmd_valid),
    .write(write_out),
    .address_out(cmd_address),
    .refresh(refresh_out),
    .data_out(cmd_data_out),
    .data_in()
    );
We need to change the code of the state machine living within the module mcontr_sequencer a bit, so that it works with our memory tester.

The first change is as follows:

    always @(posedge mclk)
    begin
        if (start_init)
        begin
            case (state)
			...
			... initialise the memory ...
			...
              PREPARE_CMD: begin
                  test_cmd <= 32'h000001ff;
                  do_capture <= 0;
                  state <= WAIT_CMD;
                  cmd_status <= 1;
              end
              WAIT_CMD: begin
                  if (cmd_valid)
                  begin
                      if (refresh_out)
                      begin
                          state <= REFRESH_0;
                          cmd_status <= 2; 
                      end else begin
                          state <= STATE_PREA;
                          test_cmd <= {1'b0, 8'b0, cmd_address[15:10], 1'b0, 16'h21fd};
                          data_in <= cmd_data_out;
                          column_address <= cmd_address[9:0];
                          do_write <= write_out;
                          cmd_status <= 2;
                      end
                  end
              end			
			  ...
            endcase
        end
    end
We get into the state PREPARE_CMD, right after the memory was initialised, which we have covered in a previous post. Within the state PREPARE_CMD we set cmd_status to 1. This signals our memory tester that it is free to submit a command.

We then wait in the state WAIT_CMD until the memory tester has given us a command. We will cover the refresh command a bit later.

When we have received a read/write command, the first thing we need to do is to activate the row in question. You can see from the snippet above that we get the row address by looking at bits 15:10 of cmd_address. The lower bits of the address, bits 9:0, form the column address, and we save this for later use.
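The split of the linear address can be shown in isolation. This is a minimal sketch only: in the actual design the split happens inline in the WAIT_CMD state, and the module name addr_split_sketch and its port names are made up for illustration.

```verilog
// Hypothetical helper; the design above does this split inline in WAIT_CMD.
module addr_split_sketch (
    input  wire [15:0] linear_address,  // address as issued by the memory tester
    output wire [5:0]  row_address,     // bits 15:10 select the row
    output wire [9:0]  column_address   // bits 9:0 select the column
    );

    assign row_address    = linear_address[15:10];
    assign column_address = linear_address[9:0];
endmodule
```

As an example, the linear address 16'h0C08 would end up as row 3 and column 8.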

In the next state we get the dq bus ready for either reading or writing, and in subsequent states we wait for the ACTIVATE phase of the DDR RAM to complete:

              ...
              STATE_PREA: begin
                  state <= STATE_WAIT_READ_PATTERN_0;
                  dq_tri <= do_write ? 0 : 15;
                  test_cmd <= 32'h000001ff;                  
              end
			  .... wait until activate is complete ...
Once activation is complete, we can issue the command for reading/writing a column:

              ISSUE_CMD: begin
                  state <= ASSERT_ODT;
                  test_cmd <= {1'b0, 4'b0, column_address, 1'b0, 4'h1, 
                      (do_write ? 2'b11 : 2'b00), 10'h1fd};
              end

              ASSERT_ODT: begin
                  test_cmd <= do_write ? 32'h000005ff : 32'h000001ff;
                  state <= WAIT_CMD_FINISHED;
              end
With writes we need to assert the ODT line, which we do in the ASSERT_ODT state.

Finally after completing a read/write, we need to precharge the row, which I am not going to show here.

In our state machine we still need to serve the Refresh command. Apart from assigning a value to test_cmd to trigger the Refresh command, we also need to honour the timing period tRFC after issuing the command, which is 160ns for the DDR RAM chip we use on the Arty A7 board. 

We also implement this delay in our state machine, so our memory tester only needs to issue the refresh command and doesn't need to worry about timing the tRFC delay. Our state machine implementing the refresh command will look like this:

...
reg                   [3:0] refresh_wait = 14;
...
             REFRESH_0: begin
                  test_cmd <= 32'h000031ff;
                  state <= REFRESH_1;
              end

              REFRESH_1: begin
                  test_cmd <= 32'h000001ff;
                  if (refresh_wait > 0)
                  begin
                      refresh_wait <= refresh_wait - 1;
                  end else
                  begin
                      refresh_wait <= 14;
                  end
                  if (refresh_wait == 0)
                  begin
                      state <= PREPARE_CMD;
                  end
              end

Since our state machine is operating in the 83MHz domain, where one clock cycle takes about 12ns, 14 cycles give us 168ns, which is in line with the 160ns required for tRFC.
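The value of 14 can also be derived instead of hard coded. The following simulation-only sketch shows one way of doing that. The module name trfc_cycles_sketch and the parameter names are hypothetical, and the 12ns clock period is an assumption that matches the 168ns figure above.

```verilog
// Simulation-only sketch; module and parameter names are hypothetical.
module trfc_cycles_sketch;
    localparam integer TRFC_NS        = 160; // tRFC as quoted in this post
    localparam integer MCLK_PERIOD_NS = 12;  // assumed period of the 83 MHz clock

    // Integer division rounds down, so add (period - 1) first to round up.
    // This way the wait is never shorter than tRFC.
    localparam integer TRFC_CYCLES = (TRFC_NS + MCLK_PERIOD_NS - 1) / MCLK_PERIOD_NS;

    initial begin
        $display("%0d cycles = %0d ns", TRFC_CYCLES, TRFC_CYCLES * MCLK_PERIOD_NS);
        $finish;
    end
endmodule
```

With these numbers the calculation gives 14 cycles, or 168ns. If the clock frequency ever changes, only the period parameter would need to be updated.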

Test Results

Running our design on the FPGA returned the results I expected when I read back a test value from memory after 20 seconds.

For the moment, I have nothing else to report back on 😄

In Summary

In this post we have implemented a very simple memory tester where we write a volume of data to memory, wait 20 seconds and read a test value back.

In the next post we will continue to chip away at our memory controller.

One thing I am aware I should give attention to in our memory controller is reducing latency, so that we can comfortably keep up with 7MHz, which is the memory access rate the Amiga core requires. We will give attention to that in the next post.

Till next time!