Saturday 12 February 2022

Initial steps with the Arty A7

Foreword

Hi All! It has been quite a while since my last post.

In my last post I was the bearer of bad news regarding getting an Amiga core to run on a Zybo FPGA, because the effective latency of the DDR RAM was just too much.

In the previous post I also hinted that the direction I want to take to solve this latency issue is to use an FPGA board that gives us raw access to DDR RAM. The board I have earmarked for this exercise is the Arty A7.

In this post I will give some feedback on my initial findings on using the Arty A7 board.

Initial Impressions

Having previously worked with the Zybo quite a bit, I got quite accustomed to the building blocks the Zynq SoC provides out of the box. On the Zynq there are two ARM cores, for which you can write software to perform the non-time-critical functionality, sparing you from having to work out how to implement it in the FPGA. Accessing RAM from the FPGA side of the Zynq is also fairly straightforward.

Lastly, the Zynq also provides a number of onboard peripherals, like USB and I2C, which you also program in software, once again simplifying your FPGA design.

When turning to the Arty A7, it initially feels like you are handed a blank canvas. You are handed a bunch of logic elements, but no ARM cores, no DDR RAM controller and no on-chip peripherals. Here I must give some credit to Vivado, which provides wizards that can generate some of the building blocks that the Zynq provides. However, these generated blocks do eat into the available elements in the FPGA.

Let us have a quick look at some of the wizards provided in Vivado. One of the wizards will create a MicroBlaze CPU core for you. This is a CPU core created by Xilinx which you are free to use within Vivado.

Another wizard that Vivado provides is the MIG (Memory Interface Generator) wizard. As the name suggests, this provides you with an interface for communicating with DDR RAM. One of the useful interfaces that the MIG provides is an AXI interface. AXI is also the interface that is used between the peripherals and the FPGA in the Zynq chip.

The MicroBlaze processor also supports the AXI interface, so in effect it is easy for the MicroBlaze processor to access DDR RAM via the generated MIG memory interface.

As you might have gathered, I will be putting all my efforts into understanding the design the MIG generates 😀, obviously trying to limit memory latency as much as possible, so that it is usable in an Amiga design.

My experience with a MIG design

I have found that using the MIG wizard is relatively painless, and your FPGA design can have access to the onboard DDR3 RAM in no time.

However, with the MIG-generated design I was once again faced with too much memory latency. I suspected that AXI might be adding some extra latency, so I tried to see whether the MIG provides an alternative interface.

Indeed, I found in the MIG documentation that the MIG does provide a native interface. Just from the name 'native interface', I was optimistic that I was in for minimal memory latency. To my dismay, I found that the native MIG interface yielded no latency improvement 😕

It was time to bring out the butcher knife and carve up the generated MIG Verilog code, trying to find and rip out any pipelining code that could potentially add to the latency.

Trying to customise the code generated by the MIG is a story in itself. In Vivado, most of the generated code is read-only, so the only real way to edit it is to take all the generated source files and add them to a new project that is not MIG aware. In this process you also need to remember to move over all the necessary constraints.

After some pain, I got to a point where I could customise a generated MIG design. This allowed me to hunt down where in the design the latency is happening. The bad news was that the component causing the latency was a closed-source component, very tightly knit into the MIG design. So, unfortunately, it was not a case of throwing this component out and replacing it with a different one.

So, in short, it didn't seem possible to tweak a MIG design to the point where latency is minimised.

However, the whole exercise with the MIG wasn't a waste. One of the useful outputs from the MIG process is the set of necessary constraints, like the pins of the FPGA that are connected to the DDR RAM chip.

Investigating alternative designs

Having reached a bit of a dead end with the MIG, I started looking around on the Internet for open source memory controller designs. In this process I did find a couple of possible candidates, which I will discuss in the next post.

Out of interest, I will share a solution on GitHub that looked very simple and promising:

https://github.com/ultraembedded/core_ddr3_controller

The nice thing about this core is that it is also designed to run on the Arty A7. The downside of this core is again its latency, which gave it a maximum throughput of 5MHz.
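To put that number into perspective, here is a small back-of-the-envelope sketch in Python. It assumes the 5MHz figure can be read as roughly five million memory accesses per second, issued back to back. The 100MHz clock is purely a hypothetical value for illustration and is not taken from the core or from the Arty A7 documentation.

```python
def access_time_ns(accesses_per_second: float) -> float:
    """Time taken by one access, in nanoseconds, when accesses run back to back."""
    return 1e9 / accesses_per_second


def cycles_per_access(accesses_per_second: float, clock_hz: float) -> float:
    """Number of clock cycles that elapse per access at the given clock frequency."""
    return clock_hz / accesses_per_second


if __name__ == "__main__":
    rate = 5e6     # the 5MHz figure quoted above, read as accesses per second
    clock = 100e6  # hypothetical clock, for illustration only
    print(f"{access_time_ns(rate):.0f} ns per access")
    print(f"{cycles_per_access(rate, clock):.0f} cycles per access at {clock / 1e6:.0f} MHz")
```

Under those assumptions, each access takes about 200 ns, so any design that needs a memory access to complete faster than that cannot use the core as is.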

In Summary

In this post I gave my initial findings on the Arty A7 board.

I also gave a broad outline of the detours I took in an attempt to get to a memory controller design with minimal latency. Unfortunately, I wasn't able to arrive at a solution in this post.

In the next post we will look into another open source memory controller available on GitHub that has some merit for what we want to achieve.

In the next post, I will also be covering details about the DDR3 protocol.

Till next time!
