Thursday 5 January 2023

SD Card Access for an Arty A7: Part 5

Foreword

In the previous post we replaced our state machine with a 6502 CPU and a machine code program for issuing commands to the Gisselquist SD Card core. We managed to issue an IDLE command to an SD Card with the 6502 core.

In this post we will continue our endeavour of trying to access an SD Card by means of the Gisselquist SD Card core and a 6502 CPU.

I mentioned in the previous post that in this post I wanted to finish off the process of powering up and initialising the SD Card, followed by reading some data from it. However, I found that the process of reading data from the SD Card is quite involved, so to keep things simple I will only cover initialising the SD Card in this post.

Revisiting 6502 Assemblers

In the previous post I wrote 6502 machine code manually. The required machine code was fairly straightforward, so doing the process manually wasn't much of a problem.

However, from this point onwards the complexity of the machine code will only increase, so it makes sense to use an assembler instead.

With an assembler, the code remains readable and self-documenting.

The Assembler I have chosen for this purpose is the following online one:

https://www.masswerk.at/6502/assembler.html

During the course of this post, I will give a gradual introduction to this assembler. Let us start with a quick outline:

.ORG $FF00
     ; Assembly language instructions
ENDROM = $FFFF-*-3
.FILL ENDROM 00
.BYTE 0, $FF, 00, 00
We start with the directive .ORG, specifying the start address of our program. The assembler needs this information to calculate things like the absolute address of a label you jump to.

Next we declare a symbol ENDROM, using the current address at the end of our assembly language program, denoted by *. At any point in the assembly listing you can get the current address via this asterisk. The expression $FFFF-* is one less than the number of bytes remaining to reach a total ROM size of 256 bytes. Subtracting a further three therefore leaves a four-byte gap at the end of the ROM for our reset and IRQ vectors.

With the .FILL directive, we add a number of padding bytes. ENDROM is the calculated number of bytes needed to pad the program up to the vectors at $FFFC, and .FILL emits them.

The .BYTE directive allows us to emit one or more bytes of data. In this case these are the reset vector ($FF00, stored low byte first) and the IRQ vector.
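
To double-check the arithmetic behind ENDROM, here is a small sketch in Python. It is not part of the FPGA design; the function names are made up for illustration, and it simply mirrors the expression from the outline.

```python
ROM_END = 0xFFFF      # last address of the 256-byte ROM

def endrom(current_address: int) -> int:
    """Mirror the assembler expression ENDROM = $FFFF-*-3."""
    return ROM_END - current_address - 3

def rom_image(program: bytes, origin: int = 0xFF00) -> bytes:
    """Pad a program the way the outline does and append the vectors."""
    padding = bytes(endrom(origin + len(program)))
    vectors = bytes([0x00, 0xFF, 0x00, 0x00])   # same as .BYTE 0, $FF, 00, 00
    return program + padding + vectors

image = rom_image(b"")            # the empty outline program
print(len(image))                 # 256
print(hex(endrom(0xFF00)))        # 0xfc
```

Whatever the program length, the padding shrinks by the same amount, so the image always comes out at 256 bytes with the vectors in the last four.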

To get an idea of what this outline program assembles to, let us enter it into the above-mentioned assembler:


As can be seen from the picture, we have a run of zeros starting at address $FF00. If you scroll down, you will see that the output stops at address $FFFF:


Thus, the resulting binary is exactly 256 bytes, which is what we want.

As can also be seen from these screenshots, there is a Show Address checkbox. Unchecking it removes the address from each line, which makes it easy to create the hex file Vivado requires to populate a ROM.

Reducing repetition

In software development we have a very common term called DRY: Don't Repeat Yourself.

Well, in the previous post I wrote some 6502 machine code where I repeated the same set of instructions for different pieces of data. We can do better by encapsulating the code in loops and subroutines, and by storing the data in lookup tables.

Let us start with the command for setting the clock speed of sclk, and express it in a lookup table:

DATA: 
     .BYTE $55, $55, $55, $0B
     .BYTE $00, $00, $00, $C0 ; CMD C0
So, here we first present the data for setting the data register in the SDSPI core, and then the actual command.

Let us add one more command and see if we can start to spot some patterns:

DATA: 
     .BYTE $55, $55, $55, $0B
     .BYTE $00, $00, $00, $C0 ; CMD C0
     .BYTE $FF, $FF, $FF, $FF
     .BYTE $00, $00, $00, $40 ; CMD 40
We see that each command has a size of 8 bytes. We can use a zero based index for accessing the bytes for a particular command from the lookup table. For example, for Command $C0 we will use index 0 and for command $40 we will use index 1.

To deal with lookups from a table, the 6502 provides us with the Indirect Indexed addressing mode. Let us start with a basic loop for sending a command:

LOOP:
     LDA ($A0),Y
     STA $FE00,X
     INY
     DEX
     BPL LOOP
From this we can see that the zero page addresses $A0 and $A1 should contain the base address of the lookup table, which we initialise at the beginning like this:

.ORG $FF00
     LDX #$FF
     TXS
     LDA #<DATA
     STA $A0
     LDA #>DATA
     STA $A1
A couple of initialisation steps are happening here. First we initialise the stack pointer with the value $FF. The 6502 doesn't do this at startup, and if you forget it the stack pointer of the Arlet core will show XX (undefined) during simulation.

Both "<" and ">" are assembler operators, yielding the low byte and the high byte respectively of a label's address.
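
As a quick illustration of what these two operators compute, here is a sketch in Python. The address used for DATA is hypothetical; the real one depends on where the assembler places the table.

```python
def lo(address: int) -> int:
    """Low byte of a 16-bit address, like the assembler's < operator."""
    return address & 0xFF

def hi(address: int) -> int:
    """High byte of a 16-bit address, like the assembler's > operator."""
    return (address >> 8) & 0xFF

DATA = 0xFF48   # hypothetical address of the DATA label
zero_page = {0xA0: lo(DATA), 0xA1: hi(DATA)}

# The 6502 reads the pointer back low byte first.
pointer = zero_page[0xA0] | (zero_page[0xA1] << 8)
print(hex(pointer))   # 0xff48
```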

Let us focus on the loop code again. The Y register points to a specific byte in the lookup table and is incremented to the next byte with each iteration of the loop.

The X register starts with a value of 7 and counts down to zero. This transfers the data of a lookup entry to addresses $FE07 down to $FE00. As we saw in the previous post, these addresses map to the Gisselquist SD Card core.

One question that remains is how Y is initialised. The journey starts with the command index stored in the Accumulator, after which we do the following:

     ASL
     ASL
     ASL
     TAY
This is equivalent to multiplying the command index by 8.
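
To make the interplay of X and Y concrete, the following Python sketch replays the loop for a given command index and lists which table byte lands at which address. It only models the loop and does not talk to any hardware; the function name is made up for illustration.

```python
TABLE = bytes([
    0x55, 0x55, 0x55, 0x0B, 0x00, 0x00, 0x00, 0xC0,   # index 0: CMD C0
    0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x40,   # index 1: CMD 40
])

def command_writes(index: int, table: bytes = TABLE):
    """Return the (address, value) writes made by the copy loop."""
    y = (index << 3) & 0xFF          # three ASLs: index * 8
    writes = []
    for x in range(7, -1, -1):       # X counts down from 7 to 0
        writes.append((0xFE00 + x, table[y]))
        y += 1                       # INY
    return writes

for address, value in command_writes(1):
    print(f"${address:04X} <- ${value:02X}")
```

Note that the first byte of an entry ends up at $FE07, while the command byte, the last byte of the entry, is written to $FE00 at the very end.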

This covers more or less what is required to issue a command to the SD Card. There is, however, one caveat we haven't dealt with in the code, and that is that we should wait for the SD Card to complete a command before issuing the next one.

The way to check this is to continuously poll address 0 of the Gisselquist core and see if the busy bit, which is bit 14, is cleared.
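
Since the 6502 can only look at one byte at a time, it helps to see where bit 14 ends up. The Python sketch below assumes that $FE00 holds bits 0-7 and $FE01 holds bits 8-15 of the register, which is what the BIT $FE01 instruction in the subroutine further down relies on.

```python
BUSY_BIT = 14   # busy flag in the core's 32-bit register at address 0

def is_busy(fe01_byte: int) -> bool:
    """Test the busy flag the way BIT $FE01 / BVS does.

    Assumes $FE00 holds bits 0-7 and $FE01 holds bits 8-15 of the register,
    so bit 14 of the word is bit 6 of the byte at $FE01.
    """
    return bool(fe01_byte & (1 << (BUSY_BIT - 8)))

status_word = 1 << BUSY_BIT                 # hypothetical status: only busy set
fe01 = (status_word >> 8) & 0xFF
print(hex(fe01), is_busy(fe01))             # 0x40 True
```

The BIT instruction copies bit 6 of its operand into the overflow flag, which is why a BVS can be used to keep polling.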

Having considered all this, we end up with the following subroutine for issuing a command to the SD Card:
 
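CMD:
     ASL               ; command index * 8 = offset of the table entry
     ASL
     ASL
     TAY
     LDX #$07          ; 8 bytes per entry, written to $FE07 down to $FE00
LOOP:
     LDA ($A0),Y
     STA $FE00,X
     INY
     DEX
     BPL LOOP
     AND #$80          ; A still holds the command byte; test bit 7
     BMI END           ; bit 7 set: core command, no wait needed
BUSY
     LDA $FE00         ; reading $FE00 triggers a wishbone read
     BIT $FE01         ; bit 6 of $FE01 (busy) goes to the V flag
     BVS BUSY          ; keep polling while busy
END
     RTS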
One thing that might look a bit strange is that we AND the command byte, which is always the last byte in a command entry of the lookup table, with $80. Here we want to test whether the command is a true SD Card command (which always starts with the bits 01) or a command dedicated to the Gisselquist core. There is always a wait associated with an SD Card command, but not with a Gisselquist core command.

Verilog issues

While I was testing the 6502 machine code I developed in this post, I discovered a couple of flaws with my existing FPGA design.

The first issue occurs when executing the instruction STA $FE00,X with X equal to 0 or 4, which triggers a wishbone bus operation.

With this instruction the complete address is asserted on the address bus for two consecutive clock cycles, and the write line is asserted only during the second cycle.

When the full address is asserted during the first clock cycle, the system assumes a memory read because the write signal is not asserted. With normal block RAM this is not an issue and just results in a redundant read.

However, with addresses FE00 and FE04 things get more complicated, since these trigger a wishbone read transaction. As we know by now, wishbone reads assert the RDY signal on the 6502 during some clock cycles.

All in all, things get complicated when you trigger a read on the wishbone bus in one clock cycle and a write in the next. This sequence puts the 6502 and the Gisselquist core out of sync with each other, and the wrong values get written.

There are probably a number of ways to solve this, but the easiest one I could come up with was to add a register to our FPGA design that instructs the system to ignore all reads to the wishbone bus. When we reach a point in our program where we do a couple of writes via Absolute,X instructions, we just need to set this register so that reads to the wishbone bus are ignored.

Let us implement this register:

...
reg [7:0] ignore_reads = 0;
...
assign wb_stb = cpu_address[15:8] == 8'hfe && on_word_boundary && !(ignore_reads[0] && !we_6502);
...
always @(posedge gen_clk)
begin
  if (we_6502 && cpu_address[15:8] == 8'hfe)
  begin
    if (cpu_address[1:0] == 2'h1 && !cpu_address[3])
    begin
        reg_1 <= cpu_data_out;
    end else if (cpu_address[1:0] == 2'h2 && !cpu_address[3])
    begin
        reg_2 <= cpu_data_out;
    end else if (cpu_address[1:0] == 2'h3 && !cpu_address[3])
    begin
        reg_3 <= cpu_data_out;
    end else if (cpu_address[3:0] == 11)
    begin
       ignore_reads <= cpu_data_out;
    end
  end
end
...
I have highlighted the changes in bold. I have made ignore_reads 8 bits wide, in case we need additional signals later on.

With these changes our memory map in the FE00 range is like this:
  • FE00-FE07: SD Card core registers
  • FE0B: Ignore reads
I will present a full Assembly listing at the end of this post to show how FE0B should be used.
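
To see the effect of the ignore_reads register on the two bus cycles of STA $FE00,X, here is a rough Python model of the wb_stb assign shown above. The definition of on_word_boundary is an assumption on my side (two low address bits clear), since that signal comes from the earlier design.

```python
def on_word_boundary(address: int) -> bool:
    # Assumption: the design treats addresses with the two low bits clear
    # (e.g. $FE00 and $FE04) as word boundaries.
    return (address & 0x3) == 0

def wb_stb(address: int, we_6502: bool, ignore_reads: int) -> bool:
    """Python model of the wb_stb assign shown above."""
    in_fe_page = (address >> 8) == 0xFE
    read_suppressed = bool(ignore_reads & 1) and not we_6502
    return in_fe_page and on_word_boundary(address) and not read_suppressed

# STA $FE00,X puts the address on the bus for two cycles: first without the
# write line, then with it.
print(wb_stb(0xFE00, we_6502=False, ignore_reads=1))   # False
print(wb_stb(0xFE00, we_6502=True,  ignore_reads=1))   # True
print(wb_stb(0xFE00, we_6502=False, ignore_reads=0))   # True
```

With bit 0 of ignore_reads set, only the write cycle produces a strobe, so the spurious read never reaches the wishbone bus.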

Another thing we need to implement in Verilog is to map block RAM for Zero Page and the Stack. This is very similar to what we did in the previous post where we mapped ROM in the space FF00-FFFF, so I will not be covering it here.

Looking Deeper into SD Card commands

Up to this point we have only used the SD Card IDLE command. Let us have a look at some other commands, with a focus on initialising an SD Card.

I will try and be brief about these commands. If you want more detail on these commands, you can consult the following sources:


Let us start by looking at the command CMD8, which tells us whether the card is indeed an SD Card or an MMC card. In my case I will only call this command out of curiosity, to confirm that the card is indeed an SD Card. At this point I do not expect the 6502 program to make any decision based on whether the card is an SD Card or an MMC card.

The Byte definition in 6502 assembly for this command is as follows:

     .BYTE $00, $00, $01, $AA
     .BYTE $00, $00, $02, $48 ; CMD 48
From this we can see that CMD8 starts with the byte $48, followed by four bytes which end with the bytes $01 and $AA.

You will notice that next to the command byte there is a byte with the value 2. This value informs the Gisselquist core what type of response we are expecting, which in this case is a response byte followed by four more bytes. This information is important so that we read the correct number of bits from the serial line.
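
If you wonder where a command byte like $48 comes from: the first byte of an SD command frame consists of the bits 01 followed by the 6-bit command index. The Python sketch below (the function name is made up) derives the command bytes used in the lookup table of this post.

```python
def command_byte(index: int) -> int:
    """First byte of an SD command frame: start bit 0, transmission bit 1,
    then the 6-bit command index."""
    if not 0 <= index <= 63:
        raise ValueError("command index must fit in 6 bits")
    return 0x40 | index

for name, index in [("CMD0", 0), ("CMD8", 8), ("CMD55", 55),
                    ("ACMD41", 41), ("CMD58", 58)]:
    print(f"{name}: ${command_byte(index):02X}")
```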

The next command of interest is CMD58. This basically tells us the voltages that the Card supports. The definition in Assembly code is the following:

     .BYTE $FF, $FF, $FF, $FF
     .BYTE $00, $00, $02, $7A ; CMD 7A
This is also a command where we get one response byte back followed by four bytes. The format of the trailing four bytes is as follows:

Here I am using a diagram from http://www.rjhcoding.com. The most interesting bits are bits 15-23, indicating the voltages the SD Card can handle. According to the SD Card spec the generally accepted voltage is 3.3V. However, the SD Card I am testing with has all of bits 15-23 set to one, meaning that it can work across the voltage range 2.7V-3.6V. I am not sure how much other SD Cards differ.

Another interesting bit is bit 31. While the card is powering up, this bit will be 0, and it will change to 1 once power up is completed.
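
As a small aid for reading these four bytes, here is a Python sketch that picks out the fields mentioned in this post from a 32-bit OCR value. The example value is hypothetical, and bit 31 is treated as reading 1 once power up has completed, as in the capture discussed later.

```python
def decode_ocr(ocr: int) -> dict:
    """Pick out the OCR fields discussed above from a 32-bit value."""
    return {
        "power_up_complete": bool((ocr >> 31) & 1),   # bit 31
        "bit_30": bool((ocr >> 30) & 1),              # bit 30
        "voltage_bits": (ocr >> 15) & 0x1FF,          # bits 15-23, 2.7V-3.6V
    }

ocr = 0xC0FF8000   # hypothetical value: bits 31, 30 and 15-23 set
fields = decode_ocr(ocr)
print(fields)
print(fields["voltage_bits"] == 0x1FF)   # True: full 2.7V-3.6V range
```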

Let us move on to the commands that perform the actual initialisation. This involves two separate commands, CMD55 and ACMD41. The first command signals that the next command will be an application-specific command, which is ACMD41.

The assembly byte definitions for these commands are as follows:

     .BYTE $00, $00, $00, $00
     .BYTE $00, $00, $00, $77 ; CMD 55
     .BYTE $40, $00, $00, $00
     .BYTE $00, $00, $00, $69 ; CMD 41
You will notice that the data for ACMD41 starts with the byte $40. This sets bit 30 of the 32-bit command argument, the Host Capacity Support (HCS) bit, which tells the card that the host can handle high capacity cards.

CMD55 and ACMD41 need to be called continuously in a loop, and during each iteration you need to check the response byte of the ACMD41 command. When the response byte has transitioned from 1 (in idle state, i.e. still initialising) to 0 (initialised), the SD Card initialisation has completed and the card is ready to accept read/write commands.
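
The polling loop can be sketched in Python as follows. This is only a model of the control flow: send_command is a hypothetical stand-in for issuing a command by its table index and returning the response byte, and the attempt limit is my own addition (the 6502 program in this post simply loops until the card is ready).

```python
def initialise(send_command, max_attempts: int = 1000) -> int:
    """Repeat CMD55 + ACMD41 until the idle bit (bit 0) of R1 clears.

    send_command is a hypothetical callable taking a command table index and
    returning the R1 response byte. Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        send_command(4)                 # table index 4: CMD55
        r1 = send_command(5)            # table index 5: ACMD41
        if (r1 & 1) == 0:               # same test as ROR A / BCS INIT
            return attempt
    raise TimeoutError("card did not leave the idle state")

# Fake card for illustration: stays idle for the first two ACMD41 attempts.
responses = iter([0x01, 0x01, 0x00])
def fake_send(index: int) -> int:
    return next(responses) if index == 5 else 0x01

print(initialise(fake_send))   # 3
```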

The full program

Here is the full program listing:

.ORG $FF00
     LDX #$FF
     TXS
     LDA #<DATA
     STA $A0
     LDA #>DATA
     STA $A1
     LDX #1
     STX $FE0B
START:
       LDA #0
       JSR CMD
       LDA #1
       JSR CMD
       LDA #2
       JSR CMD
       LDA #3
       JSR CMD
INIT
       LDA #4
       JSR CMD
       LDA #5
       JSR CMD
       ROR A
       BCS INIT 
       LDA #3
       JSR CMD
       LDA #2
       STA $FE0B
       LDA $FE04
DONE
       JMP DONE
CMD:
     ASL
     ASL
     ASL
     TAY
     LDX #$07
LOOP:
     LDA ($A0),Y
     STA $FE00,X
     INY
     DEX
     BPL LOOP
     AND #$80
     BMI END
     LDX #0
     STX $FE0B
BUSY
     LDA $FE00
     BIT $FE01
     BVS BUSY
END
     LDX #1
     STX $FE0B
     RTS
.ALIGN $8
DATA: 
     .BYTE $55, $55, $55, $0B
     .BYTE $00, $00, $00, $C0 ; CMD C0
     .BYTE $FF, $FF, $FF, $FF
     .BYTE $00, $00, $00, $40 ; CMD 40
     .BYTE $00, $00, $01, $AA
     .BYTE $00, $00, $02, $48 ; CMD 48
     .BYTE $FF, $FF, $FF, $FF
     .BYTE $00, $00, $02, $7A ; CMD 7A
     .BYTE $00, $00, $00, $00
     .BYTE $00, $00, $00, $77 ; CMD 55
     .BYTE $40, $00, $00, $00
     .BYTE $00, $00, $00, $69 ; CMD 41
ENDROM = $FFFF-*-3
.FILL ENDROM 00
.BYTE 0, $FF, 00, 00
As you can see there is a loop at the INIT label where we continuously issue CMD55 and ACMD41 until the card is initialised.

You will also notice that we use the address $FE0B as mentioned previously to disable the creation of wishbone read commands if required.

In this code I have also repurposed bit 1 of $FE0B for something else. I am using this bit as a trigger for a Xilinx ILA debug core for capturing data. In the code I am setting this bit when invoking command index 3 (i.e. CMD58 or command byte $7A) for a second time.

By triggering the ILA core at this point we can inspect the OCR after initialisation to see if bit 31 has changed to a one, indicating that the initialisation was indeed successful. The signal I am inspecting with the ILA for this is the miso signal, which carries the serial data from the SD Card.

To get a better overview of what is going on, I have included a screenshot of an ILA capture on my machine for the above scenario:


The key signal here is miso. The location where the logic level initially drops from a 1 to 0 is the start of the response from the SD Card for the CMD58 command. Use the rising edge of each o_sclk as reference for each bit of data.

The first byte of data has every bit zero. This is our response byte and indicates that the SD Card is not in IDLE mode anymore. Had this byte had a value of 1, it would have indicated that the SD Card was still in IDLE mode.

The following two bits are one, meaning that both bit 31 and bit 30 are one. This indicates that the power up routine is completed and the SD Card is ready to accept read/write commands.

From the rest of the bits we can deduce that bits 15-23 are all ones, meaning that my SD Card supports all the mentioned voltage levels.
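
Reading bits off a waveform by hand is error prone, so here is a Python sketch of how the sampled miso levels could be packed into bytes. The sample stream is hypothetical: it only matches the description above, with all bits that were not mentioned assumed to be zero.

```python
def bits_to_bytes(bits):
    """Pack bits sampled on successive o_sclk rising edges into bytes,
    most significant bit first."""
    if len(bits) % 8:
        raise ValueError("need a whole number of bytes")
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)

# Hypothetical sample stream matching the description above: a zero response
# byte, then an OCR with bits 31, 30 and 15-23 set (other bits assumed zero).
sampled = [0] * 8 + [1, 1] + [0] * 6 + [1] * 8 + [1] + [0] * 7 + [0] * 8
response = bits_to_bytes(sampled)
r1, ocr = response[0], int.from_bytes(response[1:5], "big")
print(hex(r1), hex(ocr))   # 0x0 0xc0ff8000
```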

In Summary

In this post we wrote a 6502 assembly program for initialising an SD Card. We also issued some other SD Card commands to confirm that the Card has properly powered up.

In the next post we will attempt to read from the SD Card.

Until next time!
