Tuesday, 4 December 2018

Catching keystrokes from a USB keyboard


In the previous post we managed to read a couple of descriptors from a USB keyboard and identified which endpoint to use for capturing the keystrokes from the keyboard.

In this post we will develop some code for actually retrieving the keystrokes from the keyboard.

Moving to the configured state

For the majority of the previous post we lingered within the Default state. To refresh our memory of the other states of a USB device, let us look again at the following diagram:

As you can see, after the Default state there are still two states, Address and Configured, that we need to go through before we can do something useful with the USB device.

Let us start by having a look at the Address state. In this state we assign an address to our USB device so that it stops listening at the default address (i.e. 0).

To set the address, we need to make a few changes to our state_machine method:

void state_machine() {
 //bit 24 bit 18
 u32 in2 = Xil_In32(0xE0002144) | (1<<24) | (1<<18);
 Xil_Out32(0xE0002144, in2); //clear

 if (status == 0) {
  status = 1;
 } else if (status == 1) {
  status = 2;

  //set address
  Xil_Out32(0x301000, 0x00030500);
  Xil_Out32(0x301004, 0x00000000);
  schedTransfer(1,0,0x0, 0x300000);
 } else if (status == 2) {
  status = 3;



Here we are setting up a request with bRequest 5, which is SET_ADDRESS, and we are setting the device to address 3. We end by waiting 3 milliseconds just to make sure everything has settled down before we continue.
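
To make the magic number 0x00030500 a bit less mysterious, here is a minimal sketch of how the two setup words could be assembled from their fields. The helper names usb_setup_word0 and usb_setup_word1 are my own invention for illustration and are not part of the code in this series; the field order follows the standard USB setup packet layout, assuming a little-endian CPU.

```c
#include <stdint.h>

/* Standard request codes from the USB 2.0 specification. */
#define USB_REQ_SET_ADDRESS        5u
#define USB_REQ_SET_CONFIGURATION  9u

/* Hypothetical helper: first 32-bit word of a setup packet.
   Byte 0 = bmRequestType, byte 1 = bRequest, bytes 2-3 = wValue. */
uint32_t usb_setup_word0(uint8_t bmRequestType, uint8_t bRequest, uint16_t wValue)
{
    return (uint32_t)bmRequestType
         | ((uint32_t)bRequest << 8)
         | ((uint32_t)wValue << 16);
}

/* Hypothetical helper: second 32-bit word of a setup packet.
   Bytes 4-5 = wIndex, bytes 6-7 = wLength. */
uint32_t usb_setup_word1(uint16_t wIndex, uint16_t wLength)
{
    return (uint32_t)wIndex | ((uint32_t)wLength << 16);
}
```

Calling usb_setup_word0(0x00, USB_REQ_SET_ADDRESS, 3) gives the 0x00030500 written to 0x301000 above, and usb_setup_word1(0, 0) gives the zero written to 0x301004.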

Let us now configure the device. For this we implement an extra status in our if-else block:

else if (status == 3) {
  in2 = Xil_In32(0x300004) | 3;
  Xil_Out32(0x300004, in2);

  //set configuration
  Xil_Out32(0x301000, 0x00010900);
  Xil_Out32(0x301004, 0x00000000);
  schedTransfer(1,0, 0, 0x300000);
  status = 4;

You will see that we first adjust the device address in word 1 of our QH to 3, because the address was changed in the previous state.
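
OR-ing in the value 3 works here because the address field was still zero from the Default state. If the field could already hold another address, one would rather clear it first. Below is a small sketch of that idea. The function name is hypothetical, and the mask assumes that the device address occupies bits 6:0 of QH word 1, as I read it in the EHCI spec.

```c
#include <stdint.h>

#define QH_DEVICE_ADDRESS_MASK 0x7Fu  /* bits 6:0 of QH word 1 (assumed layout) */

/* Hypothetical helper: return word 1 with the device address replaced. */
uint32_t qh_set_device_address(uint32_t word1, uint8_t address)
{
    return (word1 & ~(uint32_t)QH_DEVICE_ADDRESS_MASK)
         | ((uint32_t)address & QH_DEVICE_ADDRESS_MASK);
}
```

With this helper the state above could read the word with Xil_In32(0x300004), pass it through qh_set_device_address(word, 3) and write the result back.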

Next we select the appropriate configuration. In the previous post we determined that we should select configuration number 1.

At this stage our USB device is fully configured and ready to use.

A brief pause at periodic schedules

With our USB keyboard ready to use, the obvious next step is to read the keystrokes.

My first take on reading these keystrokes was to also use an asynchronous schedule. However, I didn't have any luck at all with this approach. Things worked better for me using periodic schedules.

So, in this section let us spend some time discussing in more detail how periodic schedules work.

Firstly, let us look again at the diagram of how periodic schedules work:

From the diagram we see that everything is driven off a periodic frame list, which is indexed in part by a FrameIndex that is updated at the end of each USB frame.

A USB Frame is basically a time period of 1 millisecond.

If you look in more detail at when the FrameIndex gets updated, you will see that, strictly speaking, the FrameIndex isn't updated every millisecond, but every 1/8 millisecond. Furthermore, you will see that the bottom 3 bits of the FrameIndex are not used to index the Periodic Frame List; the index is taken from bit 3 upwards.
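
The split between the two parts of the FrameIndex can be sketched in a few lines of C. The function names are my own, and I assume that the frame list size is a power of two, which holds for all the sizes the controller supports.

```c
#include <stdint.h>

/* Which element of the Periodic Frame List is in use:
   bits 3 and upwards of the FrameIndex, wrapped to the list size. */
uint32_t frame_list_index(uint32_t frindex, uint32_t list_size)
{
    return (frindex >> 3) & (list_size - 1u);
}

/* Which 1/8 millisecond slice of the frame we are in: bottom 3 bits. */
uint32_t microframe_number(uint32_t frindex)
{
    return frindex & 0x7u;
}
```

For example, with a FrameIndex of 0x1B and a list of 16 elements, we are at element 3 of the list and in microframe 3 of that frame.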

At this point you may be wondering why the FrameIndex gets incremented every 1/8 of a millisecond if the rest of the system only works in increments of 1 millisecond.

The answer is to maintain a bit of compatibility between USB 1.1 and USB 2.0. USB 1.1 always had frames of 1 millisecond in duration. USB 2.0 introduced the concept of microframes, breaking a frame down into even smaller durations of 1/8 millisecond.

But, despite my explanation, how can you access 1/8 millisecond microframes if the index into the frame list, for all practical purposes, only changes every 1 millisecond? The key to this question lies in the lower 8 bits of word 2 in a QH.

In the EHCI spec these 8 bits are referred to as the Interrupt Schedule Mask. Every bit in this byte corresponds to a specific microframe within the frame. A one in any particular position means that the transaction will take place within that particular microframe.

If only one bit is set within the Interrupt Schedule Mask, only one transaction will execute within the frame. Similarly, if more than one bit is set, more than one transaction will trigger within the frame.
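
As a small illustration of how this mask behaves, here is a sketch with three helpers of my own naming: one to add a microframe to a mask, one to count how many transactions a mask asks for per frame, and one to drop the mask into the lower 8 bits of QH word 2.

```c
#include <stdint.h>

/* Set the bit for one microframe (0 to 7) in an Interrupt Schedule Mask. */
uint8_t smask_add_microframe(uint8_t smask, unsigned microframe)
{
    return (uint8_t)(smask | (1u << (microframe & 7u)));
}

/* Number of transactions per frame = number of bits set in the mask. */
unsigned smask_transactions_per_frame(uint8_t smask)
{
    unsigned count = 0;
    while (smask != 0) {
        count += smask & 1u;
        smask = (uint8_t)(smask >> 1);
    }
    return count;
}

/* Replace the lower 8 bits of QH word 2 with the given mask. */
uint32_t qh_word2_with_smask(uint32_t word2, uint8_t smask)
{
    return (word2 & ~0xFFu) | smask;
}
```

A mask of 0x01 means one transaction in microframe 0, whereas 0x11 would ask for one in microframe 0 and another in microframe 4.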

Let us now talk a bit about the data structures a Periodic Frame List points to. A Periodic Frame List also points to QH/qTD structures, just as an Asynchronous List does.

In fact, it is very convenient to think of each element in a Periodic Frame List as an Asynchronous list on its own. In this analogy, each element of the Periodic Frame List can be thought of as an ASYNCLISTADDR-register on its own.

There is, however, a small flaw in this analogy. In an Asynchronous schedule the ASYNCLISTADDR-register gets updated during traversal to always point to the next QH in the list. In a periodic schedule, however, each element in the Periodic Frame List always points to the first QH element in the list.

As such, within a periodic schedule a circular QH list doesn't make sense. 

Configuring the Periodic Schedule

Let us now write some code for setting up the periodic schedule.

Firstly we need to specify the number of elements in our Periodic Frame List. We want to poll once every 16 milliseconds. Since each element has a duration of 1 millisecond, it makes sense to have sixteen elements with only one of these elements pointing to a valid QH.

To set the frame list size we make use of three bits of register 0xE0002140: 15, 3 and 2. These three bits are grouped together as [15][3][2] and have the following meaning:

  • 000: List size is 1024 elements
  • 001: List size is 512 elements
  • 010: List size is 256 elements
  • 011: List size is 128 elements
  • 100: List size is 64 elements
  • 101: List size is 32 elements
  • 110: List size is 16 elements
  • 111: List size is 8 elements

From the above list we should use the value 110, which corresponds to the following code:

void setup_periodic() {
  u32 in2 = Xil_In32(0xE0002140) | (1<<15) | 8;
  Xil_Out32(0xE0002140, in2);
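
Note that simply OR-ing in the bits only gives 110 if bit 2 was zero to begin with. A slightly more defensive variation is sketched below; the helper names are hypothetical, and the list sizes are taken from the table above.

```c
#include <stdint.h>

#define USBCMD_FS_MASK ((1u << 15) | (3u << 2))  /* bits 15, 3 and 2 */

/* Spread the 3-bit value [15][3][2] over its positions in the register. */
uint32_t usbcmd_frame_list_bits(uint32_t enc)
{
    return (((enc >> 2) & 1u) << 15) | ((enc & 3u) << 2);
}

/* Clear the old size bits first, then set the new ones. */
uint32_t usbcmd_set_frame_list_size(uint32_t usbcmd, uint32_t enc)
{
    return (usbcmd & ~USBCMD_FS_MASK) | usbcmd_frame_list_bits(enc);
}

/* Number of elements for a given encoding: 000 is 1024, 111 is 8. */
uint32_t frame_list_elements(uint32_t enc)
{
    return 1024u >> (enc & 7u);
}
```

For the encoding 110 (decimal 6), usbcmd_frame_list_bits returns the same (1<<15) | 8 that is used in the snippet above.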


Our List will reside at address 0x304000, so we initialise this area and set the Periodic address base register:

void setup_periodic() {
 //mark all sixteen frame list entries as invalid
 for (int i = 0; i < 16; i++) {
  Xil_Out32(0x304000 + 4 * i, 1);
 }

 Xil_Out32(0xE0002154, 0x304000);

We start off the initialisation by setting all pointers to invalid pointers. We then set one of these pointers to a valid one:

void setup_periodic() {
 struct QStruct *qh;
 //first QH: no qTDs, only links to the second QH
 qh = (struct QStruct *)0x304040;
 qh->word0 = 0x304082;
 qh->word1 = 0;
 qh->word2 = 0;
 qh->word3 = 0;
 qh->word4 = 1;
 qh->word5 = 1;

 //second QH: carries the qTD for the keyboard endpoint
 qh = (struct QStruct *)0x304080;
 qh->word0 = 1;
 qh->word1 = 0x00085103;
 qh->word2 = 0x40000001;
 qh->word3 = 0;
 qh->word4 = 0x304100;
 qh->word5 = 1;

 struct QStruct *qTD;
 qTD = (struct QStruct *)0x304100;
 qTD->word0 = 1;
 qTD->word1 = 1;
 qTD->word2 = 0x00080180;
 qTD->word3 = 0x305000;

 //set first frame to qh
 Xil_Out32(0x304000, 0x304042);

You might find it a bit strange that we start with a QH that doesn't contain any qTDs at all, followed by a QH that does have them. I will explain the reasoning behind this a bit later on.
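
The values 1, 0x304042 and 0x304082 used in the code above are all link pointers. As a minimal sketch, this is how I understand their layout from the EHCI spec: bit 0 terminates the list, bits 2:1 give the type of the structure pointed to, and the upper bits hold a 32-byte aligned address. The names below are my own.

```c
#include <stdint.h>

/* Type field of a link pointer (bits 2:1), as listed in the EHCI spec. */
enum link_type { LINK_ITD = 0, LINK_QH = 1, LINK_SITD = 2, LINK_FSTN = 3 };

/* Bit 0 set means: nothing valid behind this pointer. */
int link_is_terminated(uint32_t link)
{
    return (int)(link & 1u);
}

enum link_type link_get_type(uint32_t link)
{
    return (enum link_type)((link >> 1) & 3u);
}

/* The structure address is 32-byte aligned, so the low 5 bits are dropped. */
uint32_t link_get_address(uint32_t link)
{
    return link & ~0x1Fu;
}

/* Build a valid link pointer to a QH at the given address. */
uint32_t link_to_qh(uint32_t address)
{
    return (address & ~0x1Fu) | ((uint32_t)LINK_QH << 1);
}
```

So 0x304042 decodes to a QH at address 0x304040, and a plain 1 is simply an invalid (terminated) pointer.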

You will also see that the NAK count reload field for the second QH is zero. You might recall that for our asynchronous schedule this was always 15. Why the difference?

To answer this question let us first look at what a NAK packet is.

When a USB host requests data from a USB device and the device doesn't have any data available, it will respond with a NAK packet. Sometimes you would like to throw an error if a certain number of NAK packets are received in a row. This is the purpose of the NAK reload field.

In our case we would just like to ignore these packets altogether, so we set the RL field to zero. In our schedule, when a NAK packet is encountered, the controller will just skip the slot and move on to the next one.
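
To see where the RL field sits, it helps to pull word 1 of the QH apart. The following is only a sketch based on my reading of the QH layout in the EHCI spec; the struct and function names are made up for this example.

```c
#include <stdint.h>

/* Fields of QH word 1 (endpoint characteristics), assumed layout. */
struct qh_endpoint_chars {
    uint8_t  device_address;   /* bits 6:0 */
    uint8_t  endpoint;         /* bits 11:8 */
    uint8_t  speed;            /* bits 13:12, 0 = full, 1 = low, 2 = high */
    uint8_t  toggle_from_qtd;  /* bit 14 */
    uint16_t max_packet;       /* bits 26:16 */
    uint8_t  nak_reload;       /* bits 31:28, the RL field */
};

struct qh_endpoint_chars qh_decode_word1(uint32_t w)
{
    struct qh_endpoint_chars c;
    c.device_address  = (uint8_t)(w & 0x7Fu);
    c.endpoint        = (uint8_t)((w >> 8) & 0xFu);
    c.speed           = (uint8_t)((w >> 12) & 0x3u);
    c.toggle_from_qtd = (uint8_t)((w >> 14) & 0x1u);
    c.max_packet      = (uint16_t)((w >> 16) & 0x7FFu);
    c.nak_reload      = (uint8_t)((w >> 28) & 0xFu);
    return c;
}
```

Decoding our value 0x00085103 gives device address 3, endpoint 1, a low speed device, a maximum packet size of 8 bytes and an RL field of zero.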

What is left to be done is to enable the periodic schedule by adding another state within state_machine:

 } else if (status == 5) {
  //enable periodic scheduling
  in2 = Xil_In32(0xE0002140) | 16;
  Xil_Out32(0xE0002140, in2);
  status = 6;

As can be seen, we schedule a wait of 10 milliseconds before we transition to the next state.

Reading the actual keystrokes

Let us now write some code for capturing keystrokes from the USB keyboard.

The basic idea is to display the keycode each time a key is pressed or released.

In the previous section we set up the periodic schedule with a qTD transfer scheduled in one slot.

We should poll this qTD data structure until the transfer is finished, which happens when bit 7 of word 2 changes to 0. We implement this functionality with an extra state:

 } else if (status == 6) {
  if (!(Xil_In32(0x304100 + 8) & 0x80)) {
   u32 word0 = Xil_In32(0x305000);
   u32 word1 = Xil_In32(0x305004);
   printf("%x %x\n",word0, word1);


We are polling the qTD data structure every 10 milliseconds. Once the transfer is finished, we will find the keystroke information in the first eight bytes at location 0x305000.
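
Word 2 of the qTD tells us a bit more than just whether the transfer is still active. Here is a sketch of a few checks one could do on it, assuming the token layout from the EHCI spec (status in bits 7:0, bytes still to transfer in bits 30:16). The helper names are hypothetical.

```c
#include <stdint.h>

#define QTD_STATUS_ACTIVE  0x80u  /* bit 7: transfer still pending */
#define QTD_STATUS_ERRORS  0x78u  /* bits 6:3: halted, buffer error, babble, transaction error */

int qtd_is_active(uint32_t token)
{
    return (token & QTD_STATUS_ACTIVE) != 0;
}

int qtd_has_error(uint32_t token)
{
    return (token & QTD_STATUS_ERRORS) != 0;
}

/* The controller counts this field down as data arrives. */
uint32_t qtd_bytes_remaining(uint32_t token)
{
    return (token >> 16) & 0x7FFFu;
}

uint32_t qtd_bytes_received(uint32_t token, uint32_t requested)
{
    return requested - qtd_bytes_remaining(token);
}
```

Our freshly created token 0x00080180 is active with 8 bytes outstanding. Once the controller has written back a token such as 0x80000100, the transfer is done and all 8 bytes have arrived.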

Once the scheduled transfer is finished, a new transfer is not automatically scheduled. It is up to you to schedule a new one.

One could probably just reset the values in the qTD to start a new transfer. Doing this, however, we may end up with a potential cache coherency issue. To explain, look at the following block diagram of the USB block:

Changes you make to QH and qTD structures are written to system memory. The DMA block within the USB controller reads these changes from system memory from time to time into its internal dual-port RAM.

One cannot tell at which stage the USB controller is reading from system memory, so half-baked qTD data structures might end up in the dual-port RAM.

The solution to this issue is to not modify these structures but to create new structures:

 } else if (status == 6) {
  u32 qTDAddress = currentTD ? 0x304100 : 0x304120;
  u32 qTDAddressCheck = currentTD ? 0x304120 : 0x304100;

  if (!(Xil_In32(qTDAddressCheck + 8) & 0x80)) {
   u32 word0 = Xil_In32(0x305000);
   u32 word1 = Xil_In32(0x305004);
   printf("%x %x\n",word0, word1);
   struct QStruct *qh;
   qh = 0x304040;
   qh->word0 = 1;

   struct QStruct *qh2;
   qh2 = currentTD ? 0x304080 : 0x3040c0;

   qh2->word0 = 1;
   qh2->word1 = 0x00085103; 
   qh2->word2 = 0x40000001;
   qh2->word3 = 0;
   qh2->word4 = qTDAddress;
   qh2->word5 = 1;

   struct QStruct *qTD;
   qTD = qTDAddress;
   qTD->word0 = 1; 
   qTD->word1 = 1; 
   qTD->word2 = 0x00080180;
   qTD->word3 = 0x305000;
   u32 temp = qh2;
   temp = temp | 2;
   qh->word0 = temp;

   currentTD = ~currentTD;


As you can see, this is where our QH that contains no qTDs comes in. Once we have created a new QH and qTD, we just change the next pointer of that first QH.

One thing to also keep in mind when a transfer is complete is to preserve the Data toggle bit and apply it to the new qTD. This is done as follows:

  u32 toggle = Xil_In32(qTDAddressCheck+8) & 0x80000000;
  if (!(Xil_In32(qTDAddressCheck + 8) & 0x80)) {
   qTD->word2 = 0x00080180 | toggle;

All the code developed so far should now be sufficient for capturing the keystrokes continuously and outputting them to the console.

The meaning of USB keycodes

As mentioned in the previous section, each key press will result in 8 bytes being populated at address 0x305000. However, only the last 6 bytes are significant to us.

Each of these six bytes represents the keycode of a key that is currently pressed. This means that up to 6 keys can be pressed simultaneously.
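
As a rough sketch, the six keycode bytes can be picked out of the two words we print as follows. This assumes the standard boot keyboard report layout (byte 0 holds the modifier keys, byte 1 is reserved, bytes 2 to 7 hold keycodes) and that the words were read on a little-endian CPU, so that byte 0 ends up in the lowest bits of word0. A slot with the value zero means no key. The function names are my own.

```c
#include <stdint.h>
#include <stddef.h>

#define REPORT_KEY_SLOTS 6

/* Byte 0 of the report: bit flags for Ctrl, Shift, Alt and so on. */
uint8_t report_modifiers(uint32_t word0)
{
    return (uint8_t)(word0 & 0xFFu);
}

/* Copy the non-zero keycodes into keys[] and return how many there are. */
size_t report_pressed_keys(uint32_t word0, uint32_t word1,
                           uint8_t keys[REPORT_KEY_SLOTS])
{
    uint8_t raw[REPORT_KEY_SLOTS];
    size_t count = 0;

    raw[0] = (uint8_t)(word0 >> 16);  /* report byte 2 */
    raw[1] = (uint8_t)(word0 >> 24);  /* report byte 3 */
    raw[2] = (uint8_t)(word1);        /* report byte 4 */
    raw[3] = (uint8_t)(word1 >> 8);   /* report byte 5 */
    raw[4] = (uint8_t)(word1 >> 16);  /* report byte 6 */
    raw[5] = (uint8_t)(word1 >> 24);  /* report byte 7 */

    for (size_t i = 0; i < REPORT_KEY_SLOTS; i++) {
        if (raw[i] != 0) {
            keys[count++] = raw[i];
        }
    }
    return count;
}
```

If only one key with keycode 4 is held down, the printed words would be 40000 and 0, and report_pressed_keys returns 1 with keys[0] equal to 4.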

USB key scan codes are a bit different from your conventional PS/2 codes in that they are more predictable. For instance, have a look at the USB scan codes for the first couple of letters of the alphabet:

  • Key A: scancode 4
  • Key B: scancode 5
  • Key C: scancode 6
  • Key D: scancode 7
  • Key E: scancode 8
  • Key F: scancode 9
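
Because the letters follow each other in alphabetical order starting at scancode 4, translating a letter key is just a bit of arithmetic. A minimal sketch, with a function name of my own choosing, that only handles the letter keys (scancodes 4 to 29) and ignores everything else:

```c
#include <stdint.h>

/* Map a USB keyboard scancode to an upper case letter.
   Returns '\0' for scancodes that are not letter keys. */
char keycode_to_letter(uint8_t keycode)
{
    if (keycode >= 4 && keycode <= 29) {
        return (char)('A' + (keycode - 4));
    }
    return '\0';
}
```

So scancode 4 gives 'A' and scancode 29 gives 'Z'. Digits, punctuation and the modifier keys would need a lookup table of their own.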

In summary

In this post we implemented some code for catching keystrokes from the USB keyboard.

In the next post we will integrate the USB keyboard with our C64 module.

Till next time!
