Monday, 15 June 2020

Gearing up to the newest version of Vivado

Foreword

When I was developing a C64 to run on an FPGA on the Zybo board, I exclusively used Vivado 2017.1.

However, during the course of this blog series, I got feedback from a couple of readers saying that they could not build the source code from this project on later versions of Vivado.

So, in this post I thought it would be rather interesting to see how the source code of this project behaves on the latest version of Vivado.

What is the latest version of Vivado?

I wanted to get hold of the latest version of Vivado so I visited Xilinx's website.

On their website I found that the latest version was 2020.1, but it had no free version.

For the sake of clarity, free versions of Vivado usually have the word WebPack in their name, and you only get a subset of the functionality provided by a full version of Vivado.

Eventually I found that the latest free version of Vivado WebPack is 2019.1. This is the version we will be using in this post.

Issues experienced

When I opened the project in Vivado 2019.1, Vivado did quite a good job of upgrading the IP blocks used in my design to the latest version.

This didn't go without any issues, though.

For a start, I got a number of issues with the two Clock wizard blocks in my design. One of them was that a block that used to output two clock frequencies ended up outputting only one frequency in Vivado 2019.1.

In the end I just deleted the two Clock wizard blocks, created new ones and wired them up again.

Another minor issue was that the Zynq block didn't have I2C port 1 enabled, which I use to enable sound on the Zybo board. For this I had to enable the I2C port again and wire up its pins.

One final issue I had was when generating the Bitstream. The IDE complained about one of the AXI ports not being connected, due to optimisation.

I remember I had a similar issue right at the beginning of the series, when I was playing with AXI ports. At that time the issue was with port AXI_ARID. The way I fixed it was just to wire AXI_ARID to a logic zero.

In Vivado 2019.1, the issue was with the port AXI_AWID. These are two very similar ports: the former specifies the ID for an AXI read transaction, and the latter the ID for a write transaction. It is just funny that Vivado 2017.1 never complained about the AXI_AWID port at the time.

Anyway, I fixed the AXI_AWID error by also wiring it to a logic 0.

Building the source in Vivado 2019.1

I have made the changes available on my GitHub repo for this project, which will enable you to build the project with Vivado 2019.1. For the record, here is the link again to the project repo on GitHub:

https://github.com/ovalcode/c64fpga

I wanted to retain the code that enables one to build the source on Vivado 2017.1, so I made the 2019.1 changes available in a branch. So, just a summary of the branches:

  • Master: Will build on Vivado 2017.1
  • Branch v2019.1: This branch will build on Vivado 2019.1.

When building the code on Vivado 2019.1, you might experience some minor issues. So, in this section I will discuss these issues, together with how to get around them.

Before running Synthesis, it is advisable to create a new HDL wrapper. While doing this, Vivado will complain that some IPs are locked, and it will suggest running Check IP Report Status.

So, under the Reports menu select IP Report Status. From the tab that opens up, select Upgrade selected. A couple of processes will then start that generate all the missing sources, after which you should be able to synthesise and generate the bitstream.

In Summary

In this post I discussed how the source code of our C64 FPGA implementation currently behaves in Vivado 2019.1 and the changes required to make it work.

Till next time!

Wednesday, 29 April 2020

Initialising the Sound System

Foreword

In the previous post we added joystick support to our C64 FPGA from the Linux operating system. In addition we expanded the system to be able to use either Joystick Port #1 or Joystick Port #2.

In this post I will show how to initialise the sound chip on the Zybo board from Linux, so that we can hear the sounds from the SID module.

Speaking of the sound system: a couple of posts ago I mentioned that I had updated my Github repo with the recent changes to our C64 core module, excluding the sound system.

I am pleased to announce that my Github repo now also contains the addition of the sound system.

Rendering SID samples to a speaker

Some time ago, I wrote two posts, here and here, where I explained how to incorporate Thomas Kindler's SID core into our C64 design.

My discussion in these two posts basically stopped at the point where the SID samples were serialised over the I2S bus. I haven't explained at all the supporting processes that need to happen so that the sound finally gets rendered on a speaker.

Having an overview of these processes is necessary to understand the steps required for initialising the Sound System. So, let's get started!

Sound is produced and captured on the Zybo board by means of the SSM2603 chip from Analog Devices.

This chip receives sound samples via the I2S bus and is configured via an I2C bus. As previously mentioned, we have already implemented the I2S bus for sending audio samples.

We haven't, however, discussed the implementation to interface with the I2C bus. Luckily one doesn't need to implement an I2C module from scratch, since the Zynq contains two on-chip I2C peripherals.

The question is, however, how do we use an on-chip I2C peripheral in our design? In the next section I will briefly highlight what is required to use one of these peripherals to initialise the SSM2603.

Using an on-chip I2C peripheral

To surface an I2C peripheral in our design, we need to configure our Zynq block. So, start by double-clicking on the Zynq Block, selecting the MIO Configuration section and opening I/O peripherals.

There are a couple of things you should do here. First you need to select the option I2C 1. With this option selected, a dropdown will appear next to it in the IO column.

In this dropdown you need to select the external pins on the Zynq to which this I2C peripheral should be attached. The first couple of options are MIO pins. If we select any of these we will not see the I2C pins in our block design at all. The only option that will surface the pins in our design is EMIO:



With this option selected the I2C pins will now appear within our design in the Zynq block:

You will see that I have already linked up these pins to the rest of the diagram. The iobuf block is a custom block I have created to make a pin act as a bidirectional pin. The direction of this pin is controlled by the tristate input.

This is all there is to wiring up these pins. Next, let us see what software changes are required to drive these pins and ultimately initialise the SSM2603.

Initialising SSM2603 from Linux

When I originally developed the SID functionality in the C64 core, I initialised the SSM2603 in a Bare-Metal application. Here is the source for the Bare-Metal application: https://github.com/ovalcode/c64fpga/blob/master/SDK.src/standalone/c64.c

There is quite a bit going on in this application and apart from initialising the SSM2603, we are also reading from a USB keyboard, as explained in a previous post. The key method to look for is init_sound().

If you follow the code in init_sound(), you will see that we are directly writing to the registers associated with the I2C 1 controller. Since we are working in Linux at the moment, it is probably better practice to see if one can open I2C 1 as a device file and then manipulate the SSM2603 through this file.

In approaching this problem of accessing I2C 1 as a device file, I found myself fiddling with the device tree to try and enable the I2C driver. This ended up being quite a mission, so I reverted, at least for now, to copying the register writes/reads from the standalone application and modifying them a bit so that they work in Linux. We will also make the whole SSM2603 init sequence part of the Kernel driver we have been developing in the last couple of posts.

As we have seen in previous posts, when you want to access physical memory locations in Linux, you first need to map it to a virtual address space. So, let us do it for the I2C 1 registers:

static void __iomem *i2c_reg;

static int __init ebbchar_init(void){
...
c64_reg = ioremap(0x43c00000,
           16000);
c64_reg = c64_reg;
c64_reg_screen_mode = c64_reg + 8;
c64_reg_keyboard_0 = c64_reg;
c64_reg_keyboard_1 = c64_reg + 4;
tape_mem_area =  ioremap(0x1f500000,
           2000000);

i2c_reg = ioremap(0xE0005000, 128);
...
}

The address returned by ioremap is an address in virtual address space. From this point onward, when you want to access one of the I2C registers, you need to use i2c_reg as your base and then add your register offset.

So, for instance, if you want to write the value 0x1f to the register 0x1c, you will do the following:

iowrite32(0x1f, i2c_reg+0x1c);

Similarly, if you want to read a register, you will do something like the following:

status = ioread32(i2c_reg+0x10) & 1;

You will see that in the standalone code we are using two different functions for reading from and writing to a register: Xil_In32 and Xil_Out32. So when using this code in Linux, remember to convert them to ioread32 and iowrite32 respectively. Also, remember that the parameter order of iowrite32 is different from that of Xil_Out32: in Linux it is the value followed by the address, whereas Xil_Out32 starts with the address, followed by the value.
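
The swapped parameter order is easy to get wrong when porting, so here is a minimal user-space sketch of one way to guard against it. Everything in it is a stand-in I made up for illustration: model_iowrite32 and model_ioread32 only imitate the argument order of the kernel calls, xil_style_out32 is a hypothetical adapter, and fake_i2c_regs is a plain array standing in for the mapped register window.

```c
#include <stdint.h>

/* Stand-in for the kernel's iowrite32(): value first, then address.
   Hypothetical user-space model so the sketch runs without hardware. */
void model_iowrite32(uint32_t value, volatile uint32_t *addr)
{
    *addr = value;
}

/* Stand-in for the kernel's ioread32(). */
uint32_t model_ioread32(const volatile uint32_t *addr)
{
    return *addr;
}

/* Hypothetical adapter with the Xil_Out32 ordering: address first, then value. */
void xil_style_out32(volatile uint32_t *addr, uint32_t value)
{
    model_iowrite32(value, addr);
}

/* Fake 128-byte register window standing in for the ioremap'ed I2C 1 block. */
uint32_t fake_i2c_regs[128 / 4];

/* Write 0x1f to the register at byte offset 0x1c and read it back. */
uint32_t demo_write_then_read(void)
{
    xil_style_out32(&fake_i2c_regs[0x1c / 4], 0x1f);
    return model_ioread32(&fake_i2c_regs[0x1c / 4]);
}
```

With an adapter like this, the register writes copied from the standalone code can keep their original argument order while the actual write goes through the Linux-style call.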

If you want to find the final source for the Linux Kernel driver, just go here: https://github.com/ovalcode/c64fpga/tree/master/SDK.src/linux

Another subtle difference between the standalone code and the Linux code is the function we use for introducing a delay. In the standalone code I use usleep, where the value is in microseconds, whereas in Linux I am using msleep, where the value is in milliseconds.

This is about all the changes required to initialise the SSM2603 from Linux.

When equivalent isn't exactly equivalent

In the previous section I basically copied and pasted the SSM2603 initialisation code into our Linux Kernel driver, with minor changes. So, in theory, this code should just have worked in our Linux driver.

However, when I tried this code for the first time, the SSM2603 didn't initialise at all.

After some investigation, I found that the driver got stuck in the following loop within the method writeReg:

         do {
          status = ioread32(i2c_reg+0x10) & 1;
         } while (!status);

This loop checks one of the status bits of I2C 1, which is set as soon as the I2C peripheral has shifted out a chunk of information.

This really confused me, because the equivalent code in the Bare-Metal application worked perfectly. I tried to think of a couple of possibilities that could be causing this.

Firstly I wanted to know if the pins from I2C really got routed to EMIO instead of one or other MIO pin.

So, armed with the XSCT console, I inspected the memory locations for MIO routing while the Bare-Metal application was running, and compared them with the same registers while Linux was running.

Maybe I should just take a step back and give some background to the checks I was doing.

The MIO registers provide you with some options to route the I2C pins to different external pins. As an example, please have a look at page 1643 of the Zynq Technical Reference manual:


This register gives you the option to route the Serial Clock pin of I2C 1 to pin 12. Similar registers exist, like 0xF8000734 and 0xF8000740, which allow you to route I2C pins to other MIO pins.
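
When inspecting these registers from the XSCT console it helps to know which address belongs to which pin. The sketch below assumes that the MIO pin registers sit at consecutive 32-bit words starting at 0xF8000700. That base is my assumption, inferred from the two addresses quoted above, so please check it against the TRM. The name mio_pin_reg_addr is made up for illustration.

```c
#include <stdint.h>

/* Assumed address of the register for MIO pin 0; verify against the TRM. */
#define MIO_PIN_BASE UINT32_C(0xF8000700)

/* One 32-bit register per MIO pin, so pin n sits 4*n bytes above the base. */
uint32_t mio_pin_reg_addr(unsigned int pin)
{
    return MIO_PIN_BASE + UINT32_C(4) * pin;
}
```

Under this assumption pin 12 maps to 0xF8000730, and the two addresses mentioned above belong to pins 13 and 16.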

When I inspected these registers while running the Bare-Metal application, I found that no MIO pin was configured for surfacing any I2C 1 pin. This is exactly what I expected.

When inspecting the same registers when running Linux, I got the same result.

This inspection led to a bit of a dead-end, but there is still a question remaining: how do we enable pins to get routed through EMIO?

At first sight I couldn't find any information about this in the Zynq TRM. However, the following diagram on page 49 provided some subtle information for me:


As seen, all peripherals go into the MIO, which performs the necessary multiplexing as described earlier.

Some of these peripherals are also connected to the EMIO. From the diagram it looks like the connections to EMIO are direct, so in theory these pins should always be available to the FPGA.

Once again I reached a dead-end. What else could be the reason why the SSM2603 doesn't initialise in Linux?

I did a further search in the Zynq TRM PDF to see if I could locate any other registers that are related to the I2C peripheral, and eventually I came across the register APER_CLK_CTRL on page 1586. Specifically, the following phrase caught my eye:

"Please note that these clocks must be enabled if you want to read from the peripheral register space."

This could indeed be the problem: we are trying to read the status, but we are not getting any sensible data back, because the AMBA clock is not enabled for I2C 1. For clocking to be enabled for I2C 1, bit 19 of the register APER_CLK_CTRL should be set.

I could confirm that this bit was set when the Bare-Metal application was running and, to my relief, I could confirm that the bit wasn't set when running Linux!

This is luckily an easy fix!

...
static void __iomem *clk_reg;
...
static int __init ebbchar_init(void){
...
clk_reg = ioremap(0xF8000000, 0x200);
...
iowrite32(0x01EC044D, clk_reg+0x12c);
msleep(2);
init_sound();
...
}
...

I introduced a small delay after setting the bit in the register APER_CLK_CTRL.
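
The write above replaces the whole of APER_CLK_CTRL with a fixed value. A gentler alternative, sketched here as plain C that can be tried outside the kernel, is to compute the new value from the current one so that only bit 19 changes. The function names are made up for illustration, and the starting value 0x01E4044D used in the note below is just a hypothetical register content with bit 19 cleared.

```c
#include <stdint.h>

#define APER_CLK_CTRL_I2C1_BIT 19u  /* bit number as quoted from the TRM above */

/* Return the register value with the I2C 1 AMBA clock enabled,
   leaving every other clock-enable bit as it was. */
uint32_t aper_clk_with_i2c1(uint32_t current)
{
    return current | (UINT32_C(1) << APER_CLK_CTRL_I2C1_BIT);
}

/* Report whether the I2C 1 clock bit is set in a register value. */
int aper_clk_i2c1_enabled(uint32_t value)
{
    return (int)((value >> APER_CLK_CTRL_I2C1_BIT) & 1u);
}
```

Feeding in 0x01E4044D gives back 0x01EC044D, the same value the driver writes. In the driver this would translate to reading clk_reg+0x12c with ioread32, passing the result through aper_clk_with_i2c1 and writing it back with iowrite32.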

After the fix, the SSM2603 initialisation worked in Linux. One can quickly detect that the initialisation is working by hearing the speaker attached to the Zybo board turning on.
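
In hindsight, the busy-wait loop in writeReg would have been easier to debug if it had given up after a while instead of hanging the driver. Below is a minimal sketch of such a bounded wait, written as ordinary C so it can be tried on a PC. The names status_reader, wait_for_status_bit and countdown_reader are hypothetical, and in the driver the reader callback would simply wrap ioread32(i2c_reg+0x10).

```c
#include <stdint.h>

/* Callback that returns the current status word (hypothetical helper type). */
typedef uint32_t (*status_reader)(void *ctx);

/* Poll bit 0 of the status word at most max_polls times.
   Returns 0 once the bit is set, or -1 if it was never seen set. */
int wait_for_status_bit(status_reader read_status, void *ctx,
                        unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (read_status(ctx) & 1u)
            return 0;
    }
    return -1;
}

/* Demo reader: reports "done" once the countdown in ctx has reached zero. */
uint32_t countdown_reader(void *ctx)
{
    unsigned int *remaining = ctx;
    if (*remaining == 0)
        return 1u;
    (*remaining)--;
    return 0u;
}
```

With a return code like this, the init routine could log an error and bail out when the peripheral never responds, which is exactly the situation we had with the disabled clock.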

In Summary

In this post we looked into how to initialise the SSM2603 for sound in Linux.

This set of blog posts on how to create a C64 on a Zybo is quickly coming to an end, since I have achieved more or less all my goals with it.

That being said, there is so much more you can use the Zybo board for, and I want to write blog posts about some of these uses as well.

I also want to write some posts about using the C64 outside of an FPGA context, like writing a Chess Engine.

So, in coming posts I will introduce some variety that might not be related to implementing a C64 on an FPGA.

I might, however, revisit the topic of a C64 on an FPGA from time to time, for instance by implementing a 1541 running in parallel with the C64 in the FPGA.

Till next time!

Tuesday, 14 April 2020

Mapping joystick bits from Linux

Foreword

Previously we managed to load a tape image from the Linux File system, and got our C64 core to load it.

In this post we will add some more key mappings for a joystick, so we can play the game that loads.

Making our C64 design accept two joystick inputs

Currently our C64 FPGA design has only implemented Joystick Port #2. Having implemented functionality to rapidly switch between different game tape images, it actually becomes more of a requirement to implement Joystick Port #1 as well.

So, let us start with this requirement by looking into the C64 FPGA design.

Our starting point is our IP block that contains both a Master and a Slave AXI port. We need to modify this block so that it gets an additional output port for Joystick Port #1:


To wire up this port, we need to make the following adjustments to the user logic of our AXI Slave block:

 // Add user logic here
    assign slave_reg_0 = slv_reg0;
    assign slave_reg_1 = slv_reg1;
    assign restart = slv_reg2[1];
    assign tape_button = slv_reg2[2];
    assign joybits = slv_reg2[8:4];
    assign c64_mode = slv_reg2[9];
    assign joybits2 = slv_reg2[14:10];

 // User logic ends

The bits of the two joystick ports are not consecutive to each other, due to the c64_mode in between. This will be a source of interesting bit manipulation in our driver, as will be seen later.
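
To make the layout of this register a bit more tangible from the software side, here is a small sketch that picks the individual fields out of a 32-bit value. The bit positions come straight from the user logic above, while the macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Field positions taken from the user logic above. */
#define RESTART_BIT      1u
#define TAPE_BUTTON_BIT  2u
#define JOYBITS_SHIFT    4u    /* slv_reg2[8:4]   */
#define C64_MODE_BIT     9u
#define JOYBITS2_SHIFT   10u   /* slv_reg2[14:10] */
#define JOY_FIELD_MASK   0x1Fu /* each joystick field is five bits wide */

uint32_t reg_restart(uint32_t reg)     { return (reg >> RESTART_BIT) & 1u; }
uint32_t reg_tape_button(uint32_t reg) { return (reg >> TAPE_BUTTON_BIT) & 1u; }
uint32_t reg_joybits(uint32_t reg)     { return (reg >> JOYBITS_SHIFT) & JOY_FIELD_MASK; }
uint32_t reg_c64_mode(uint32_t reg)    { return (reg >> C64_MODE_BIT) & 1u; }
uint32_t reg_joybits2(uint32_t reg)    { return (reg >> JOYBITS2_SHIFT) & JOY_FIELD_MASK; }
```

For example, the value 0x200 has only c64_mode set, and both joystick fields read back as zero.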

On our C64 core IP, we need to add an extra input port to accept this extra Joystick input. In our C64 core we supply the bits of the second joystick port to port B of CIA#1.

Port B of CIA#1, however, is also used as an input from our keyboard. Thus we need to merge the lower five bits of the keyboard input with the bits from our second joystick port.

We do this as follows:

...
    assign key_joy_merged = ~(~keyboard_result[4:0] | ~joybits2);
...     
    cia cia_1(
          .port_a_out(keyboard_control),
          .port_a_in({3'b111, joybits}),
          .port_b_in({keyboard_result[7:5],key_joy_merged}),
          .addr(addr[3:0]),
          .we(we & io_enabled & (addr[15:8] == 8'hdc)),
          .clk(clk_1_mhz),
          .chip_select(addr[15:8] == 8'hdc & io_enabled),
          .data_in(ram_in),
          .data_out(cia_1_data_out),
          .flag1(flag1 & !flag1_delayed),
          .irq(irq)
            );
...

These are all the changes we require for the second joystick port in our FPGA design.
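
The key_joy_merged expression might look a bit cryptic at first. Since both the keyboard and joystick lines treat a 0 as pressed, the merged line should be 0 whenever either source pulls it low, which boils down to a bitwise AND. The C sketch below mirrors the Verilog expression on five bits so the equivalence can be checked on a PC. The function names are mine.

```c
#include <stdint.h>

/* Same expression as the Verilog above, limited to five bits. */
uint8_t merge_key_joy(uint8_t keyboard_low5, uint8_t joybits2)
{
    return (uint8_t)(~(~keyboard_low5 | ~joybits2) & 0x1F);
}

/* The simplified form that De Morgan's law gives us. */
uint8_t merge_key_joy_and(uint8_t keyboard_low5, uint8_t joybits2)
{
    return (uint8_t)(keyboard_low5 & joybits2 & 0x1F);
}
```

So a joystick direction that pulls bit 0 low shows up on port B in exactly the same way as a key press on that bit would.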

Changes to the driver

Let us see what changes are required to our driver to accommodate two joystick ports.

Firstly we need to modify our record structure for input events as follows:

struct keyboard 
{
           u32 word1;
           u32 word2;
           u32 joybits;
};


Joybits will store the bits for both joystick ports.

Next, I will make a couple of changes to our write method:

static ssize_t dev_write(struct file *filep, const char __user *keys, size_t len, loff_t *offset){
   struct keyboard temp;
   unsigned int tempjoy, joy_high, joy_low, screenread;

   /* Bail out if the user-space buffer cannot be read. */
   if (copy_from_user(&temp, keys, sizeof(temp)))
      return -EFAULT;
   iowrite32(temp.word1, c64_reg_keyboard_0);
   iowrite32(temp.word2, c64_reg_keyboard_1);
   /* Keep the ten joystick bits and invert them: the core sees a 0 as pressed. */
   tempjoy = ~temp.joybits & 0x3ff;
   /* The upper five bits move up by one to skip the c64_mode bit. */
   joy_high = (tempjoy << 1) & 0x7c0;
   joy_low = tempjoy & 0x1f;
   joy_high = (joy_high | joy_low) << 4;
   /* Clear bits 8:4 and 14:10, then merge in the new joystick state. */
   screenread = ioread32(c64_reg_screen_mode) & 0xffff820f;
   screenread = screenread | joy_high;
   iowrite32(screenread, c64_reg_screen_mode);

   return sizeof(temp);
}


So, we isolate the bits of both ports and shift them to the correct position. We also need to shift the data of the two joystick ports one bit apart, because they are separated by the c64_mode bit.
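
Since this bit shuffling is easy to get wrong, it can be handy to pull it out into a pure function and try it on a PC before loading the driver. The following is a sketch of the same steps as in the write method. The names pack_joy_field and merge_joy_into_reg are made up for illustration.

```c
#include <stdint.h>

/* Turn the ten joystick bits (1 = pressed) into the register field layout:
   lower port in bits 8:4, upper port in bits 14:10, with bit 9 left free
   for c64_mode. The bits are inverted because the core sees 0 as pressed. */
uint32_t pack_joy_field(uint32_t joybits)
{
    uint32_t active_low = ~joybits & 0x3FFu;
    uint32_t upper_port = (active_low << 1) & 0x7C0u; /* skip the c64_mode bit */
    uint32_t lower_port = active_low & 0x1Fu;
    return (upper_port | lower_port) << 4;
}

/* Replace only the joystick fields of a register value. */
uint32_t merge_joy_into_reg(uint32_t reg, uint32_t joybits)
{
    return (reg & 0xFFFF820Fu) | pack_joy_field(joybits);
}
```

With nothing pressed, pack_joy_field(0) gives 0x7DF0, so all ten joystick lines are high. Pressing bit 0 of the lower port yields 0x7DE0, and pressing bit 0 of the upper port yields 0x79F0.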

Changes to the application

In our application, we make use of a file called sym.txt to indicate the resulting c64 scancode for each mapped key.

The real question here is: How do we indicate that a set of mapped keys is purposed for a Joystick? For this purpose, I am going to use the value -3:

84 -3 0 0                                            
17 -3 1 0                                              
87 -3 2 0                                              
89 -3 3 0                                              
19 -3 4 0                                              

The third value indicates the relevant bit that should be set for the joystick.

You might also remember that we use the file sym.txt to populate an array called key_map with C64 scancodes, where the row and column values are reduced to a single scancode value in the range 0 to 63.

For the joystick, we can just extend the range, so scancode 64 can be joystick bit 0, 65 joystick bit 1 and so on.
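
As a quick illustration of this numbering, the sketch below turns a single line of sym.txt into the extended scancode. It is a simplified stand-in rather than the real init_table: it ignores the fourth value and handles only one line at a time. The function name and the two macros are hypothetical.

```c
#include <stdio.h>

#define JOYSTICK_MARKER (-3) /* second value on a joystick line */
#define JOYSTICK_BASE   64   /* first scancode past the keyboard range */

/* Parse one sym.txt line and return the resulting scancode,
   or -1 if the line does not contain four numbers. */
int sym_line_to_scan_code(const char *line)
{
    int pc_code, row, col, flags;

    if (sscanf(line, "%d %d %d %d", &pc_code, &row, &col, &flags) != 4)
        return -1;
    if (row == JOYSTICK_MARKER)
        return JOYSTICK_BASE + col;  /* col holds the joystick bit */
    return (row << 3) | col;         /* ordinary key: row and column */
}
```

So the line "84 -3 0 0" ends up as scancode 64 and "19 -3 4 0" as scancode 68.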

With this in mind, we can change the method init_table as follows:

void init_table() {
  for (int i = 0; i < 256; i++) {
    key_map[i][0] = -1;
    key_map[i][1] = -1;
    key_map[i][2] = -1;
    key_map[i][3] = -1;
  }
  FILE *fp;
  fp = fopen("sym.txt", "r");
  char input[80];
  int num1, num2, num3, num4;
  while (1) {
    int status = fscanf(fp,"%d", &num1);
    fscanf(fp,"%d", &num2);
    fscanf(fp,"%d", &num3);
    fscanf(fp,"%d", &num4);
    fscanf(fp,"%[^\n]",input);
    if (status == EOF)
      break;
    if (num2 == -3) {
      key_map[num1][0] = num3 + 64;
      key_map[num1][1] = 0;
    } else if (num4 == 8) {
      key_map[num1][0] = (num2 << 3) | num3;
      key_map[num1][1] = 0;
      key_map[num1][2] = (num2 << 3) | num3;
      key_map[num1][3] = 1;
    } else if (num4 == 1) {
      key_map[num1][0] = (num2 << 3) | num3;
      key_map[num1][1] = 1;
    } else if (num4 & 32) {
      key_map[num1][0] = (num2 << 3) | num3;
      key_map[num1][1] = num4 & 1;
      fscanf(fp,"%d",&num1);
      fscanf(fp,"%d",&num2);
      fscanf(fp,"%d",&num3);
      fscanf(fp,"%d",&num4);
      fscanf(fp,"%[^\n]",input);
      key_map[num1][2] = (num2 << 3) | num3;
      key_map[num1][3] = num4 & 1;
    }
  }
}


We handle these codes in the key processing loop as follows:

    for (int i = 0; i < num_keys_to_process; i++) {
      int offset = shifted << 1;
      if (keys_to_process[i] == 0x18) {
        ioctl(fd,3);
        continue;
      }
      int c64_scan_code = key_map[keys_to_process[i] & 0xff][offset];
      if (c64_scan_code == -1)
        continue;
      if (c64_scan_code > 63) {
        keyToProcess.joybits = keyToProcess.joybits | (1 << (c64_scan_code - 64));
      } else if (c64_scan_code < 32) {
        keyToProcess.word1 = keyToProcess.word1 | (1 << c64_scan_code);
      } else {
        c64_scan_code = c64_scan_code - 32;        
        keyToProcess.word2 = keyToProcess.word2 | (1 << c64_scan_code);
      }
      if (key_map[keys_to_process[i] & 0xff][offset+1]) {
        keyToProcess.word1 = keyToProcess.word1 | (1<<15);
      }
    }

How do we switch between different joystick ports? A very simple mechanism would be to specify an extra parameter on the commandline. We can then just test for the number of arguments:

    for (int i = 0; i < num_keys_to_process; i++) {
      int offset = shifted << 1;
      if (keys_to_process[i] == 0x18) {
        ioctl(fd,3);
        continue;
      }
      int c64_scan_code = key_map[keys_to_process[i] & 0xff][offset];
      if (c64_scan_code == -1)
        continue;
      if (c64_scan_code > 63) {
        if (argc == 3)
          c64_scan_code = c64_scan_code + 5;
        keyToProcess.joybits = keyToProcess.joybits | (1 << (c64_scan_code - 64));
      } else if (c64_scan_code < 32) {
        keyToProcess.word1 = keyToProcess.word1 | (1 << c64_scan_code);
      } else {
        c64_scan_code = c64_scan_code - 32;        
        keyToProcess.word2 = keyToProcess.word2 | (1 << c64_scan_code);
      }
      if (key_map[keys_to_process[i] & 0xff][offset+1]) {
        keyToProcess.word1 = keyToProcess.word1 | (1<<15);
      }
    }

This concludes the code changes we need to make to implement both joystick ports.
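
To see the three ranges of scancodes in action without the rest of the application, here is a small self-contained sketch of the core of the loop. The struct mirrors the one used by the driver, but with fixed-width types from stdint.h. The function press_scan_code and its use_upper_port flag are made up for illustration, with the flag playing the role of the argc == 3 test.

```c
#include <stdint.h>

struct keyboard {
    uint32_t word1;
    uint32_t word2;
    uint32_t joybits;
};

/* Set the bit belonging to one scancode. Codes 0..31 land in word1,
   32..63 in word2, and 64 upward in joybits. When use_upper_port is
   non-zero, joystick bits are moved up by five to the other port. */
void press_scan_code(struct keyboard *k, int code, int use_upper_port)
{
    if (code < 0)
        return;                       /* unmapped key */
    if (code > 63) {
        int bit = code - 64 + (use_upper_port ? 5 : 0);
        k->joybits |= UINT32_C(1) << bit;
    } else if (code < 32) {
        k->word1 |= UINT32_C(1) << code;
    } else {
        k->word2 |= UINT32_C(1) << (code - 32);
    }
}
```

Scancode 64 sets bit 0 of joybits, and the same scancode sets bit 5 when the upper port is selected.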

In Summary

In this post we have implemented the necessary code changes for mapping keys to joystick ports in Linux.

In the next post we will write some extra code for our driver to initialise the sound system, so we can hear sound generated by our C64 core!

Till next time!

Friday, 10 April 2020

Blog Series Update: Source on Github Repo

Foreword

Good day all! In this post, I will not be doing a technical discussion, but rather a bit of an update on this Blog Series.

From almost the beginning of this Blog series, I maintained a Github Repository of the source code presented throughout the series.

I must confess that I haven't updated this repo with source code presented for quite a long time (more than a year!).

The main reason for this is that when working with Block Designs in Vivado, you end up with quite a lot of auto-generated files during Synthesis, which can differ quite a bit from one synthesis run to the next.

This just made the whole exercise of maintaining changes on the Github repo quite a nightmare.

What made matters worse was that when you cloned the repo and tried to synthesise the design, you always ended up with some errors on the AXI blocks. Eventually you just ended up deleting these AXI blocks and re-adding them to the block design, followed by wiring them up to the rest of the design.

In recent months I received quite a number of queries asking if I could update the source on my Github Repo.

So, in this lockdown period I am having in my country, and I believe in many other countries around the world, I set out to update the Github Repo for this Blog series.

I also gave some thought to how to simplify the build process, which I will explain in this post.

Gearing up

Let us start this post by just giving the address again for the Github Repo that hosts the source code for this Blog Series:

https://github.com/ovalcode/c64fpga

This Repo also contains a readme telling you how to get the source code and how to build the project.

In this Readme I also explain the issue with the AXI blocks that I mentioned in the Foreword.

It is here that I found a way to improve the build process a bit. You still start by running the tcl file to generate the project files, after which you open the generated project.

Next you start the Synthesis of the project. As before, you will see quite a number of errors appearing, but just wait till the whole process completes.

Afterwards your IDE will look something like the following:


Notice that at this point I have the Design Runs tab open, showing all the Out-of-Context Runs that have failed.

This window showing the errors will also give you the solution for fixing all these issues.

Now, right click on the first error node and select Open Elaborated Design. The IDE will be busy for a minute or two, after which this node will change from an error icon to a check icon.

Next, one needs to follow the same set of steps for the remaining items. There are about 13 of these items, so it will feel a bit cumbersome and take about half an hour.

However, this only needs to be done once after a clone.

After you have finished doing the Open Elaborated Design on all items, Synthesis and Generate Bitstream should work without an issue.

What the source code contains

At this point the Github Repo contains all the necessary sources for generating the FPGA bitstream used in the previous set of posts, which gives you a VGA output of the Linux console with the option of switching to the C64 screen.

If you would rather go directly to the C64 screen on power up, you can add a Constant block to the block design, supplying the value '1' to the C64_screen_mode port on the VGA block.

Currently, this Github repo doesn't provide the source code that generates the necessary ARM code for booting into Linux or a Bare-Metal design. In the future, however, I might add the necessary source code for this to the repo.

What is also missing from the source code is the IP block that sends the generated SID audio to the sound chip on the Zybo board.

In Summary

In this post I have given an update on the source code for this Blog series on the Github Repo.

See you in the next post, and stay safe!
 

Thursday, 2 April 2020

Loading tape images from Linux file system

Foreword

In the previous post we wrote some code for mapping PC scancodes to C64 scancodes, and sending the result to our C64 FPGA core.

In this post we will be writing some code for reading tape images from the Linux file system and sending them to our C64 core.

Overview

Let us start this post by familiarising ourselves again with how tape image loading currently works in our C64 core.

Our C64 core uses an AXI master port to read tape image data from the SDRAM on the Zybo from a predefined location.

So, when triggering a tape load in our C64 core by typing LOAD<ENTER>, it is up to you to ensure that this predefined memory area is populated with a valid tape image.

Apart from the memory area that should contain the tape image, our C64 core also has a peripheral register mapped into the Zynq memory space that controls tape operation. This peripheral register is located at address 0x43c0_0008.

In this register there are two bits of importance for controlling tape loading:


  • Bit 2: Tape Button: This bit corresponds to the Play Button of an original Datasette unit.
  • Bit 1: Reset tape data pointer: When you have populated the tape image area in memory with a new image, you should briefly set this bit to 1 and then back to zero. This ensures that when you trigger a LOAD from the C64 core, the reading will start at the beginning of the memory area.

So, our application running in Linux should perform a couple of things to cater for tape loading.

Firstly, when the application starts, it should load the requested tape image into the tape area in memory. The tape image to load should be supplied as a command line parameter.

From previous posts, you might remember that user applications in Linux work in virtual memory space and don't have access to physical memory addresses. In our case we need access to physical memory addresses in order to populate the area in memory with the required tape image, so that our C64 core can access it.

In this regard, our Linux Kernel driver will also need to act as a mediator for moving the tape image to the required memory area.
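
The brief pulse on bit 1 of the tape control register can be thought of as two writes: one with the bit set and one with it cleared again, while all other bits stay as they are. Here is a minimal sketch of the two values involved, written as plain C so it can be checked outside the kernel. The macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Bit 1 of the register at 0x43c0_0008: reset tape data pointer. */
#define TAPE_RESET_POINTER (UINT32_C(1) << 1)

/* Register value for the first write of the pulse (bit 1 set). */
uint32_t tape_reset_asserted(uint32_t reg)
{
    return reg | TAPE_RESET_POINTER;
}

/* Register value for the second write of the pulse (bit 1 cleared). */
uint32_t tape_reset_released(uint32_t reg)
{
    return reg & ~TAPE_RESET_POINTER;
}
```

In the driver, each of these values would be written to the register in turn, after first reading the current contents of the register.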

Using IOCTL's

In one of the previous two posts we developed a Kernel driver that serves as an interface between a user application and our C64 FPGA core.

Currently when opening this driver as a file, you can only write C64 scan codes to it.

Can you utilise this driver to control more aspects of the C64 core? For instance, what if you want to send not only C64 scancodes to this driver, but also tape images?

One can indeed, with the help of IOCTLs. According to Wikipedia:

"ioctl (an abbreviation of input/output control) is a system call for device-specific input/output operations and other operations which cannot be expressed by regular system calls."

This sounds like exactly what we want to achieve with our device driver. IOCTL provides you with a way of serving multiple purposes with a single file handle.

We encountered ioctls in a previous post, where I explained how to capture keyup/keydown events in Linux, within setupKeyboard:

    /* save old keyboard mode */
    if (ioctl(0, KDGKBMODE, &old_keyboard_mode) < 0) {
        return 0;
    }

Here we used an ioctl to get the current keyboard mode from stdin (i.e. file handle zero). By simply doing a read() call on stdin, you would just receive key events from the keyboard and you wouldn't be able to get the keyboard mode at all.

IOCTL helps us out here. An ioctl call is almost like a read() function call, but it provides you with an additional important parameter: a command parameter. In the previous example we provided the command parameter KDGKBMODE.

So, how does one define an ioctl call within your device driver? For starters, you need to define a function in your driver having the following signature:

int (*ioctl) (struct inode *inode, struct file *filp,
unsigned int cmd, unsigned long arg);

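In our case, we won't worry about the inode and filp parameters. Our key parameters are the last two, cmd and arg. Note that newer kernels register the handler through the unlocked_ioctl member instead, whose signature drops the inode parameter and returns a long.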

The arg parameter in our case will be a pointer, and we will need to cast it as such in our ioctl method.

Let us end this section, by defining a skeleton ioctl method, on which we will expand in coming sections:

...
static struct file_operations fops =
{
   .open = dev_open,
   .read = dev_read,
   .write = dev_write,
   .unlocked_ioctl = c64_ioctl,
   .release = dev_release,
};
...
static long c64_ioctl (struct file *filp,
                   unsigned int cmd, unsigned long arg) {
   /* Command handling will be added in the coming sections. */
   return 0;
}
...

I have also added an extra member to our fops struct. This member tells Linux which method to call when we do an ioctl on our C64 driver.

Changes to our driver

Let us start with the necessary changes to our C64 kernel driver.

The first thing that comes to mind is that currently, when we open the driver, we switch to the C64 screen right away.

So, in effect the copying of the tape data will take place while the C64 screen is already active. Somehow, this scenario doesn't seem so clean. We would want to only switch to the C64 screen once all background initialisation has been completed.

So let us make the following modification to the open method:

static int dev_open(struct inode *inodep, struct file *filep){
   numberOpens++;
   printk(KERN_INFO "EBBChar: Device has been opened %d time(s)\n", numberOpens);
   //iowrite32(0x200, c64_reg_screen_mode);   
   return 0;
}

No switching to C64 screen on open anymore!!

Next, let us work on the actual tape loading mechanism. Let us start by defining a memory mapping for the physical address range that stores the tape image our C64 core will retrieve:

 ...
static void __iomem *tape_mem_area;
...
static int __init ebbchar_init(void){
...
  tape_mem_area =  ioremap(0x1f500000,
           2000000);
...
   return 0;
}

I have used start address 0x1f50_0000, which is also above the 500MB mark, outside the memory available to the Kernel.

Next, let us define an IOCTL call for copying tape data from userspace to our tape area:

...
static long c64_ioctl (struct file *filp,
                   unsigned int cmd, unsigned long arg) {
...
  if (cmd == 0) {
    unsigned char * user = (unsigned char *) arg;
    if (copy_from_user(tape_data, user, 8192))
      return -EFAULT;
    for (i = 0; i < 8192; i++) {
      iowrite8(tape_data[i], i + tape_mem_area + tape_pointer);
    }
    tape_pointer = tape_pointer + 8192;
  } 
...
  return 0;
}
...

So, zero is the command code to send chunks of data from our userspace program to the Kernel.

This call assumes a chunk size of 8KB per call. So, you always need to pass a pointer to an array of char with 8192 elements. This simplifies the code somewhat. When we send the last portion of the tape file, the chunk might hold fewer than 8192 valid bytes. The remaining garbage in the array is not an issue since our C64 core only reads what it needs.

The tape_pointer variable keeps track of where we are currently with the copying of the tape image.
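To make the bookkeeping around tape_pointer concrete, here is a minimal userspace sketch of command 0. It is not the driver code: the array tape_area merely stands in for the ioremapped region, append_chunk is a hypothetical name, and the bounds check against the 2000000 byte mapping is my own addition, which the driver above does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE     8192
#define TAPE_AREA_SIZE 2000000   /* size passed to ioremap() in the driver */

/* Stand-in for the ioremapped tape area (assumption: plain memory) */
static unsigned char tape_area[TAPE_AREA_SIZE];
static size_t tape_pointer = 0;

/* Models command 0: appends one full chunk at tape_pointer.
   Returns 0 on success, -1 if the chunk would no longer fit. */
int append_chunk(const unsigned char *chunk)
{
    assert(chunk != NULL);
    if (tape_pointer + CHUNK_SIZE > TAPE_AREA_SIZE)
        return -1;
    memcpy(tape_area + tape_pointer, chunk, CHUNK_SIZE);
    tape_pointer += CHUNK_SIZE;
    return 0;
}
```

With these sizes 244 chunks fit into the mapped area, and the sketch rejects the 245th.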

Once we are finished copying the tape image, we need an IOCTL call to actually switch to the C64 screen and to inform the C64 core to start reading the tape image from the beginning:

static long c64_ioctl (struct file *filp,
                   unsigned int cmd, unsigned long arg) {
  int i;
  if (cmd == 0) {
    unsigned char * user = (unsigned char *) arg;
    printk("after cast\n");
    if (copy_from_user(tape_data, user, 8192))
      return -EFAULT;
    for (i = 0; i < 8192; i++) {
      iowrite8(tape_data[i], i + tape_mem_area + tape_pointer);
    }
    tape_pointer = tape_pointer + 8192;
  } else if (cmd == 1) {
    iowrite32(0x206, c64_reg_screen_mode);
    msleep(1000);
    iowrite32(0x204, c64_reg_screen_mode);
  } 

  return 0;
}

The reset function currently only resets the C64core's pointer to the beginning of the tape area. However, the FPGA can be altered to also reset the 6502 CPU.

We finally need one last IOCTL call to simulate the press of the Play button:

static long c64_ioctl (struct file *filp,
                   unsigned int cmd, unsigned long arg) {
  int i;
  if (cmd == 0) {
    unsigned char * user = (unsigned char *) arg;
    printk("after cast\n");
    if (copy_from_user(tape_data, user, 8192))
      return -EFAULT;
    for (i = 0; i < 8192; i++) {
      iowrite8(tape_data[i], i + tape_mem_area + tape_pointer);
    }
    tape_pointer = tape_pointer + 8192;
  } else if (cmd == 1) {
    iowrite32(0x206, c64_reg_screen_mode);
    msleep(1000);
    iowrite32(0x204, c64_reg_screen_mode);
  } else if (cmd == 3) {
    iowrite32(0x200, c64_reg_screen_mode);
  }

  return 0;
}
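
The three values written to c64_reg_screen_mode are easier to read when broken up into bits. The sketch below assumes the slave register 2 layout from the user logic of the earlier keystroke post (restart on bit 1, tape_button on bit 2, c64_mode on bit 9) and that this layout has not changed since; the macro and function names are mine.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as assigned in the user logic (slv_reg2); names are hypothetical */
#define REG2_RESTART     (UINT32_C(1) << 1)
#define REG2_TAPE_BUTTON (UINT32_C(1) << 2)
#define REG2_C64_MODE    (UINT32_C(1) << 9)

/* Builds a register value from three flags, each either 0 or 1 */
uint32_t reg2_value(int c64_mode, int tape_button, int restart)
{
    assert(c64_mode >= 0 && c64_mode <= 1);
    assert(tape_button >= 0 && tape_button <= 1);
    assert(restart >= 0 && restart <= 1);
    return (c64_mode    ? REG2_C64_MODE    : 0u) |
           (tape_button ? REG2_TAPE_BUTTON : 0u) |
           (restart     ? REG2_RESTART     : 0u);
}
```

Read this way, command 1 writes C64 mode, tape button and restart together (0x206), waits a second and then drops the restart bit again (0x204). Command 3 leaves only the C64 mode bit set (0x200), so the tape_button bit changes from 1 to 0.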

Changes to our userspace program

Let us see what changes we need for our userspace program.

We start with the following changes:

int main(int argc, char *argv[]){
...
   unsigned char data[8192];
   tapefile = fopen(argv[1],"r");
   int num_read;
   do {
     num_read = fread(data, 1, 8192, tapefile);
     ioctl(fd,0,&data);
   } while (num_read == 8192);

   ioctl(fd,1);
...
}


We receive the filename of the required tape image as a parameter on the commandline. We then open this file and send it to the driver in chunks of 8KB.

Finally we call command #1 on the driver, which enables the C64 screen and resets the C64core so that it reads tape data from the beginning.
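
One consequence of the do/while loop is worth spelling out: it always ends with a call carrying a short, possibly empty, chunk. The sketch below only models the control flow of the loop. chunk_calls is a hypothetical helper, and no file or driver is involved.

```c
#include <assert.h>

#define CHUNK_SIZE 8192UL

/* Returns how many ioctl(fd, 0, ...) calls the loop above would make
   for a tape file of file_size bytes. */
unsigned long chunk_calls(unsigned long file_size)
{
    unsigned long remaining = file_size;
    unsigned long calls = 0;
    unsigned long num_read;

    do {
        /* fread() delivers at most CHUNK_SIZE bytes, fewer at end of file */
        num_read = remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE;
        remaining -= num_read;
        calls++;
    } while (num_read == CHUNK_SIZE);

    assert(remaining == 0);
    return calls;
}
```

A 10000 byte file takes two calls. A file of exactly 8192 bytes also takes two, with the second call carrying no new data, so tape_pointer simply advances past one unused chunk.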

We finally need to allow the user to press the Play button when required to do so. I have decided to allocate the Function key F11 for this purpose.

I will be checking for this key in the loop that processes 'ASCII'-scancodes:

    for (int i = 0; i < num_keys_to_process; i++) {
      int offset = shifted << 1;
      if (keys_to_process[i] == 0x18) {
        ioctl(fd,3);
        continue;
      }
      int c64_scan_code = key_map[keys_to_process[i] & 0xff][offset];
      if (c64_scan_code == -1)
        continue;
      if (c64_scan_code < 32) {
        keyToProcess.word1 = keyToProcess.word1 | (1 << c64_scan_code);
      } else {
        c64_scan_code = c64_scan_code - 32;        
        keyToProcess.word2 = keyToProcess.word2 | (1 << c64_scan_code);
      }
      if (key_map[keys_to_process[i] & 0xff][offset+1]) {
        keyToProcess.word1 = keyToProcess.word1 | (1<<15);
      }
    }

In my mapping file, where I map Linux scan codes to ASCII-scancodes, I have mapped the resulting code for F11 to 0x18. When we encounter this key, we make an IOCTL call to press the Play button, and skip further processing of this key.

In Summary

In this post we have developed some functionality for loading tape images stored on a Linux File system, and copying them to an area of memory which our C64 core can access to trigger a tape load.

In the next post we will be mapping keys for a joystick, so we can play the game we have loaded from the tape image on the Linux File System!

Till next time!

Sunday, 22 March 2020

Redirecting keystrokes from Linux to the C64 module: Part 2

Foreword

In the previous post we started the process of capturing keystrokes in Linux and redirecting them to our C64 module.

We ended with a Kernel driver and a very simple user program and left the PC scancode mapping to C64 scancode mapping for this post.

For the scancode mapping we could probably have hacked something together quickly, with a long case statement just doing a mapping of the essential keys.

I did indeed follow a similar minimalistic approach with previous emulators that I wrote, and it worked just fine.

However, when entering programs with such a minimal mapping, it becomes a guessing game when you need to enter a shift key sequence. On a PC keyboard, for instance, hitting Shift and '2', gives you the @-sign. On a C64 keyboard, the same key sequence gives you double quotes (").

In this regard the Vice Commodore emulator does a very convenient key mapping. If you enter a Shift+2 sequence on a PC keyboard, a @ will be displayed on the C64 screen.  Thus, no need to have a photo of a C64 handy when typing Shift sequences!

In this post we will look into how keymapping works in the Vice Commodore emulator, and see how we can apply a similar keymapping within our C64 implementation.

An Overview of Vice key mapping

Let us do a quick overview on how key mapping works in the Vice emulator.

The easiest way to approach this exercise is to have a look at a key mapping file within the Vice source.

Start by downloading the source of Vice as a tarball, untarring it, and opening the file data/C64/sdl_sym.vkm.

This is a text file and apart from configuration information, it contains lots of comments.

One of the useful comments describes the format of every line:

# - normal line has 'keysym/scancode row column shiftflag'

Each line starts with a scan code for a particular key on a source keyboard.

Quickly looking at these scan codes, they appear to be the ASCII representation of the applicable key. The key 'A', for instance, will have the scancode 97 in the file (i.e. lowercase a). Similarly, the '1' key will have the code 49.

The next two values, row and column, represent the row and column values from the C64 keyboard for the associated key.

The final value on a row, shiftflag, tells whether a shiftkey is applicable for this key mapping. This will become clear in a moment.

Further on in the file we have a couple of comments giving more information about shiftflag:

# Shiftflag can have the values:
# 0      key is not shifted for this keysym/scancode
# 1      key is shifted for this keysym/scancode
# 2      left shift
# 4      right shift
# 8      key can be shifted or not with this keysym/scancode
# 16     deshift key for this keysym/scancode
# 32     another definition for this keysym/scancode follows
# 64     shift lock
# 256    key is used for an alternative keyboard mapping

Let us try to understand these shiftflags by looking at some examples.

We start with a simple example:

49 7 0 8               /*            1 -> 1            */

Here the shiftflag is 8, which, according to the table, means the key can be either shifted or not. In short, this means that for this particular key we can blindly pass shift key presses on the PC keyboard to the C64 core. This is because on both a PC keyboard and a C64 keyboard Shift+1 corresponds to an exclamation mark (!).

Let us look at a more complex example:

50 7 3 32              /*            2 -> 2            */
50 5 6 16              /*            @ -> @            */

The first row has a shiftflag of 32, which, according to the table, means: another definition for this keysym/scancode follows.

This means that when you use this key with a shift, don't simply pass on the shift key to the C64 core. In such a case you need to consider the definition in the next row.

In this case, the definition in the next row gives us a new C64 scan code for the PC Shift+2 combination. Also, the shiftflag for this row is 16, meaning that we should not send a shift key to the C64 core for this key combination. So, on a PC, Shift+2 corresponds to @. On a C64 you can access the @-key without a shift.

Let us look at another example:

55 3 0 32              /*            7 -> 7            */
55 2 3 1               /*            & -> &            */

Here the Shift+7 key has the shiftflag 1. This means that not only does Shift+7 map to a different C64 scancode, but in addition we need to send a shift key to the C64 core.

Let us end off this section with another interesting example:

39 3 0 33              /*            ' -> '            */
39 7 3 1               /*            " -> "            */

In the first row we have two shiftflags set simultaneously (32 and 1). This firstly means that for Shift+', we use the definition in the next row.

It also means that if we use (') without the shift, we need to also pass a shift to the C64 core. This is because to type a (') on the C64 we need to use the key combination Shift+7.
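
Since shiftflag is a set of bits rather than a single value, it helps to test the individual bits. The following is a small sketch of that idea. The flag values come from the comments in sdl_sym.vkm quoted above, while the enum and function names are hypothetical and only cover the flags used in the examples.

```c
#include <assert.h>

/* Flag values from the sdl_sym.vkm comments; the names are my own */
enum vkm_shiftflag {
    VKM_SHIFTED = 1,   /* key is shifted on the C64 side */
    VKM_EITHER  = 8,   /* key can be shifted or not */
    VKM_DESHIFT = 16,  /* do not pass a shift to the C64 */
    VKM_MORE    = 32   /* another definition follows on the next row */
};

/* Should a shift key be sent to the C64 core for this row? */
int sends_shift(int shiftflag)
{
    assert(shiftflag >= 0);
    return (shiftflag & VKM_SHIFTED) != 0;
}

/* Does the next row hold the shifted definition for the same key? */
int has_second_row(int shiftflag)
{
    assert(shiftflag >= 0);
    return (shiftflag & VKM_MORE) != 0;
}
```

For the (') row with shiftflag 33 both functions return true, whereas for the '2' row with shiftflag 32 only has_second_row does.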

ASCII like scan codes

I mentioned previously that the source scan codes in sdl_sym.vkm corresponds more or less to the ASCII code of the relevant keys.

However, the scan codes we get from Linux don't have any relation to an equivalent ASCII code whatsoever.

So, we need a conversion table between Linux scan codes to the scan code we require. This table will look more or less like the following:

struct keyboard {
           char code;
           char desc[20];
};
struct keyboard temp[256] ={
  {},
  {'\x1b',"ESCAPE"}, //1
  {'1',"Key1"}, //2
  {'2',"Key2"}, //3
  {'3',"Key3"}, //4
  {'4',"Key4"}, //5
  {'5',"Key5"}, //6
  {'6',"Key6"}, //7
  {'7',"Key7"}, //8
  {'8',"Key8"}, //9
  {'9',"Key9"}, //a
  {'0',"Key0"}, //b
...
  {'q',"KeyQ"}, //10
  {'w',"KeyW"}, //11
  {'e',"KeyE"}, //12
  {'r',"KeyR"}, //13
  {'t',"KeyT"}, //14
  {'y',"KeyY"}, //15
  {'u',"KeyU"}, //16
  {'i',"KeyI"}, //17
  {'o',"KeyO"}, //18
  {'p',"KeyP"}, //19
  {'[',"Key["}, //1a
...
};

So, for instance, if someone presses the key '1', we will get the scan code 2 from Linux. If we look at position 2 in this lookup table, we will find the char value '1'.

The question might arise what code we can use as a scan code for modifier keys (e.g. shift, control) that don't really map to any ASCII code.
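
As an aside, the same lookup can be written as a sparse table with C99 designated initializers, which avoids having to count positions by hand. This is only a sketch with a handful of the entries from the table above. The names scan_to_ascii and ascii_like_code are hypothetical, and the description strings are left out.

```c
#include <assert.h>

/* Index is the Linux scan code, value is the ASCII-like code.
   Entries that are not listed default to 0, meaning "no mapping". */
static const char scan_to_ascii[256] = {
    [0x01] = '\x1b',  /* ESCAPE */
    [0x02] = '1',
    [0x03] = '2',
    [0x10] = 'q',
    [0x11] = 'w',
    [0x1a] = '[',
};

char ascii_like_code(int linux_scan_code)
{
    assert(linux_scan_code >= 0 && linux_scan_code < 256);
    return scan_to_ascii[linux_scan_code];
}
```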

For these keys we could either use Capitals or ASCII codes after 128.

Parsing sdl_sym.vkm

Earlier on we discussed the structure of a vkm file.

The question at this point is: how do we parse such a file and store it in a structure with which we can easily transform a PC scancode to a C64 scancode?

For this purpose we can create a lookup table where we use the PC scan code as an index to retrieve the appropriate row that contains the resulting C64 scan code.

That is easy enough, but how do we cater for the shifted version of a scancode?

We can cater for the shifted version by having two elements in each row of the lookup table, with the second element in a row being the shifted version of a scancode.

You might also remember from our discussion on the vkm file structure that the resulting C64 scancode might optionally have a shift key associated with it. To cater for this, our lookup table needs to have 4 elements per row.

Let us do some coding and start by defining this lookup table and initialising it:

...
int key_map[256][4];
...
void init_table() {
  for (int i = 0; i < 256; i++) {
    key_map[i][0] = -1;
    key_map[i][1] = -1;
    key_map[i][2] = -1;
    key_map[i][3] = -1;
  }
...
}

We want to keep the parsing of the vkm file simple, so we will need to modify this file a bit. We will be removing all comment lines at the beginning. We will also remove all the negative scancodes towards the end of the file.

Speaking of removing comments: you will see that each mapping line ends with a comment. There is no need to go through the laborious exercise of removing these comments as well. Instead, when reading each line, we will read the first four items that we need and skip straight to the next line.

Let us see if we can implement this functionality in code:

void init_table() {
...
  FILE *fp;
  fp = fopen("sym.txt", "r");
  char input[80];
  int num1, num2, num3, num4;
  while (1) {
    int status = fscanf(fp,"%d", &num1);
    fscanf(fp,"%d", &num2);
    fscanf(fp,"%d", &num3);
    fscanf(fp,"%d", &num4);
    fscanf(fp,"%[^\n]",input);
    if (status == EOF)
      break;
  }
...
}

sym.txt is our modified vkm as described earlier.

For every line we read our four values, and skip to the next line with [^\n].

We can now use num1, num2, num3 and num4 on each line to populate the lookup table.

Let us start with the simple case for rows having the shiftcode 8:

void init_table() {
  for (int i = 0; i < 256; i++) {
    key_map[i][0] = -1;
    key_map[i][1] = -1;
    key_map[i][2] = -1;
    key_map[i][3] = -1;
  }
  FILE *fp;
  fp = fopen("sym.txt", "r");
  char input[80];
  int num1, num2, num3, num4;
  while (1) {
    int status = fscanf(fp,"%d", &num1);
    fscanf(fp,"%d", &num2);
    fscanf(fp,"%d", &num3);
    fscanf(fp,"%d", &num4);
    fscanf(fp,"%[^\n]",input);
    if (status == EOF)
      break;
    if (num4 == 8) {
      key_map[num1][0] = (num2 << 3) | num3;
      key_map[num1][1] = 0;
      key_map[num1][2] = (num2 << 3) | num3;
      key_map[num1][3] = 1;
    }
  }
}


There are a number of things going on here, so let us unpack it a bit.

First, we are combining the row and column value into a single number, by putting the row value in bits 5-3 and the column value in bits 2-0.

Also, elements 0 and 1 of a row are for the unshifted version of the PC scancode, and elements 2 and 3 for the shifted version.

Elements 1 and 3 indicate whether we should pass a shift key to our C64 core with the resolved scancode.
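
The packing of row and column into one number can be isolated into a couple of helpers. This is just a sketch of the bit layout described above; the function names are mine.

```c
#include <assert.h>

/* Row goes into bits 5-3, column into bits 2-0 */
int pack_c64_code(int row, int column)
{
    assert(row >= 0 && row < 8);
    assert(column >= 0 && column < 8);
    return (row << 3) | column;
}

int c64_row(int code)    { return (code >> 3) & 7; }
int c64_column(int code) { return code & 7; }
```

The '1' key at row 7, column 0 packs to 56, and the @ key at row 5, column 6 packs to 46.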

Next, let us cater for rows having shiftcode as 1:

void init_table() {
  for (int i = 0; i < 256; i++) {
    key_map[i][0] = -1;
    key_map[i][1] = -1;
    key_map[i][2] = -1;
    key_map[i][3] = -1;
  }
  FILE *fp;
  fp = fopen("sym.txt", "r");
  char input[80];
  int num1, num2, num3, num4;
  while (1) {
    int status = fscanf(fp,"%d", &num1);
    fscanf(fp,"%d", &num2);
    fscanf(fp,"%d", &num3);
    fscanf(fp,"%d", &num4);
    fscanf(fp,"%[^\n]",input);
    if (status == EOF)
      break;
    if (num4 == 8) {
      key_map[num1][0] = (num2 << 3) | num3;
      key_map[num1][1] = 0;
      key_map[num1][2] = (num2 << 3) | num3;
      key_map[num1][3] = 1;
    } else if (num4 == 1) {
      key_map[num1][0] = (num2 << 3) | num3;
      key_map[num1][1] = 1;
    } 
  }
}

Rows with this shiftflag don't get a shifted version, so elements 2 and 3 keep their initial value of -1.

Finally let us cater for rows with shiftcode 32. As seen in a previous section we can have cases where shiftcode 32 and 1 can be enabled simultaneously. So, when checking for flag 32, we need to mask off all other bits:

void init_table() {
  for (int i = 0; i < 256; i++) {
    key_map[i][0] = -1;
    key_map[i][1] = -1;
    key_map[i][2] = -1;
    key_map[i][3] = -1;
  }
  FILE *fp;
  fp = fopen("sym.txt", "r");
  char input[80];
  int num1, num2, num3, num4;
  while (1) {
    int status = fscanf(fp,"%d", &num1);
    fscanf(fp,"%d", &num2);
    fscanf(fp,"%d", &num3);
    fscanf(fp,"%d", &num4);
    fscanf(fp,"%[^\n]",input);
    if (status == EOF)
      break;
    if (num4 == 8) {
      key_map[num1][0] = (num2 << 3) | num3;
      key_map[num1][1] = 0;
      key_map[num1][2] = (num2 << 3) | num3;
      key_map[num1][3] = 1;
    } else if (num4 == 1) {
      key_map[num1][0] = (num2 << 3) | num3;
      key_map[num1][1] = 1;
    } else if (num4 & 32) {
      key_map[num1][0] = (num2 << 3) | num3;
      key_map[num1][1] = num4 & 1;
      fscanf(fp,"%d",&num1);
      fscanf(fp,"%d",&num2);
      fscanf(fp,"%d",&num3);
      fscanf(fp,"%d",&num4);
      fscanf(fp,"%[^\n]",input);
      key_map[num1][2] = (num2 << 3) | num3;
      key_map[num1][3] = num4 & 1;
    }
  }
}


When we reach a row with shiftflag 32, we also need to read the next line to get details about the shifted version of the scancode.
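
To check the three cases against the example rows from earlier, here is a self-contained variation that builds the table from lines held in memory instead of from sym.txt. It is a sketch rather than the code used in this project: build_key_map and set_pair are hypothetical names, the lines are parsed with sscanf, and I have added checks for malformed lines and out of range codes that the version above does not have.

```c
#include <assert.h>
#include <stdio.h>

static int key_map[256][4];

/* Stores a packed row/column plus its shift indicator at slot 0 or 2 */
static void set_pair(int code, int slot, int row, int col, int shift)
{
    key_map[code][slot]     = (row << 3) | col;
    key_map[code][slot + 1] = shift;
}

/* Builds key_map from 'count' mapping lines of the form
   "keysym row column shiftflag ...comment..." */
void build_key_map(const char *lines[], int count)
{
    int i, code, row, col, flag;

    assert(lines != NULL);
    for (i = 0; i < 256; i++)
        key_map[i][0] = key_map[i][1] = key_map[i][2] = key_map[i][3] = -1;

    for (i = 0; i < count; i++) {
        if (sscanf(lines[i], "%d %d %d %d", &code, &row, &col, &flag) != 4)
            continue;                        /* skip malformed lines */
        if (code < 0 || code > 255)
            continue;                        /* outside our table */
        if (flag == 8) {
            set_pair(code, 0, row, col, 0);
            set_pair(code, 2, row, col, 1);
        } else if (flag == 1) {
            set_pair(code, 0, row, col, 1);
        } else if (flag & 32) {
            int code2, row2, col2, flag2;
            set_pair(code, 0, row, col, flag & 1);
            /* the next line describes the shifted version of the same key */
            if (i + 1 < count &&
                sscanf(lines[i + 1], "%d %d %d %d",
                       &code2, &row2, &col2, &flag2) == 4 &&
                code2 == code) {
                set_pair(code, 2, row2, col2, flag2 & 1);
                i++;                         /* follow-up line consumed */
            }
        }
    }
}
```

Feeding it the '2' and '@' rows from the earlier example yields the entry {59, 0, 46, 0} for scancode 50: the unshifted key resolves to row 7, column 3 and the shifted key to row 5, column 6, in both cases without a shift being passed to the C64.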

Transforming to C64 scancodes

Let us bring all code we have developed in this post together. The following outline shows the changes we need to make to our main method so that we convert PC scancodes to C64 scancodes:

int main(){
...
  init_table();
...
  while(1) {
...
    char keys_to_process[6];   // one slot for each entry in keys
...
    readKeyboard();

    for (int i = 0; i < 6; i++) {

      char ps_2_code = temp[keys[i]].code;
      if ((ps_2_code == 'C') || (ps_2_code == 'B')) {
        shifted = 1;
        continue;
      }
      keys_to_process[num_keys_to_process] = ps_2_code;
      num_keys_to_process++;
    }

    keyToProcess.word1 = 0;
    keyToProcess.word2 = 0;

    for (int i = 0; i < num_keys_to_process; i++) {
      int offset = shifted << 1;
      // mask to 0..255: char may be signed, so codes above 127 would index negatively
      int key = keys_to_process[i] & 0xff;
      int c64_scan_code = key_map[key][offset];
      if (c64_scan_code == -1)
        continue;
      if (c64_scan_code < 32) {
        keyToProcess.word1 = keyToProcess.word1 | (1 << c64_scan_code);
      } else {
        c64_scan_code = c64_scan_code - 32;        
        keyToProcess.word2 = keyToProcess.word2 | (1 << c64_scan_code);
      }
      if (key_map[key][offset+1]) {
        keyToProcess.word1 = keyToProcess.word1 | (1<<15);
      }
    }
    ret = write(fd, &keyToProcess, 8); // Send the string to the LKM

  }
   return 0;
}


We start by calling init_table, which we developed earlier on.

We then take the captured keys produced by readKeyboard(), and convert them to ASCII like scancodes as they appear in the VKM file. The translated keys are stored in the array keys_to_process.

During this loop we also determine if any shift key is being held down, and set the variable shifted accordingly. I would also like to mention here that in this translation process I have decided to assign the left- and right-shiftkey to the ASCII values 'C' and 'B' respectively.

Once we have translated all keys, we loop through them and translate them to C64 scancodes. Here we make use of the shifted variable to decide whether we are going to use elements 0/1 or elements 2/3.

We finally enable the shift key in the C64 keyboard matrix if the applicable shift element (1 or 3) is set.

Mapping the cursor keys

Up to this point the mapping of PC keys to C64 scancodes has been fairly straightforward. Our mapping model, however, falls a bit on its face when we try to map the cursor keys.

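When we press one of the cursor keys, Linux returns a prefix byte, E0, preceding the actual scan code. Strictly speaking E0 is an escape prefix for extended keys, but in this post I will loosely refer to it as a break code.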

For instance, if we press cursor up, we will receive the following bytes:

E0 48

This is a bit problematic for our process that maps to ASCII like scancodes, which only works with single byte values.

We could condense these two bytes into a single one by using the fact that these scan codes usually only make use of the values 0 to 127. For scancodes preceded by E0, we can just set bit 7, and then set the lower 7 bits to our actual scan code.

One thing we should be aware of, though, is that when we release a key, we get a scan code of which bit 7 is also set.

With all this in mind, let us write some code for condensing a scan code with a break code, into a single byte.

We start with our readKeyboard() method. Previously this method read a single byte from stdin and acted accordingly.

Here we already have problems if we get a scancode with a breakcode. At this point we can already reduce our problems by reducing a scancode with a breakcode into a single number.

I will, however, reduce this to a 9-bit number instead of an 8-bit number, so we can preserve the information about whether the scancode is for a key down or a key up:

int getKeyCode(int *code) {
  char buf[1];
  int res;
  res = read(0, &buf[0], 1);
  if (res == -1)
    return -1;
  if ((buf[0] & 0xff) != 0xe0) {
    *code = buf[0] & 0xff;
    return res;
  }
  res = read(0, &buf[0], 1);
  if (buf[0] == 1) {
    restoreKeyboard();
    exit(0);
  }
  *code = (buf[0] & 0xff) | 0x100;
  return res;
}

void readKeyboard()
{
    int code;
    // Process scan codes for as long as getKeyCode() delivers them;
    // code is only valid when getKeyCode() actually read something
    while (getKeyCode(&code) > 0) {
        processKey(code);
    }
}


So, if we encounter a scancode with a breakcode, we just set bit 8 to a 1.

Next, we should change processKey so that it can interpret bit 8 accordingly:

void processKey(int scanCode) {
  if ((scanCode & 0x80)) {
    //do key release
    scanCode = (scanCode & 0x100) ? ((scanCode & 0x7f) | 0x80) : (scanCode & 0x7f);
    doKeyUp(scanCode);
  } else {
    //do key down
    scanCode = (scanCode & 0x100) ? ((scanCode & 0x7f) | 0x80) : scanCode;
    doKeyDown(scanCode);
  }
}


Once we have determined whether the scancode is a keydown or a keyup, we can reduce the scancode to 8-bits.

In Summary

In this post we have implemented scan code mapping from PC scancodes to C64 scancodes. For this purpose we have used a key mapping file from the Vice Commodore emulator.

In the next post we will be modifying our Test program so that it can accept a tape image filename as a parameter. The test program will then transfer the Tape Image from the Linux file system to the C64 core.

Till next time!

Sunday, 1 March 2020

Redirecting keystrokes from Linux to the C64 module: Part 1

Foreword

In the previous post we implemented functionality within our C64 design allowing us to toggle between Linux console output and C64 video output.

We performed this toggling between the two video outputs with the help of a toggle button present on the Zybo board. Later on we will be moving the control of this video mode toggling to software.

In this post we will be focusing on redirecting keystrokes from Linux to our C64 module.

In order to do this we need to develop a Kernel Driver and a user program in userspace.

This is quite a lot to cover in one post, so in this post we will not be developing a complex PC key to C64 mapping mechanism. Instead, as a proof of concept, we will just be capturing two keystrokes from a keyboard to display on the C64 screen.

For this reason I have decided to split the whole keystroke redirection functionality into two posts. In the next post we will be tackling advanced PC key -> C64 key mapping.

Surfacing Screen mode to software

As mentioned in the foreword, we need to work towards the goal of switching screen mode in software.

In order to achieve this, we need to surface this mode bit within a register in our Slave AXI block.

For this purpose we can just use Slave register 2, since we don't utilise all the bits of this register. Bits 8 to 4 get utilised by the joystick bits, so we can use bit 9 for screen mode. For this we make the following changes to our user logic:

 // Add user logic here
    assign slave_reg_0 = slv_reg0;
    assign slave_reg_1 = slv_reg1;
    assign restart = slv_reg2[1];
    assign tape_button = slv_reg2[2];
    assign joybits = slv_reg2[8:4];
    assign c64_mode = slv_reg2[9];
 // User logic ends


We need to ensure that this c64_mode gets surfaced as a port on our Slave AXI block. We will connect this port to our VGA block, effectively replacing the connection from the push button on the Zybo board.

We will now be able to control the screen mode in software by just writing to bit 9 of address 0x43c0_0008.

As mentioned in a previous post, it is not so easy to access a physical address in Linux, especially in Userspace.

So, this is one of the reasons we will be developing a Kernel driver in this post.

Into Kernel drivers

Let us get our fingers dirty with writing a Kernel driver. For beginners there is a nice resource, Linux Device Drivers (third edition), available here.

Before we start writing a Kernel device driver, let us first focus on what we want to achieve.

To send one or more keystrokes to our C64 module, we need to set one or more bits in the registers located at addresses 0x43c0_0000 and 0x43c0_0004. Together these two registers contain 64 bits, which correspond to the 64 keys you find on a C64 keyboard.

The idea is that we open this Kernel device driver as a file and write two 32-bit words at a time to this 'file'. Our kernel driver in turn will write these values to address 0x43c0_0000 and 0x43c0_0004 respectively.

To send both register values as a unit, we can make use of a struct with the following definition:

struct keyboard 
{
           u32 word1;
           u32 word2;
};


The details will become clear later.

To help us get started quickly, it will help if we can find a minimalistic example on the Internet that is similar to what we want to achieve. Derek Molloy's tutorial comes to the rescue here: http://derekmolloy.ie/writing-a-linux-kernel-module-part-2-a-character-device/

Derek gives the source of this tutorial on his Github site. In particular, we are interested in the following two files:


In ebbchar.c there is all the necessary code for a fully fledged character driver. The example provided opens the device driver as a file, writes a string to it and then reads it back.

When I tried out this example, it crashed when I tried writing to the driver. At first I could not figure out why this was happening. However, when I had a look at the read and write methods together, I discovered something:

static ssize_t dev_read(struct file *filep, char *buffer, size_t len, loff_t *offset){
   int error_count = 0;
   // copy_to_user has the format ( * to, *from, size) and returns 0 on success
   error_count = copy_to_user(buffer, message, size_of_message);

   if (error_count==0){            // if true then have success
      printk(KERN_INFO "EBBChar: Sent %d characters to the user\n", size_of_message);
      return (size_of_message=0);  // clear the position to the start and return 0
   }
   else {
      printk(KERN_INFO "EBBChar: Failed to send %d characters to the user\n", error_count);
      return -EFAULT;              // Failed -- return a bad address message (i.e. -14)
   }
}

static ssize_t dev_write(struct file *filep, const char *buffer, size_t len, loff_t *offset){
   sprintf(message, "%s(%zu letters)", buffer, len);   // appending received string with its length
   size_of_message = strlen(message);                 // store the length of the stored message
   printk(KERN_INFO "EBBChar: Received %zu characters from the user\n", len);
   return len;
}

In dev_read there is a call to copy_to_user, but there is no similar call within dev_write: the user space pointer buffer is passed straight to sprintf. When passing a pointer from user space to kernel space, functions like copy_to_user and copy_from_user are necessary to move the information between the two spaces.

Writing the C64 Keyboard driver

In the previous section we had a look at Derek Molloy's example Kernel driver. With the minimum amount of tweaks to this example driver, we can easily create our C64 Keyboard driver.

We start off by mapping our Slave AXI registers into Kernel Space:

...
static void __iomem *c64_reg_base;
static void __iomem *c64_reg_screen_mode;
static void __iomem *c64_reg_keyboard_0;
static void __iomem *c64_reg_keyboard_1;
...
static int __init ebbchar_init(void){
...
   c64_reg_base = ioremap(0x43c00000, 16384);
   c64_reg_screen_mode = c64_reg_base + 8;
   c64_reg_keyboard_0 = c64_reg_base;
   c64_reg_keyboard_1 = c64_reg_base + 4;
...
   return 0;
}
...

The key here is the call to ioremap, which maps a 16KB region, starting at the first address of our Slave AXI registers, into virtual memory.

We then define some more pointers in which we can access the keyboard bits and C64 screen mode directly.

I was thinking for some time about what kind of interface we could use for switching between the two screen modes. This turned out not to be a problem at all. We can just switch to C64 screen mode when we open the driver, and switch back to Linux Console mode when we close the driver again:

...
static int dev_open(struct inode *inodep, struct file *filep){
   numberOpens++;
   printk(KERN_INFO "EBBChar: Device has been opened %d time(s)\n", numberOpens);
   iowrite32(0x200, c64_reg_screen_mode);
   return 0;
}
...
static int dev_release(struct inode *inodep, struct file *filep){
   printk(KERN_INFO "EBBChar: Device successfully closed\n");
   iowrite32(0x0, c64_reg_screen_mode);
   return 0;
}
...

What we still need to do is to take writes to our kernel driver and send this information to the physical registers:

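static ssize_t dev_write(struct file *filep, const char * keys, size_t len, loff_t *offset){
   struct keyboard temp;
   // We expect one complete keyboard struct (two 32-bit words) per write
   if (len < sizeof(temp))
      return -EINVAL;
   // copy_from_user returns the number of bytes it could NOT copy
   if (copy_from_user(&temp, keys, sizeof(temp)))
      return -EFAULT;
   iowrite32(temp.word1, c64_reg_keyboard_0);
   iowrite32(temp.word2, c64_reg_keyboard_1);
   return sizeof(temp);
}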


Our dev_write accepts the keys as a pointer to char. This is to conform to the interface for a character device driver. Here we cheat a bit, however. The actual data we will be sending will not be an array of char, but a keyboard struct.

Internally we will copy this data into an actual keyboard structure. Lastly we will write the data to the actual registers.
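
The 'cheat' works because the struct is nothing more than 8 consecutive bytes. The sketch below illustrates this in plain userspace C, without any driver involved. uint32_t stands in for the kernel's u32, and pack_keys and unpack_keys are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace equivalent of the driver's struct */
struct keyboard {
    uint32_t word1;
    uint32_t word2;
};

/* What write() effectively hands over: the struct viewed as 8 raw bytes */
void pack_keys(const struct keyboard *k, char out[8])
{
    assert(sizeof(struct keyboard) == 8);
    memcpy(out, k, sizeof(*k));
}

/* What the driver does with those bytes: rebuild the struct from them */
void unpack_keys(const char in[8], struct keyboard *k)
{
    memcpy(k, in, sizeof(*k));
}
```

As long as the user program and the driver are built for the same machine, both sides agree on the byte order and layout, so the round trip is lossless.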

With Linux running on our Zybo board, to load this driver is a two step process.

Firstly, similar to what we did with our Linux Framebuffer driver, we need to issue an insmod command to load the kernel driver so it can be used by the Linux Kernel.

When this particular driver loads, it will output the major number it is registered as. Make a note of this number, as you will need it in order to add a node for the driver under /dev.

To add a node under /dev for this device, issue the following command:

mknod /dev/ebbchar c 244 0

In my case the major device number was 244. Also, the c indicates that we are about to add a character driver.

Writing the user program

Our test program is kind of a merge, where we take a program from a previous post that captured keystrokes in Linux and combine it with Derek Molloy's test program.

Let us start the discussion by looking at the final main() method:

int main(){
   int ret, fd;
   char stringToSend[BUFFER_LENGTH];
   printf("Starting device test code example...\n");
   fd = open("/dev/ebbchar", O_RDWR);             // Open the device with read/write access
   if (fd < 0){
      perror("Failed to open the device...");
      return errno;
   }

  setupKeyboard();
  struct keyboard keyToProcess;

  while(1) {
    usleep(20000);
    readKeyboard();
    keyToProcess.word1 = 0;
    keyToProcess.word2 = 0;
    for (int i = 0; i < 6; i++) {
      if (keys[i] == 0)
        continue;
      int translated = getC64ScanCode(keys[i]);
      keyToProcess.word1 = (translated < 32 ) ? (keyToProcess.word1 | (1 << translated)) : keyToProcess.word1;
      if (translated > 31) {
        translated = translated - 32;
        keyToProcess.word2 = keyToProcess.word2 | (1 << translated);
      } 
    }
    ret = write(fd, &keyToProcess, 8); // Send the string to the LKM
    if (ret < 0){
       perror("Failed to write the message to the device.");
       return errno;
    }

  }
   return 0;
}

We start by opening a file handle to our device driver. In a main loop we capture key up/down events from the keyboard, convert them to C64 scancodes and send them to our device driver as a keyboard struct.

You will remember from a previous post that we maintain a global keys array, which indicates which keys are currently being held down. This array caters for up to 6 elements, and if an element is not in use, it will simply hold the value zero.

We will be implementing the method getC64ScanCode in the next post.
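
For readers who haven't seen that earlier post, here is one way such an array could be maintained. This is a sketch under my own assumptions and not necessarily the code from that post; key_down and key_up are hypothetical names.

```c
#include <assert.h>

#define MAX_KEYS 6

static int keys[MAX_KEYS];   /* 0 means "slot not in use" */

/* Records a key as held down, using the first free slot */
void key_down(int code)
{
    int i, free_slot = -1;

    assert(code != 0);
    for (i = 0; i < MAX_KEYS; i++) {
        if (keys[i] == code)
            return;                     /* already held, e.g. key repeat */
        if (keys[i] == 0 && free_slot < 0)
            free_slot = i;
    }
    if (free_slot >= 0)
        keys[free_slot] = code;         /* a 7th key is silently dropped */
}

/* Frees the slot of a key that has been released */
void key_up(int code)
{
    int i;

    for (i = 0; i < MAX_KEYS; i++)
        if (keys[i] == code)
            keys[i] = 0;
}
```

Because released slots are simply zeroed, the main loop has to skip zero entries, which is exactly what the keys[i] == 0 check in main() does.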

In Summary

In this post we have created a Linux Kernel device driver that will accept C64 scan codes from a user program and forward it to our C64 module.

In the next post we will be adding functionality to map PC scancodes to C64 scancodes. In this process I will try to utilise the keyboard mapping functionality present in the Vice C64 emulator.

Till next time!