Saturday, 17 January 2026

A Commodore 64 Emulator in Flutter: Part 15

Foreword

In the previous post we introduced the CIA as a separate class. Before that, we had only mimicked the CIA's operation by forcing a hard interrupt every 1/60th of a second, just to get our emulator to work while avoiding the complexities of implementing and scheduling timers.

So, in the previous post we delved deeper and implemented the CIA properly. This was needed as a precursor to this post, where we will implement tape loading functionality, which requires more granular operation of the CIA.

Adding Front end Interaction

The logical place to start is to add functionality to our front end for attaching a tape image. So within main.dart, which is basically our front-end code, we add two buttons to the RunningState front end:

...
} else if (state is RunningState) {
              return KeyboardListener(
                focusNode: context.read<C64Bloc>().focusNode,
                autofocus: true,
                onKeyEvent: (event) {
                  if (event is KeyDownEvent) {
                    context.read<C64Bloc>().add(KeyC64Event(keyDown: true, key: event.logicalKey));
                  } else if (event is KeyUpEvent) {
                    context.read<C64Bloc>().add(KeyC64Event(keyDown: false, key: event.logicalKey));
                  }
                },
                child: Column(
                  children: [
                    Row(
                      children: [
                        IconButton(
                            icon: Icon(Icons.folder),
                            onPressed: () async {
                              context.read<C64Bloc>().add(LoadTapeRequested());
                            }),
                        IconButton(
                            icon: Icon(Icons.play_arrow),
                            onPressed: !state.tapeLoaded ? null : () async {
                              context.read<C64Bloc>().add(PlayTapeRequested());
                            })
                      ],
                    ),
                    RawImage(
                      image: state.image, scale: 0.5),
                ],
              ));
            }
...
As usual, we will add LoadTapeRequested and PlayTapeRequested in c64_event.dart.

Now we need to listen for each of these events within our Bloc class:

    on<PlayTapeRequested>((event, emit) {
      _tape.playTape();
    });

    on<LoadTapeRequested>((event, emit) async {
      final result = await FilePicker.platform.pickFiles(
        withData: true,
        type: FileType.custom,
        allowedExtensions: ['tap', 't64'],
      );

      if (result == null) return;
      tapeLoaded = true;
      _tape.setTapeImage(result.files.single.bytes!);
    });

For PlayTapeRequested we simulate the press of a play button. _tape is an instance of a class Tape, which we will define later.

With the LoadTapeRequested event, we basically present a file dialog where the user selects the tape image from the local file system, and we then pass its contents to the _tape instance.

The Tape Class

Let us start to implement the Tape class, which will emulate the functionality of Tape loading.

We start with a simple class:
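class Tape implements TapeMemoryInterface {
  late Iterator<int> _tapeImage;
  bool _playSelected = false;
  // Motor state as last written to bit 5 of memory location 1.
  bool _currentMotorOn = false;
  // CPU cycles left in the current pulse.
  int _remainingPulseTicks = 0;
  Alarms alarms;
  TapeInterrupt interrupt;
  Alarm? _tapeAlarm;

  Tape({required this.alarms, required this.interrupt});
}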
Before we go into detail on how to implement this class, let us take a step back and think about how Tape loading works on a C64.

On the physical tapes you used back in the day to load games on a C64, the data was stored as pulses of varying lengths. Simplified, it boils down to two types of pulses, a short pulse and a long pulse, which correspond to either a 0 or a 1, in other words a bit. The most basic element of data on a computer 😀.

Now, consider how data is loaded from a physical tape on a C64. The end of a pulse is indicated when the signal changes polarity from positive to negative or vice versa. This change of polarity causes an interrupt on the CPU via the FLAG pin on CIA1. The tape loading routines inside the Kernal ROM use one of the CIA timers to measure the pulse widths and decide, based on that, whether each bit is a zero or a one.

The tape image files of old games that you can download from the Internet are a sequence of pulse widths. With all this info at hand, it is starting to become apparent what the Tape class should do. Using these pulse widths, it should schedule an alarm for each pulse, the same structure we used previously within the CIA, and trigger an interrupt when it lapses. The private fields I defined above in the Tape class also hint at this.

Let us have a look at the variable _tapeImage. It is of type Iterator. With this data structure we can iterate through the tape image pulse width by pulse width, without having to maintain a counter that needs updating every time.

At this point we are ready to implement the method setTapeImage(), which we mentioned previously:

  setTapeImage(type_data.Uint8List tapeData) {
    _tapeImage = tapeData.iterator;
    for (var i = 0; i < 21; i++) {
      _tapeImage.moveNext();
    }
    populateRemainingPulses();
  }

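A Uint8List provides you with an Iterator. In a TAP image, the actual pulse width data starts after a 20-byte header. Because an iterator starts out positioned before the first element, the 21 calls to moveNext() leave it on the first pulse byte.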

Once we are at the actual pulse width data, we need to know the width of the first pulse. This is the function of the method populateRemainingPulses() :

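  populateRemainingPulses() {
    var val = _tapeImage.current;
    if (val != 0) {
      // Normal case: one byte holds the pulse width in units of 8 CPU cycles.
      _remainingPulseTicks = val << 3;
      _tapeImage.moveNext();
    } else {
      // A zero byte is only a marker: step past it, then read the next three
      // bytes as a 24-bit pulse width in CPU cycles, low byte first.
      _tapeImage.moveNext();
      var byte0 = _tapeImage.current;
      _tapeImage.moveNext();
      var byte1 = _tapeImage.current;
      _tapeImage.moveNext();
      var byte2 = _tapeImage.current;
      _tapeImage.moveNext();
      _remainingPulseTicks = (byte2 << 16) | (byte1 << 8) | byte0;
    }
  }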

Here we need to understand the TAP format a bit better. Usually every byte indicates one pulse width. We then need to multiply this value by 8 to get to the width in CPU clock cycles.

The exception to the rule is when the byte value is zero. Then the next three bytes indicate the pulse width as an absolute number of CPU clock cycles, low byte first, i.e. no multiplication by 8 is necessary.
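To make the two cases concrete, here is a small standalone sketch that decodes a run of TAP data bytes into pulse widths in CPU cycles. The function name decodePulseWidths and the sample bytes are made up for illustration; it is not part of the Tape class, and it assumes the zero-byte escape works exactly as described above.

```dart
/// Hypothetical helper, not part of the emulator: converts raw TAP data bytes
/// (everything after the header) into pulse widths in CPU clock cycles.
List<int> decodePulseWidths(List<int> data) {
  final widths = <int>[];
  var i = 0;
  while (i < data.length) {
    final value = data[i];
    if (value != 0) {
      // Normal case: the byte counts units of 8 CPU cycles.
      widths.add(value * 8);
      i += 1;
    } else {
      // Escape case: a zero byte followed by a 24-bit count, low byte first.
      if (i + 3 >= data.length) break; // truncated image, stop decoding
      widths.add(data[i + 1] | (data[i + 2] << 8) | (data[i + 3] << 16));
      i += 4;
    }
  }
  return widths;
}
```

For example, the bytes 0x30, 0x00, 0x10, 0x27, 0x00, 0x42 decode to pulse widths of 384, 10000 and 528 cycles.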

You will see that I am assigning the calculated value to a private variable _remainingPulseTicks. We are following a similar approach here to the one we used for the timers in the CIA in the previous post. It functions almost like a countdown timer, and is updated by the alarm subsystem.

At this point a key question is: what kicks off the tape loading process? The answer lies in memory location 1 of the C64 memory. This location is well known for switching banks of memory in and out of view. However, it also hosts two bits for tape control:
  • Bit 4 - Cassette Switch Sense; 0 = a button such as Play is pressed, 1 = no button pressed
  • Bit 5 - Cassette Motor Control; 0 = On, 1 = Off
The key here is bit 5, turning the Cassette motor on and off, which acts as the starting point for the tape loading process. Bit 4 tells us when the user presses the play button, which we will cover later.
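As a quick illustration of these two bits, the sketch below pulls them out of a byte with plain bit masks. The constant and function names are invented for this example, and the sense polarity (0 when Play is pressed) is an assumption that follows the emulator code in this post.

```dart
/// Bit masks for memory location 1; the names are made up for this sketch.
const int cassetteSenseBit = 0x10; // bit 4
const int cassetteMotorBit = 0x20; // bit 5

/// The motor runs when bit 5 is 0.
bool isMotorOn(int portValue) => (portValue & cassetteMotorBit) == 0;

/// Replaces bit 4 of the stored byte with the live state of the play button.
/// Assumption: a pressed button reads as 0, a released one as 1.
int withCassetteSense(int storedValue, bool playPressed) =>
    (storedValue & ~cassetteSenseBit & 0xff) |
    (playPressed ? 0 : cassetteSenseBit);
```

With the value 0x37 in location 1 the motor is off, while 0x17 switches it on.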

With this in mind, let us create the following method in our Tape class:

  @override
  setMotor(bool on) {
    if (on == _currentMotorOn) {
      return;
    }
    _currentMotorOn = on;
    if (on) {
      setupAlarms();
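    } else {
      // Pause: take the alarm out of the queue and remember how many ticks
      // of the current pulse are still left.
      _tapeAlarm!.unlink();
      _remainingPulseTicks = _tapeAlarm!.getRemainingTicks();
    }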
  }
This method will be invoked when we write to memory via our Memory class. We will deal with this plumbing later.

If the motor is switched on, we need to set up alarms. This is similar to what we did with timers in the previous post. Before we move on to the implementation of setupAlarms(), let's have a look at what happens in the else branch, when the motor is switched off. In that case we unlink the alarm from the list of alarms, and we set _remainingPulseTicks to the remaining ticks of the pulse. This caters for the case where the motor is resumed, so that we can carry on from where we left off in the pulse.

Now, let us look at setupAlarms():

  setupAlarms() {
    _tapeAlarm ??= alarms.addAlarm( (remaining) => processTapeAlarm(remaining));
    if (_tapeAlarm!.list == null) {
      alarms.reAddAlarm(_tapeAlarm!);
    }
    _tapeAlarm!.setTicks(_remainingPulseTicks);
  }

Here we see the actual use of _remainingPulseTicks, when the motor is resumed.
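The pause and resume bookkeeping can be tried out in isolation with a small sketch. The class PausableCountdown below is hypothetical and not part of the emulator; it only models how the remaining ticks are carried across a motor stop, with the current CPU cycle count passed in by the caller.

```dart
/// Hypothetical model of a countdown that can be paused and resumed.
class PausableCountdown {
  int _remaining; // ticks left while paused
  int _target = 0; // absolute CPU cycle at which the countdown expires
  bool running = false;

  PausableCountdown(this._remaining);

  /// Motor on: schedule expiry relative to the current cycle count.
  void resume(int now) {
    _target = now + _remaining;
    running = true;
  }

  /// Motor off: remember how far we still had to go.
  void pause(int now) {
    _remaining = _target - now;
    running = false;
  }

  int remainingAt(int now) => running ? _target - now : _remaining;
}
```

A 384-cycle pulse that is paused after 100 cycles still has 284 cycles to go whenever the motor is switched back on.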

Let us now have a look at the method processTapeAlarm() :

  processTapeAlarm(int remaining) {
    interrupt.triggerInterrupt();
    populateRemainingPulses();
    _tapeAlarm!.setTicks(_remainingPulseTicks + remaining);
  }
This method is called when the pulse has expired. When that happens, we trigger an interrupt and schedule the alarm for the next pulse.

Finally, there is one remaining method we need to implement:

  @override
  int getCassetteSense() {
    return _playSelected ? 0 : 0x10;
  }

This basically provides bit 4 of memory location 1, which will be used by our memory class. More on this later.
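The playTape() method that the Bloc calls is not shown in this post. A minimal sketch of how it could tie in with getCassetteSense() follows; the class name is made up and the body of playTape() is an assumption, not necessarily the actual implementation.

```dart
/// Hypothetical stand-in for the play button handling in the Tape class.
class PlayButtonSketch {
  bool _playSelected = false;

  /// Assumed behaviour: pressing Play simply latches the button state.
  void playTape() {
    _playSelected = true;
  }

  /// Same logic as in the post: bit 4 reads 0 once Play is pressed.
  int getCassetteSense() => _playSelected ? 0 : 0x10;
}
```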

Changes to the CIA class

Let us now have a look at the changes required in our CIA class.

There are quite a few changes, so I will just cover them at a high level.

First of all, we need to implement TimerB as well. The tape loading routine in the Kernal ROM uses this timer quite extensively. All I will say here is that it is basically a copy-and-paste exercise from TimerA.

Next, we will look at the method hasInterrupts(), which is used by our CPU class to trigger an interrupt:

  hasInterrupts() {
    if (timerAintOccurred && timerAinterruptEnabled) {
      return true;
    } else if (timerBintOccurred && timerBinterruptEnabled) {
      return true;
    } else if (tapeInterruptOccurred && tapeInterruptEnabled) {
      return true;
    } else {
      return false;
    }
  }

You will notice that I have included timerB interrupts and tape interrupts in the check as well.

Next, let us look at the setMem() function in the CIA class:

  setMem(int address, int value) {
...
      case 0xD:
        if ((value & 0x80) != 0) {
          timerAinterruptEnabled = ((value & 1) == 1) ? true : timerAinterruptEnabled;
        } else {
          timerAinterruptEnabled = ((value & 1) == 1) ? false : timerAinterruptEnabled;
        }
        if ((value & 0x80) != 0) {
          timerBinterruptEnabled = ((value & 2) == 2) ? true : timerBinterruptEnabled;
        } else {
          timerBinterruptEnabled = ((value & 2) == 2) ? false : timerBinterruptEnabled;
        }
        if ((value & 0x80) != 0) {
          tapeInterruptEnabled = ((value & 16) == 16) ? true : tapeInterruptEnabled;
        } else {
          tapeInterruptEnabled = ((value & 16) == 16) ? false : tapeInterruptEnabled;
        }
...
  }

As you might remember, register D in the CIA is the interrupt mask register. Here we have added timerB and the tape interrupt as interrupts we can enable or mask out.
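The three if/else pairs all follow the same set/clear convention, which can be summarised in one function. This is only a sketch: applyMaskWrite is a made-up name, and it keeps the mask as a single integer instead of the separate booleans used in the CIA class.

```dart
/// Sketch of the set/clear convention for the interrupt mask register.
/// [mask] is the current 5-bit mask, [value] the byte written to register 0xD.
int applyMaskWrite(int mask, int value) {
  final selected = value & 0x1f; // bits 0-4 pick the interrupt sources
  if ((value & 0x80) != 0) {
    return mask | selected; // bit 7 = 1: enable the selected sources
  }
  return mask & ~selected & 0x1f; // bit 7 = 0: disable the selected sources
}
```

Writing 0x81 enables timer A, writing 0x90 additionally enables the tape interrupt, and writing 0x7f disables everything.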

Finally, let us look at the getMem() method:

  int getMem(int address) {
...
  case 0xD:
        var value = 0;
        if (timerAintOccurred) {
          timerAintOccurred = false;
          value = value | 0x81;
        }
        if (timerBintOccurred) {
          timerBintOccurred = false;
          value = value | 0x82;
        }
        if (tapeInterruptOccurred) {
          tapeInterruptOccurred = false;
          // The FLAG (tape) interrupt is bit 4, matching the mask bit used in setMem().
          value = value | 0x90;
        }
        return value;
    }
...
}

Here we are reading the same register as earlier, but reading doesn't return the masks; it returns the interrupts that actually occurred. Once again, we have added timerB and the tape interrupt. Once this register has been read, we also clear all occurred interrupts.
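The read-and-clear behaviour can be modelled compactly as follows. This is a hedged sketch rather than the emulator's code: InterruptFlags is a hypothetical class, it uses the bit layout as I understand the 6526 (timer A = bit 0, timer B = bit 1, FLAG = bit 4), and it only sets bit 7 when an enabled source has fired, which is slightly stricter than the getMem() shown above.

```dart
/// Hypothetical model of the interrupt data side of register 0xD.
class InterruptFlags {
  int _pending = 0; // bits 0-4: sources that have fired since the last read
  int mask = 0; // bits 0-4: sources allowed to interrupt the CPU

  /// Records that the source with the given bit value has fired.
  void raise(int sourceBit) {
    _pending |= sourceBit & 0x1f;
  }

  /// Returns the pending bits, with bit 7 set if an enabled source fired,
  /// and clears them, mirroring a read of register 0xD.
  int read() {
    final value = _pending | ((_pending & mask) != 0 ? 0x80 : 0);
    _pending = 0;
    return value;
  }
}
```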

Changes to the Memory Class

Let us now have a look at the changes required in our memory class, for implementing tape loading.

The first change is in the setMem() method:

  setMem(int value, int address ) {
    if ((address >> 8) == 0xDC) {
      cia1.setMem(address, value);
    } else if (address == 1) {
      _ram.setInt8(address, value);
      _tape.setMotor((value & 0x20) == 0 );
    } else {
      _ram.setInt8(address, value);
    }
  }
So, as mentioned earlier, bit 5 of memory location 1 controls the tape motor. Here we implement it, so that during a memory write to this location, we call setMotor appropriately.

Next, let us change the getMem() method:

  int getMem(int address) {
    _readCount++;
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else if ((address >> 8) == 0xDC ) {
      return cia1.getMem(address);
    } else if (address == 1) {
      var value = _ram.getUint8(address) & 0xef;
      return value | _tape.getCassetteSense();
    } else {
      return _ram.getUint8(address);
    }
  }

Here we add the Cassette sense bit when reading the byte from memory location one. As mentioned previously, the cassette sense bit indicates if we pressed the play button.

The results

With everything coded, let us see what the screen looks like when we spin up our emulator. At startup, our screen looks like this:


Notice we have two new icons at the top, a folder icon and a play button. We use the folder icon to locate the tape image file from our local file system. Once we have selected a tape image, the play button becomes enabled.

The play button actually resembles the play button on a real Datasette unit hooked up to a C64. So, when the screen showed "Press Play on tape" and you hit the play button, the loading process commenced.

Let's do the whole sequence. With the tape image attached, type LOAD at the flashing cursor, and then hit ENTER. Your screen will now look like this:

Now press the play button next to the folder button.

With the play button pressed, the following prompts will pop up:


After a number of seconds, the screen will look like this:


This is the hooray moment. When you see FOUND DAN DARE, or whatever the file name in your tape image is, you know you have implemented tape loading correctly.

One thing that immediately felt off when testing the tape loading was that it took much longer than usual before it showed "FOUND...". So I did some comparative benchmarks.

First, to get a realistic time, I measured how long it takes to find the file in the Vice C64 emulator. It was about 17 seconds.

Then I did the measurement in my Flutter emulator. In my Emulator, it took 24 seconds. Quite a lot slower!

I investigated in more depth. After lots of pain, I discovered that the speed issue was caused by not building the app in release mode. I had made the subtle assumption that if I start it from IntelliJ with the Play button and not with the Debug button, everything will be optimised. Was I wrong!

Let us see how to run our project in release mode. Firstly, open a terminal window and cd into your project folder. Then run the following command:

flutter build web --release
After the build is finished, you will find the result in build/web within the project. cd into this folder. We now need a web server to serve this, and the easiest one to use is Python's built-in one. So, within the build/web folder, run the following command:

python -m http.server 8000
Now access the emulator in the browser with http://localhost:8000/

This time around, our tape loading times match up with those of Vice.

While I was trying to figure out why my emulator was slow, I also discovered that memory usage was steadily climbing. When I fixed the speed issue with the release build, I wondered if the memory leak was also fixed. So, I left it running for about half an hour and then hovered over the tab in Chrome to see the memory usage. Sadly, the memory leak was still there:


If you leave it running longer, it will eventually go over 1 GB of memory usage.

In the next post we will tackle this issue.

In Summary

In this post we implemented Tape image loading. An unfortunate issue I encountered was a memory leak.

In the next post I will see if I can fix this memory leak.

Until next time!

Wednesday, 3 December 2025

A Commodore 64 Emulator in Flutter: Part 14

Foreword

In the previous post we managed to interface the keyboard to our C64 Flutter emulator. With that implemented, we were able to enter a simple Basic program into our emulator and run it.

Now, my ultimate goal for writing this emulator, is to be able to run the game Dan Dare in our emulator, loading it from a tape image.

So, to achieve this end goal, the next goal is for our emulator to be able to load a tape image. On a C64, loading from tape relies heavily on the features of a CIA (Complex Interface Adapter). The features tape loading relies on are access to the signal from the tape's read head, timers and interrupts.

Up to now we have been mimicking some of the features of a CIA chip. The address range of the CIA chip is DC00-DCFF. It immediately comes to mind that in the previous post we implemented two of the registers of the CIA, DC00 and DC01, for keyboard access.

We also implicitly implemented a timer and interrupts in our emulator, interrupting the CPU every 1/60 of a second, so that the cursor can flash and keyboard entry can work. However, we blindly forced these interrupts as a quick hack to get the cursor and keyboard to work. We didn't even consider the values set in the CIA for setting the timer.

However, to implement tape loading we would not be able to get away with a quick hack 😀 We will need to emulate the CIA properly for this purpose.

So, in this post we will implement CIA emulation bit by bit. This will include revisiting our current keyboard and timer interrupt implementation (i.e. the forced 1/60 second interrupt), and implementing it properly within the CIA emulation.

We will probably only get to tape emulation in the next post.

Enjoy!

Creating the CIA skeleton

Let's begin our journey by creating a CIA class as a skeleton. This class will evolve over time to contain all the CIA functionality we need:

class Cia1 {
  setMem(int address, int value) {
    print("setMem ${address.toRadixString(16)} ${value.toRadixString(16)}");
  }

  int getMem(int address) {
    print("getMem ${address.toRadixString(16)}");
    return 0;
  }
}
Here we do something interesting: we log every write to or read from the CIA address range. With this we can see which functionality is used, so we can implement just the bare minimum functionality of the CIA chip.

With this exercise we also want to disable the hard-coded interrupts happening every 1/60 second, to avoid any potential side effects during our CIA journey:

  step() {
/*
    if ((_cycles > 1000000) &&((_cycles % 16666) < 30) && (_i == 0)) {
      push(pc >> 8);
      push(pc & 0xff);
      push((_n << 7) | (_v << 6) | (2 << 4) | (_d << 3) | (_i << 2) | (_z << 1) | _c);
      _i = 1;
      pc = memory.getMem(0xfffe) | (memory.getMem(0xffff) << 8);
    }
*/
    var opCode = memory.getMem(pc);
    pc++;
    var insLen = CpuTables.instructionLen[opCode];
    ...
  }
Next we need to make an instance of this class and inject it into our Memory class:

  C64Bloc() : super(InitialState()) {
    memory.setKeyInfo(this);
    on<InitEmulatorEvent>((event, emit) async {
      final basicData = await rootBundle.load("assets/basic.bin");
      final characterData = await rootBundle.load("assets/characters.bin");
      final kernalData = await rootBundle.load("assets/kernal.bin");
      Cia1 cia1 = Cia1();
      memory.setCia1(cia1);
      ...
    }
    ...
  }
We modify the actual Memory class like this:

class Memory {
...
  late final Cia1 cia1;
...
  setCia1(Cia1 cia1) {
    this.cia1 = cia1;
  }
...
  setMem(int value, int address ) {
    if ((address >> 8) == 0xDC) {
      cia1.setMem(address, value);
    } else {
      _ram.setInt8(address, value);
    }
  }

  int getMem(int address) {
    _readCount++;
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else if ((address >> 8) == 0xDC ) {
      return cia1.getMem(address);

    /*else if (address == 0xDC01) {
      return keyInfo.getKeyInfo(_ram.getUint8(0xDC00));*/
    } else {
      return _ram.getUint8(address);
    }
  }
}

So, every time an address starts with DC we send this access to the CIA instance. You will also see that I have commented out the explicit access to the DC01 register, which we added in the previous post for keyboard access. We will implement this functionality at a later stage into our CIA class.

Now, let us start the emulator and watch the log output:

setMem dc0d 7f
setMem dc00 7f
setMem dc0e 8
setMem dc0f 8
setMem dc03 0
setMem dc02 ff
setMem dc04 95
setMem dc05 42
setMem dc0d 81
getMem dc0e
setMem dc0e 11
setMem dc04 25
setMem dc05 40
setMem dc0d 81
getMem dc0e
setMem dc0e 11
So, let us quickly see what is going on here. With the first write to DC0D, we disable all interrupts going to the CPU.

The write to address DC00 is for the keyboard stuff, which we don't worry about at the moment.

Next we see the value 8 written to registers DC0E and DC0F. This puts timers A and B in one-shot mode.

Let's skip a couple of memory writes and get to the writes to locations DC04 and DC05. These are the registers for setting the duration of timer A. Each count is one tick of the 1MHz clock which also drives the 6510. DC04 is the low byte of the value and DC05 the high byte. So, this is 4295 hexadecimal, which translates to 17045 decimal, which is close to the hard-coded count we used previously for triggering an interrupt every 1/60 second.
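The arithmetic is easy to check in code. The two helpers below are made up for this illustration, and the 1 MHz clock is the nominal figure from the text; a real machine runs slightly off that, so treat the resulting rate as approximate.

```dart
/// Combines the two latch bytes written to DC04 (low) and DC05 (high).
int timerLatch(int low, int high) => (low & 0xff) | ((high & 0xff) << 8);

/// Interrupts per second for a timer that reloads from [latch], assuming a
/// nominal clock of [clockHz] cycles per second.
double interruptsPerSecond(int latch, {int clockHz = 1000000}) =>
    clockHz / latch;
```

Here timerLatch(0x95, 0x42) gives 17045, and at a nominal 1 MHz that works out to a bit under 59 interrupts per second, so roughly the 1/60 second tick.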

Next, we do an assignment to location DC0D. This is the interrupt register. We see in the value assigned that the least significant bit is set. This is the bit that controls interrupts from timer A. We also see that the most significant bit is set. If this bit is a 1, it means: enable every interrupt whose bit is a one in the written byte. So, in this case we have enabled interrupts from timer A.

Finally, we see the value 11 hexadecimal assigned to register DC0E, which happens twice in the log. With this assignment, two things happen. Firstly, bit 4 is set, which means force-load the timer from the latch, which in our case holds the hex value 4295. Secondly, bit 0 is set, which finally starts timer A.

Something else is also happening subtly. Previously I mentioned that we set bit 3 to 1, meaning the timer was in one-shot mode. Now, however, we are setting this bit to zero, which means that timer A will operate in continuous mode. This means that after the timer has lapsed it will automatically restart, so we get periodic interrupts every 1/60 second.
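The three control bits discussed so far can be decoded with a tiny helper. ControlA is a hypothetical class used only for this illustration; it covers just bits 0, 3 and 4 and ignores the remaining bits of the register.

```dart
/// Hypothetical decoded view of a byte written to control register A (DC0E).
class ControlA {
  final bool start; // bit 0: 1 = start the timer, 0 = stop it
  final bool oneShot; // bit 3: 1 = one-shot, 0 = continuous
  final bool forceLoad; // bit 4: 1 = copy the latch into the counter now

  ControlA(int value)
      : start = (value & 0x01) != 0,
        oneShot = (value & 0x08) != 0,
        forceLoad = (value & 0x10) != 0;
}
```

Decoding the two values from the log, 0x08 gives a stopped timer in one-shot mode, while 0x11 gives a started, force-loaded timer in continuous mode.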

Implementing the Alarm System

With the skeleton implemented for the CIA chip, we should start implementing some meat for it. We will start with timer A. 

Now, timer A is very reliant on the number of cycles the CPU has executed. There are other operations that are also dependent on the number of CPU cycles executed, like tape loading, drawing pixels at the right moment on the screen, and SID sound generation.

I have written a number of C64 emulators in other programming languages. I must admit that in all these emulators I would perform all the operations that depend on CPU cycles after every single CPU instruction executed. In the beginning, when I had just added timers or tape interrupts, I didn't really see issues.

However, as I added more of these cycle-dependent operations, I saw performance gradually worsening, especially when I added more of the VIC-II operations.

Now, what I experienced isn't really something new. There is actually a computer science term for a technique that addresses this issue: loop fission. The Wikipedia article on loop fission explains a bit more about it.

Basically, when you do a lot of things in a single loop iteration, one issue that pops up is that you get more cache misses, and your CPU needs to fetch data from slower RAM more often. Splitting the loop into separate loops should reduce cache misses and therefore improve performance.
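As a toy illustration of the transformation itself, the two functions below compute the same result, once with a single fused loop and once with the loop split in two. The names and data are invented, and whether the split version is actually faster depends on the workload, so take this as a sketch of the shape of the change rather than a benchmark.

```dart
/// Fused version: one loop does two unrelated jobs per iteration.
/// Assumes both lists have the same length.
List<int> fusedLoop(List<int> timers, List<int> pixels) {
  var timerSum = 0;
  var pixelSum = 0;
  for (var i = 0; i < timers.length; i++) {
    timerSum += timers[i];
    pixelSum += pixels[i] * 2;
  }
  return [timerSum, pixelSum];
}

/// After loop fission: each loop walks a single list.
List<int> splitLoops(List<int> timers, List<int> pixels) {
  var timerSum = 0;
  for (var i = 0; i < timers.length; i++) {
    timerSum += timers[i];
  }
  var pixelSum = 0;
  for (var i = 0; i < pixels.length; i++) {
    pixelSum += pixels[i] * 2;
  }
  return [timerSum, pixelSum];
}
```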

I dug a bit into the source code of the Vice emulator, and overall they also try to break things into separate loops. They have a whole concept of alarms. For instance, every VIC-II scan line is 63 cycles. So, instead of rendering a bit of a line after each CPU instruction, they set an alarm that will trigger 63 cycles into the future. After every CPU instruction, the emulator checks whether the 63 cycles have passed. Only once they have passed does an alarm handler execute and render the full line.

Of course, during the course of the 63 cycles something might change, like the border color, in which case the line will show more than one border color. In such cases, when writing to such a register, one should keep a record of when the color changed.

Let's start to create an alarm subsystem for our emulator. We start with a brief outline:

class Alarms {
  final LinkedList<Alarm> _alarmList = LinkedList<Alarm>();

  Alarms();

  Alarm addAlarm(Function(int remainder) callback) {
    var alarm = Alarm._(this, callback);
    _alarmList.add(alarm);
    return alarm;
  }

}
So, here we have a class containing all our alarms. Internally all the alarms are stored in a linked list, which is a data structure provided by Dart. We will revisit this in a while.

There is also a method for adding an alarm with a callback, so that when the alarm has expired the callback can be called to do some work. The remainder parameter indicates by how many cycles we have gone over the alarm threshold by the time a CPU instruction has finished executing.

Let's now focus a bit on the LinkedList story. So, we have a declaration LinkedList<Alarm>(). LinkedList is one of the built-in classes from Dart's dart:collection library. It is a generic class, so you need to give it a type when you make an instance. In this case we are saying we will have a LinkedList containing instances of Alarm.

Now usually with generics, you can define the element type in any way you want. However, with a LinkedList things are a bit more tricky, because every node needs to point to the next and previous node. This is just how a LinkedList is implemented.

Luckily you don't need to worry about implementing all this yourself. You can just let our Alarm class extend LinkedListEntry, and all this will happen automatically:

final class Alarm extends LinkedListEntry<Alarm> {
  late final Alarms _alarms;
  late final Function(int remainder) _callback;

  Alarm._(Alarms alarms, Function(int remainder) callback ) {
    _alarms = alarms;
    _callback = callback;
  }

}

Let us now add some more meat to our alarm class:

final class Alarm extends LinkedListEntry<Alarm> {
  var _targetClock = 0;
...
  setTicks(int ticks) {
    _targetClock = _alarms.getCurrentCpuCount() + ticks;
  }

  getRemainingTicks() {
    return _targetClock - _alarms.getCurrentCpuCount();
  }

  getTargetClock() {
    return _targetClock;
  }

  processAlarm(int remainder) {
    _callback(remainder);
  }
}
Basically I have added some methods for keeping track of how far we are from triggering an alarm. The processAlarm method will be invoked when the alarm is triggered.

Now, let us add some meat to our Alarms class:

class Alarms {
  final LinkedList<Alarm> _alarmList = LinkedList<Alarm>();
  int _cpuCount = 0;

  Alarms();

  Alarm addAlarm(Function(int remainder) callback) {
    var alarm = Alarm._(this, callback);
    _alarmList.add(alarm);
    return alarm;
  }

  reAddAlarm(Alarm alarm) {
    _alarmList.add(alarm);
  }

  int getCurrentCpuCount() {
    return _cpuCount;
  }

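  processAlarms(int cpuCycles) {
    _cpuCount = cpuCycles;
    // Walk the list by hand and fetch the successor first: a callback may
    // unlink its own alarm, which a for-in loop over the list does not allow.
    Alarm? item = _alarmList.isEmpty ? null : _alarmList.first;
    while (item != null) {
      final next = item.next;
      if (item.getRemainingTicks() <= 0) {
        item.processAlarm(item.getRemainingTicks());
      }
      item = next;
    }
  }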
}
The key method added here is processAlarms(). This method loops through the alarms, checks which have expired, and calls their callbacks.

Another interesting method is reAddAlarm(). It will often happen that we stop a timer, at which point we remove it from the alarm queue so that it isn't triggered again. However, there might be a case where we want to start the timer again, in which case we use reAddAlarm() to add it back to the queue so that it is evaluated for expiry again.

Wiring everything together

With all the building blocks created in the previous section, let's now put them together. In C64Bloc let us do some initialisation:

class C64Bloc extends Bloc<C64Event, C64State> implements KeyInfo {
  final Memory memory = Memory();
  final List<int> matrix = [0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff];
  final FocusNode focusNode = FocusNode();
  late final Cpu _cpu = Cpu(memory: memory);
  late final Alarms alarms = Alarms();
  type_data.ByteData image = type_data.ByteData(200*200*4);
  int dumpNo = 0;
  int frameNo = 0;
  Timer? timer;
...
  C64Bloc() : super(InitialState()) {
    on<InitEmulatorEvent>((event, emit) async {
      final basicData = await rootBundle.load("assets/basic.bin");
      final characterData = await rootBundle.load("assets/characters.bin");
      final kernalData = await rootBundle.load("assets/kernal.bin");
      Cia1 cia1 = Cia1(alarms: alarms);
      cia1.setKeyInfo(this);
      memory.setCia1(cia1);
      memory.populateMem(basicData, characterData, kernalData);
      _cpu.setInterruptCallback(() => cia1.hasInterrupts());
...
  }
...
}
I have added a field for our alarms. I am also now injecting a Cia1 instance into our memory.

In our CPU class we now also use an interrupt callback, which the CPU class will call to see if any interrupts have occurred. Our Cia1 instance will provide this info.

In our main event processing loop, we also make a small change:

    on<RunEvent>((event, emit) {
      timer = Timer.periodic(const Duration(milliseconds: 17), (timer) {
          int start = DateTime.now().millisecondsSinceEpoch;
          int targetCycles = _cpu.getCycles() + 16666;
          do {
            _cpu.step();
            alarms.processAlarms(_cpu.getCycles());
          } while (_cpu.getCycles() < targetCycles);
...
});
    });

After every CPU instruction we process the alarms with the current CPU cycle count.

Expanding the CIA1 class

Earlier we created a skeleton for the CIA1 class. We will now expand this class further.

As usual we start with some initialisation:


class Cia1 {
  int timerAlatchLow = 0xff;
  int timerAlatchHigh = 0xff;
  int timerAvalue = 0xffff;
  Alarms alarms;
  Alarm? timerAalarm;
  bool timerAstarted = false;
  bool timerAoneshot = false;
  int registerE = 0;
  int register0 = 0;
  bool timerAinterruptEnabled = false;
  bool timerAintOccurred = false;
  late final KeyInfo keyInfo;


  Cia1({required this.alarms});

  setKeyInfo(KeyInfo keyInfo) {
    this.keyInfo = keyInfo;
  }
 ...
}
The meaning of these fields will become clear in a bit.

Next, let us implement the following method:

  updateTimerA() {
    if (!timerAstarted) {
       return;
    }
    if (timerAalarm != null) {
      timerAvalue = timerAalarm!.getRemainingTicks();
    }

  }

timerAvalue is the value of countdown timer A in the CIA. To improve locality, we don't update this value with the execution of every CPU instruction. Instead, we wrote this method, which updates the value when the CPU reads the value of this register.

Next we add these methods:

  hasInterrupts() {
    if (timerAintOccurred && timerAinterruptEnabled) {
      return true;
    } else {
      return false;
    }
  }

  processTimerAalarm(int remaining) {
    // Do interrupt
    timerAintOccurred = true;
    if (timerAoneshot) {
      timerAalarm?.unlink();
      timerAstarted = false;
      return;
    }
    timerAalarm!.setTicks((timerAlatchLow | (timerAlatchHigh << 8)) + remaining);
  }

Here we deal with the timer expiring and flag the interrupt. If the timer is in one-shot mode, we remove it from the alarm list. Otherwise, we reload it from the latch and schedule it to run again.

Finally, let us add methods for reading and writing to the CIA registers:

  setMem(int address, int value) {
    print("setMem ${address.toRadixString(16)} ${value.toRadixString(16)}");
    value = value & 0xff;
    address = address & 0xf;
    switch (address) {
      case 0x0:
        register0 = value;
      case 0x4:
        timerAlatchLow = value;
      case 0x5:
        timerAlatchHigh = value;
      case 0xD:
        if ((value & 0x80) != 0) {
          timerAinterruptEnabled = ((value & 1) == 1) ? true : timerAinterruptEnabled;
        } else {
          timerAinterruptEnabled = ((value & 1) == 1) ? false : timerAinterruptEnabled;
        }
      case 0xE:
        var startTimerA = ((value & 1) == 1) ? true : false;
        var forceTimerA = ((value & 16) != 0) ? true : false;
        updateTimerA();
        if (forceTimerA) {
          timerAvalue = timerAlatchLow | (timerAlatchHigh << 8);
        }
        var startingTimerA = startTimerA & !timerAstarted;
        var stoppingTimerA = !startTimerA & timerAstarted;
        var alreadyRunningTimerA = startTimerA && timerAstarted;
        if (startingTimerA || (alreadyRunningTimerA && forceTimerA)) {
          // schedule timer on alarm
          timerAalarm ??= alarms.addAlarm( (remaining) => processTimerAalarm(remaining));
          if (timerAalarm!.list == null) {
            alarms.reAddAlarm(timerAalarm!);
          }
          timerAalarm!.setTicks(timerAvalue);
          // set timer as started
        } else if (stoppingTimerA) {
          //unschedule timer A
          timerAalarm!.unlink();
        }
        timerAoneshot = (value & 8) != 0;
        timerAstarted = startTimerA;
        registerE = value;
      default:
        // throw "Not implemented";
    }

  }

  int getMem(int address) {
    print("getMem ${address.toRadixString(16)}");
    updateTimerA();
    address = address & 0xf;
    switch (address) {
      case 0x0:
        return register0;
      case 0x1:
        return keyInfo.getKeyInfo(register0);
      case 0x4:
        return timerAvalue & 0xff;
      case 0x5:
        return timerAvalue >> 8;
      case 0xD:
        if (timerAintOccurred) {
          timerAintOccurred = false;
          return 0x81;
        } else {
          return 0;
        }
      case 0xE:
        var result = registerE & 0x06;
        result = result | (timerAstarted ? 1 : 0);
        result = result | (timerAoneshot ? 8 : 0);
        return result;
    }
    return 255;
  }

You will see that each time we read from the CIA1 we update the timer. In the write function we also adjust the alarms accordingly if we change the state of the timers.
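
The write to register D deserves a closer look, because bit 7 of the written value decides whether the other bits enable or disable interrupt sources. Our class only tracks timer A, but the same rule applies to all five sources of the CIA. Here is a small sketch of that rule, with applyIcrWrite being a made-up helper name and not something from our Cia1 class:

```dart
// Apply a write to the interrupt control register to the current 5-bit
// interrupt mask. Bit 7 of the written value selects set or clear.
int applyIcrWrite(int mask, int value) {
  final bits = value & 0x1f; // the five interrupt sources
  if ((value & 0x80) != 0) {
    return mask | bits; // bit 7 set: enable the sources whose bit is 1
  }
  return mask & ~bits & 0x1f; // bit 7 clear: disable the sources whose bit is 1
}

void main() {
  var mask = 0;
  mask = applyIcrWrite(mask, 0x81); // enable timer A
  print(mask.toRadixString(16)); // 1
  mask = applyIcrWrite(mask, 0x82); // enable timer B as well
  print(mask.toRadixString(16)); // 3
  mask = applyIcrWrite(mask, 0x01); // disable timer A, timer B untouched
  print(mask.toRadixString(16)); // 2
}
```

Bits that are zero in the written value leave the corresponding mask bit unchanged, which is exactly what the two ternaries in setMem do for timer A.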

Changes to the CPU class

There is finally just a small change we need to make to our CPU. Previously, in our CPU we hardwired an interrupt that happened every 1/60th of a second. However, now that we have implemented a CIA class, we need to change how interrupts work.

Here are the changes:


class Cpu {
...
  late final Function() _interruptCallback;
...
  setInterruptCallback(Function() callback) {
    _interruptCallback = callback;
  }
...
  step() {
    if (_interruptCallback() & (_i == 0)) {
      push(pc >> 8);
      push(pc & 0xff);
      push((_n << 7) | (_v << 6) | (2 << 4) | (_d << 3) | (_i << 2) | (_z << 1) | _c);
      _i = 1;
      pc = memory.getMem(0xfffe) | (memory.getMem(0xffff) << 8);
    }
...
  }
...
}
Now we call the interrupt callback, which basically ties back to the CIA1 class we created. Also, we invoke an interrupt only when the interrupt disable flag is not set.
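
The byte pushed last in step() is the status register, assembled from the individual flag variables. As a standalone sketch, with packStatus being a hypothetical helper and the flags passed in as parameters instead of class fields, the packing looks like this:

```dart
// Pack the individual flags into the status byte that is pushed when an
// interrupt is taken. The term (2 << 4) sets bit 5 and leaves bit 4, the
// break flag, clear.
int packStatus({
  required int n,
  required int v,
  required int d,
  required int i,
  required int z,
  required int c,
}) {
  return (n << 7) | (v << 6) | (2 << 4) | (d << 3) | (i << 2) | (z << 1) | c;
}

void main() {
  final p = packStatus(n: 1, v: 0, d: 0, i: 0, z: 1, c: 1);
  print(p.toRadixString(2).padLeft(8, '0')); // 10100011
}
```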

In Summary

In this post we introduced the CIA as a separate class. We also removed the hardcoded mechanism which triggered an interrupt every 1/60th of a second, and instead let the CIA schedule the interrupts as programmed by machine language code.

In the next post we will start to implement tape loading from a raw tape image.

Until next time!

Thursday, 22 May 2025

A Commodore 64 Emulator in Flutter: Part 13

Foreword

In the previous post we managed to boot the C64 system with a screen showing the contents of screen memory in real time. It booted with the welcome message and a flashing cursor.

In this post we will provide some keyboard interfacing for our C64 emulator. We will approach this in a very experimental fashion, exploring how Flutter itself works with keyboard interfacing in an app. Then we will see if we can get keyboard interfacing to work in our app, and finally see if our emulator can work with the keyboard.

Enjoy!

KeyboardListener in Flutter

What we want for our emulator is basically to tell when a key is held down, and when it is released. Flutter provides this for us via a KeyboardListener. From the Flutter documentation it is not so clear how to use this, so I looked around for a worked example on the Internet and found the following:

https://medium.com/@wartelski/how-to-flutter-keyboard-events-keyboard-listener-in-flutter-web-0c36ab9654a9

The following snippet is the core of the example:

With this example we can basically catch when a key is down. Now, all is well in this example, except that we have a final variable for _focusNode. This is, however, something we can only do with a StatefulWidget. In our case we are within a StatelessWidget, where we cannot do such things.

In our case we will place the focusNode in our Bloc. Probably not the best place if one thinks about separation of concerns, but for now it is the best place if we want to keep a single instance of FocusNode alive. So, we make the following changes:

class C64Bloc extends Bloc<C64Event, C64State> {
  final Memory memory = Memory();
  final FocusNode focusNode = FocusNode();
...
}
And now we go further and wrap our RawImage in a KeyboardListener:

...
           } else if (state is RunningState) {
              return KeyboardListener(
                focusNode: context.read<C64Bloc>().focusNode,
                autofocus: true,
                onKeyEvent: (event) => {
                  if (event is KeyDownEvent) {
                    if (event.logicalKey == LogicalKeyboardKey.keyM) {
                      print("The m key is pressed!!")
                    }
                  } else if (event is KeyUpEvent) {
                    if (event.logicalKey == LogicalKeyboardKey.keyM) {
                      print("The m key is released!!")
                    }
                  }

                },
                child: RawImage(
                    image: state.image, scale: 0.5),
              );
            } else {
...
So, here we listen for the "M" key and write out to the console when this key is pressed and released.

Simulating a key press in our emulator

Next, let us see if we can simulate a key press in our emulator. To figure out how, let us dig a bit into how the keyboard is implemented in hardware.

Firstly, a keyboard is arranged as a matrix of rows and columns, and where a row and column meet, there is a key switch. If the switch is pushed, it will short the row to ground. Seeing if a switch is pressed is a two-step process. You need to energise each column in turn and see which rows are shorted to ground.

To get an idea of how the matrix of a C64 is arranged, the following diagram is helpful:

Now, the big question is which memory locations we need to manipulate and read to see which key was pressed.

The following web link provides us with a memory map which will aid in finding these memory locations:

Scrolling down, we eventually find the place where the keyboard is dealt with:


As you can see, both these ports are used by the joystick ports and the keyboard. The first piece of info that is useful to us is the following at memory location DC00:

  • Bit #x: 0 = Select keyboard matrix column #x.

So, this is actually where we energise one or more columns. In the matrix diagram, these are the parts labeled A - H. Each of these is assigned a bit number (0 - 7) in the byte we write to this port.

The next piece of useful info is at memory location DC01:

  • Bit #x: 0 = A key is currently being pressed in keyboard matrix row #x, in the column selected at memory address $DC00.

So, we select one or more columns in location DC00 and, within the selected columns, we can read via location DC01 which rows have a key pressed.

Let us now see how we can emulate a key press in our emulator. At this point we are able to catch keys from the keyboard with a KeyboardListener. In our KeyboardListener we can trigger events which we listen for in our Bloc.

First, let us define an event class which we will trigger:

class KeyC64Event extends C64Event {
  final bool keyDown;
  KeyC64Event({required this.keyDown});
}
So, we will either trigger an event with keyDown = true, when a key is pressed, or an event with keyDown = false, when a key is released.

With this in mind, let us modify our KeyboardListener:

            } else if (state is RunningState) {
              return KeyboardListener(
                focusNode: context.read<C64Bloc>().focusNode,
                autofocus: true,
                onKeyEvent: (event) => {
                  if (event is KeyDownEvent) {
                    if (event.logicalKey == LogicalKeyboardKey.keyM) {
                      context.read<C64Bloc>().add(KeyC64Event(keyDown: true))
                    }
                  } else if (event is KeyUpEvent) {
                    if (event.logicalKey == LogicalKeyboardKey.keyM) {
                      context.read<C64Bloc>().add(KeyC64Event(keyDown: false))
                    }
                  }

                },
                child: RawImage(
                    image: state.image, scale: 0.5),
              );
            } else {
Next, let us listen for these events in our Bloc:

class C64Bloc extends Bloc<C64Event, C64State> {
...
  bool keyDown = false;
...
  C64Bloc() : super(InitialState()) {
...
    on<KeyC64Event>((event, emit) {
      keyDown = event.keyDown;
    });
...
  }
...
}
So, within our Bloc, keyDown is a variable keeping track of whether the key is up or down, which in this case is the state of the M key on our keyboard. We will make use of this variable to simulate a key stroke in our emulator.

Now, the actual simulation of a key press should happen in our Memory class. When a read is done from address DC01, we should consider which column is enabled via address DC00, see if one of the keys in the enabled column is indeed held down, and send back a value that reflects this.

So, we have a situation here where Memory wants some info from the Bloc class in which it lives, but we don't want to provide Memory with all the state of the Bloc class. To achieve this, we need to create an interface with methods returning the info that Memory needs.

Here is the interface:

abstract class KeyInfo {
  int getKeyInfo(int column);
}
And now let us implement the interface in our Bloc:

class C64Bloc extends Bloc<C64Event, C64State> implements KeyInfo {
...
  @override
  int getKeyInfo(int column) {
  }
...
}
So, given the columns energised, we return the rows. Now, as an exercise, let's say that if we press the M key on the keyboard, which we currently check for in our KeyboardListener, we want our C64 emulator to also show an M.

So, let us look at the keyboard matrix diagram again to see where the M key is located. The M key is located at column E and row 4. With the bit counting starting at zero for column A, the bit number of column E is 4. So we are interested in column bit 4 and row bit 4.

With this in mind, let us give getKeyInfo() some meat:

  @override
  int getKeyInfo(int column ) {
    if (!keyDown) {
      return 0xff;
    }
    if ((column & 0x10) == 0) {
      return 0xef;
    } else {
      return 0xff;
    }
  }
One thing to remember here is that when working with the keyboard matrix, we don't work with the default assumption that one means active, but the other way around. So a zero in the column byte means that a certain column is energised, and a zero in the row byte means that the switch for that bit position is held down.
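
To make the inverted logic a bit more tangible, here is a small sketch showing where the magic numbers 0x10 and 0xef in getKeyInfo() come from. The helper names rowsToPortValue and isColumnSelected are made up for this illustration:

```dart
// Build the byte read back from the row port when the given rows have a
// key pressed. All ones means nothing is pressed.
int rowsToPortValue(List<int> pressedRows) {
  var value = 0xff;
  for (final row in pressedRows) {
    value &= ~(1 << row) & 0xff; // pull the bit for this row down to zero
  }
  return value;
}

// A column is selected when its bit in the column byte is zero.
bool isColumnSelected(int columnByte, int column) =>
    (columnByte & (1 << column)) == 0;

void main() {
  print(rowsToPortValue([4]).toRadixString(16)); // ef
  print(isColumnSelected(0xef, 4)); // true
  print(isColumnSelected(0xff, 4)); // false
}
```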

With all this written, let us make our Memory class make use of it:

class Memory {
...
  late final KeyInfo keyInfo;
...
  setKeyInfo(KeyInfo keyInfo) {
    this.keyInfo = keyInfo;
  }
...
}
So, we can pass our keyInfo object to our Memory class. We assign the keyInfo when our Bloc class is instantiated:
class C64Bloc extends Bloc<C64Event, C64State> implements KeyInfo {
...
  C64Bloc() : super(InitialState()) {
    memory.setKeyInfo(this);
...
  }
...
}
Finally, let us use keyInfo in our Memory class:
...
  int getMem(int address) {
    _readCount++;
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else if (address == 0xDC01) {
      return keyInfo.getKeyInfo(_ram.getUint8(0xDC00));
    } else {
      return _ram.getUint8(address);
    }
  }
...
So, when address DC01 is read from our Memory, we invoke getKeyInfo, passing it the contents of memory location DC00. At the moment we fetch location DC00 from RAM.

Now, when we build and run, and press the M key a couple of times, the screen looks as follows:

We managed to implement a simple key press!

Implementing the full keyboard

Let us now look at implementing a full keyboard, or at least sufficient keys, like the alphabet, digits and some symbols, just to type a simple basic program within our emulator.

Up to now we only kept track of whether a single key is down, via keyDown, but now we need to keep track of whether several keys are held down. So, we need a kind of boolean matrix, or to put it more plainly, an array of eight bytes. Each column is a byte:

  final List<int> matrix = [0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff];
I mentioned earlier that in a real C64, a zero means the key is selected. So, this array filled with 0xff values means no key is held down at the moment.

Previously in our Main class, we just looked for the M key being pressed and released, and then passed this event to our Bloc class. Obviously, we now need to remove this explicit check for the M key and pass all key events to our Bloc class. This requires us to modify our KeyC64Event class to say which key was pressed, not just whether a key was pressed:

class KeyC64Event extends C64Event {  
  final bool keyDown;
  final LogicalKeyboardKey key;
  KeyC64Event({required this.keyDown, required this.key});
}
With this in place our Bloc class will indeed receive a key code, but one that only makes sense in the Flutter world. We need a kind of lookup table or map to convert a Flutter key code to a C64 keyboard scan code. For this purpose we create the following map, preferably in a separate file:

Map<LogicalKeyboardKey, int> keyMap = Map.unmodifiable({
  LogicalKeyboardKey.keyA : 0x0A,
  LogicalKeyboardKey.keyB : 0x1C,
  LogicalKeyboardKey.keyC : 0x14,
  LogicalKeyboardKey.keyD : 0x12,
...
  LogicalKeyboardKey.digit0 : 0x23,
  LogicalKeyboardKey.digit1 : 0x38,
  LogicalKeyboardKey.digit2 : 0x3B,
  LogicalKeyboardKey.digit3 : 0x08,
  LogicalKeyboardKey.digit4 : 0x0B,
  LogicalKeyboardKey.digit5 : 0x10,
  LogicalKeyboardKey.digit6 : 0x13,
  LogicalKeyboardKey.digit7 : 0x18,
  LogicalKeyboardKey.digit8 : 0x1B,
  LogicalKeyboardKey.digit9 : 0x20,
...
  LogicalKeyboardKey.space : 0x3c,
  LogicalKeyboardKey.shiftLeft : 0x0F,
  LogicalKeyboardKey.enter : 0x01,
...
});
With this map created, we can now modify our listener a bit for the event KeyC64Event:

    on<KeyC64Event>((event, emit) {
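      final c64KeyCode = keyMap[event.key];
      if (c64KeyCode == null) {
        // Ignore keys without a C64 mapping. Defaulting to 0 would
        // register them as the key with scan code 0.
        return;
      }
      int col = c64KeyCode >> 3; // bits 5-3: column
      int row = 1 << (c64KeyCode & 7); // bits 2-0: row bit mask
      if (!event.keyDown) {
        matrix[col] |= row; // released: set the bit back to 1
      } else {
        matrix[col] &= ~row; // pressed: clear the bit
      }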
    });
We start off by looking up the C64 scan code, given the Flutter key code. Bits 5-3 of the scan code are the column and bits 2-0 are the row.

In the if statement, if the key is released, we OR the bit position back in. If it is pressed, we mask off the bit position.
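
As a worked example, let us take the A key, which has scan code 0x0A in our map. The sketch below pulls the update out into a standalone function, setKey, which is a name I made up for this illustration:

```dart
// Eight column bytes, all bits set: no key is held down.
final List<int> matrix = List.filled(8, 0xff);

// Update the matrix for a key going down or coming back up.
void setKey(int c64KeyCode, bool keyDown) {
  final col = c64KeyCode >> 3; // bits 5-3 select the column byte
  final rowMask = 1 << (c64KeyCode & 7); // bits 2-0 select the row bit
  if (keyDown) {
    matrix[col] &= ~rowMask; // pressed: clear the bit
  } else {
    matrix[col] |= rowMask; // released: set the bit again
  }
}

void main() {
  setKey(0x0A, true); // A: column 1, row 2
  print(matrix[1].toRadixString(16)); // fb
  setKey(0x0A, false);
  print(matrix[1].toRadixString(16)); // ff
}
```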

Now we need to modify the method getKeyInfo, which is the method our Memory class calls when reading address DC01. When calling this method, we tell it which columns need to be considered. Potentially two or more columns can be selected, in which case we need to do a kind of OR operation, to reduce the selected columns to one.

We can express this reducing in a simple for loop:

  @override
  int getKeyInfo(int column ) {
    int result = 0xff; // Accumulator for the OR'ed numbers

    for (var row in matrix) {
      if ((column & 1) == 0) {
        result &= row; 
      }

      column = column >> 1;
    }

    return result;
  }
We shift the column right every time, checking each time if the lowest bit is zero. If it is zero, we know the column is selected. We AND all the selected columns together. If, for any row position in a selected column, there is a zero, then the final value for that bit position will be zero. A zero means there were one or more keys pressed in that bit position for the selected columns.
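
To see the reduction in action, here is a standalone sketch with two keys held down in different columns. The function scanRows is a hypothetical stand-in for getKeyInfo that takes the matrix as a parameter, so it can run on its own:

```dart
// Combine the column bytes of every selected column. A column is selected
// when its bit in columnSelect is zero.
int scanRows(List<int> matrix, int columnSelect) {
  var result = 0xff;
  for (var col = 0; col < 8; col++) {
    if ((columnSelect & (1 << col)) == 0) {
      result &= matrix[col];
    }
  }
  return result;
}

void main() {
  final matrix = List.filled(8, 0xff);
  matrix[1] = 0xfb; // A held down (column 1, row 2)
  matrix[4] = 0xef; // M held down (column 4, row 4)
  print(scanRows(matrix, 0xfd).toRadixString(16)); // fb: column 1 only
  print(scanRows(matrix, 0xef).toRadixString(16)); // ef: column 4 only
  print(scanRows(matrix, 0x00).toRadixString(16)); // eb: all columns
  print(scanRows(matrix, 0xff).toRadixString(16)); // ff: no column selected
}
```

Selecting all columns at once with a zero byte is a quick way to find out whether any key at all is down, before scanning the columns one by one.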

Now, let us see if we can write a simple program, with the keyboard input enabled:

Next, let us run the program:
We have a working program!

In Summary

In this post we implemented keyboard input and wrote a small test program.

In the next post we will start implementing tape loading, from a tape image.

Until next time!



Monday, 28 April 2025

A Commodore 64 Emulator in Flutter: Part 12

Foreword

In the previous post we successfully ran the Klaus Dormann Test Suite.

In this post we will be trying to boot the C64 system with its ROM's.

Enjoy!

Inserting the ROMS

Inserting the ROM's... Now that sounds like plugging and unplugging game cartridges 😂. In our case, this means loading the C64 ROM images from files into memory, and making sure our emulated CPU can access the contents.

We start by dumping the ROM images into the asset folder:


Usually, for the C64 ROMs you get for download on the internet, the file names always have some version numbers in them. In my case, I just gave them simple names. Also, notice that I have removed the file program.bin we used in previous posts.

  C64Bloc() : super(InitialState()) {
    on<InitEmulatorEvent>((event, emit) async {
      final basicData = await rootBundle.load("assets/basic.bin");
      final characterData = await rootBundle.load("assets/characters.bin");
      final kernalData = await rootBundle.load("assets/kernal.bin");
      memory.populateMem(basicData, characterData, kernalData);
...
So, we load the different ROM's, waiting for the loading of each file to complete, and then going to the next file for loading.

Now, you might notice that, compared to previous posts, we now pass more ROMs to memory.populateMem. So let us delve a bit deeper into our Memory class to see what changes are required:

...
  late type_data.ByteData _basic;
  late type_data.ByteData _character;
  late type_data.ByteData _kernal;
...
  final type_data.ByteData _ram = type_data.ByteData(64*1024);
...
  populateMem(type_data.ByteData basicData, type_data.ByteData characterData,
      type_data.ByteData kernalData) {
    _basic = basicData;
    _character = characterData;
    _kernal = kernalData;
  }
...
Fairly straightforward. Each ROM that is passed through, we store in a variable.

Something else we do is to define a 64KB array that will act as our RAM, the defining characteristic of the C64.

So, next, let us add some address mapping:

  setMem(int value, int address ) {
    _ram.setUint8(address, value); // values are unsigned bytes (0-255)
  }

  int getMem(int address) {
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else {
      return _ram.getUint8(address);
    }
  }

For memory writes, we write straight to the ram array. For reads, we do it the usual C64 setup:

  • Addresses A000-BFFF: We read from basic ROM
  • Addresses E000-FFFF: We read from Kernal ROM
  • All other addresses we read from RAM

Booting the C64 System

We are now close to booting the C64 system with all its ROM's.

First things first. Our periodic timer currently runs once every second, executing 1 000 000 cycles worth of CPU instructions. However, we want to reduce this to a 60th of a second, so that later on we can draw a frame every time our timer executes, yielding 60 frames a second, which is the frame rate of a native C64:

    on<RunEvent>((event, emit) {
      timer = Timer.periodic(const Duration(milliseconds: 17), (timer) {
          int targetCycles = _cpu.getCycles() + 16666;
          do {
            _cpu.step();
          } while (_cpu.getCycles() < targetCycles);
      });
    });

Each time the timer fires, we execute 16666 cycles, which is the number of CPU cycles in 1/60th of a second.

Booting the C64 ROMs is actually fairly straightforward. You basically set the program counter to the value of the reset vector. For this we just create the following method:

  reset() {
    pc = memory.getMem(0xfffc) | (memory.getMem(0xfffd) << 8);
  }
So, here we populate the program counter with the reset vector at addresses FFFC and FFFD.
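
The vector is stored low byte first, which is why the byte at FFFD gets shifted up by eight bits. Here is a small standalone sketch, using a plain list as memory and a hypothetical helper readVector. The two sample bytes are those of the stock Kernal, whose reset routine starts at FCE2:

```dart
// Read a 16-bit little-endian vector: low byte first, then high byte.
int readVector(List<int> mem, int address) {
  return mem[address] | (mem[address + 1] << 8);
}

void main() {
  final mem = List.filled(0x10000, 0);
  mem[0xfffc] = 0xe2; // low byte of the reset vector
  mem[0xfffd] = 0xfc; // high byte of the reset vector
  print(readVector(mem, 0xfffc).toRadixString(16)); // fce2
}
```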

We still need to call this method. We do this just after we have loaded all the ROM's:

  C64Bloc() : super(InitialState()) {
    on<InitEmulatorEvent>((event, emit) async {
      final basicData = await rootBundle.load("assets/basic.bin");
      final characterData = await rootBundle.load("assets/characters.bin");
      final kernalData = await rootBundle.load("assets/kernal.bin");
      memory.populateMem(basicData, characterData, kernalData);
      _cpu.reset();
...
Now, we can finally boot the C64 System. We wait for a minute, and then hit stop to view the registers:


We see the program stabilise at address FF61. Let us have a look at the Kernal disassembly listing to see what is going on at this address:
 
As seen here, we get stuck in a loop with the memory address D012 not changing. We can expect that such a thing can happen at the moment, because with our current emulator setup that address will write and read to raw RAM, and thus nothing will happen.

In reality D012 maps to the VIC-II display registers and provides info on which raster line on the screen we are currently at. It is quite an undertaking to implement such a raster counter, so for now, let us see if we can quickly hack something together, so that address D012 just changes sometimes, to get past that loop. Here is my quick hack:

  int getMem(int address) {
    _readCount++;
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else {
      return _ram.getUint8(address);
    }
  }

So, the hack is simply a counter that keeps count of the number of reads, and we look at bit 10 of the counter. If it is clear, we return a 1, otherwise a zero. In effect we will have a 1 for about a thousand counts, and then a zero for another thousand counts.
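
The toggling is easy to check in isolation. The mask 1024 corresponds to bit 10, so the returned value flips every 1024 reads. In this sketch fakeRaster is a made-up function name that takes the read count as a parameter:

```dart
// Return 1 while bit 10 of the read counter is clear, and 0 while it is set.
int fakeRaster(int readCount) => (readCount & 1024) == 0 ? 1 : 0;

void main() {
  print(fakeRaster(0)); // 1
  print(fakeRaster(1023)); // 1
  print(fakeRaster(1024)); // 0
  print(fakeRaster(2047)); // 0
  print(fakeRaster(2048)); // 1
}
```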

Let us now see where our program counter lands. This time it lands at E5D4. Let's look again at the disassembly listing for this address:


Here it seems we are in a waiting loop, waiting for the enter key to be pressed on the keyboard. I think this is a pretty decent place for our emulator to be and probably means that all initialisation has been completed, and we should have the welcome message in screen memory.

We want to check if the welcome message is in screen memory, but our debug dump currently just shows the first two pages of memory. We could, however, inspect the ram array in debug mode in IntelliJ.

So, start up the emulator in debug mode and press the play button to let the C64 system run at full speed. Wait for about a minute and then put a breakpoint on the first line of the getMem method in our Memory class. With the system running at full speed, that breakpoint will be hit almost instantaneously.

Open up an evaluate window and enter the following:


Here we inspect address 1024 of screen memory, which is the first byte of it. In this case the value is 32, which is a space. Inspecting addresses further in screen memory will reveal the welcome message.

In the next section we will render the contents of the screen in real time.

Rendering screen memory

We will now try and render screen memory in real time, showing a display similar to the C64 in text mode.

We ultimately need a mechanism that allows us to work efficiently with image data on a pixel level. Flutter provides this via the RawImage widget, together with ui.Image, with which you can work with an array of RGBA values.

Let us unpack this a bit. Let us start working with the raw array of RGBA values, where we will produce the frame for display, based on the screen memory and the character ROM.

Both the character ROM and screen memory are present in our Memory class, so for now we will do the frame rendering in that class.

Firstly, let us define the byte buffer we are going to use over and over again:

class Memory {
...
    final type_data.ByteData image = type_data.ByteData(320*200*4);
...
}
So, as we can see, we have a resolution of 320x200, which is the resolution of a real C64 screen. We multiply the end result by 4, because each pixel is four bytes in our buffer, one byte each for red, green, blue and the alpha channel.

Next, let us write a method for rendering a screen to the byte array:

  type_data.ByteData getDisplayImage() {
    const rowSpan = 320 * 4;
    for (int i = 0; i < 1000; i++ ) {
      var charCode = _ram.getUint8(i + 1024);
      var charAddress = charCode << 3;
      var charBitmapRow = (i ~/ 40) << 3;
      var charBitmapCol = (i % 40) << 3;
      int rawPixelPos = charBitmapRow * rowSpan + charBitmapCol * 4;
      for (int row = 0; row < 8; row++) {
        int bitmapRow = _character.getUint8(row + charAddress);
        int currentRowAddress = rawPixelPos + row * rowSpan;
        for (int pixel = 0; pixel < 8; pixel++) {
          if ((bitmapRow & 0x80) != 0) {
              image.setUint32(currentRowAddress + (pixel << 2), 0x000000ff);
          } else {
              image.setUint32(currentRowAddress + (pixel << 2), 0xffffffff);
          }
          bitmapRow = bitmapRow << 1;
        }
      }

    }
    return image;
  }

So, here we loop through all thousand character codes in screen memory, rendering each one. Each character code is actually an index into the character ROM, where every character has its own 8x8 pixel bitmap.
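
The offset calculation at the top of the loop is worth checking with a few numbers. The sketch below isolates it in a hypothetical function, cellOffset, which returns the position in the pixel buffer of the top left pixel of character cell i:

```dart
const rowSpan = 320 * 4; // bytes per scan line in the pixel buffer

// Byte offset of the top left pixel of character cell i (0-999).
int cellOffset(int i) {
  final pixelRow = (i ~/ 40) << 3; // 40 cells per line, 8 scan lines each
  final pixelCol = (i % 40) << 3; // 8 pixels per cell
  return pixelRow * rowSpan + pixelCol * 4;
}

void main() {
  print(cellOffset(0)); // 0
  print(cellOffset(1)); // 32
  print(cellOffset(40)); // 10240
  print(cellOffset(999)); // 247008
}
```

For the very last cell, the final byte written is at 247008 + 7 * 1280 + 7 * 4 + 3 = 255999, which is exactly the last byte of our 320 * 200 * 4 = 256000 byte buffer.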

Now, this method is invoked every time our periodic timer runs:

...
import 'dart:ui' as ui;
...
   on<RunEvent>((event, emit) {
      timer = Timer.periodic(const Duration(milliseconds: 17), (timer) {
          int start = DateTime.now().millisecondsSinceEpoch;
          int targetCycles = _cpu.getCycles() + 16666;
          do {
            _cpu.step();
          } while (_cpu.getCycles() < targetCycles);
          ui.decodeImageFromPixels(memory.getDisplayImage().buffer.asUint8List(), 
             320, 200, ui.PixelFormat.bgra8888, setImg);
      });
    });
ui.decodeImageFromPixels is a method within the dart:ui library of Flutter. It will create an Image object from a pixel buffer, which in this case is the rendered screen buffer.

We also pass ui.PixelFormat.bgra8888 as a parameter, indicating our buffer is in the format with one byte each for blue, green, red and alpha.
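
It is worth looking at how the two 32-bit values we write in getDisplayImage end up in the buffer. ByteData.setUint32 stores its value in big-endian order when no Endian argument is given, so the last byte written is the alpha channel. The helper pixelBytes below is a made-up name for this illustration:

```dart
import 'dart:typed_data';

// Return the four bytes that setUint32 stores for one pixel value.
List<int> pixelBytes(int value) {
  final pixel = ByteData(4);
  pixel.setUint32(0, value); // no Endian argument: big-endian by default
  return pixel.buffer.asUint8List().toList();
}

void main() {
  print(pixelBytes(0x000000ff)); // [0, 0, 0, 255]
  print(pixelBytes(0xffffffff)); // [255, 255, 255, 255]
}
```

Read in bgra8888 order, the first value is blue 0, green 0, red 0 and alpha 255, which is opaque black. The second is opaque white.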

We also pass a callback method, setImg in this case, which will be called once we have the generated Image object.

So, let us implement this callback method:

    void setImg(ui.Image data) {
      emit(RunningState(image: data, frameNo: frameNo++));
    }
Here you can see we are emitting the image in a state object, so our BlocBuilder can pick up the change and render the image. You will also notice that we have a frameNo property that we modify with each new image, so our BlocBuilder can easily pick up the change.

You will recall from previous posts that we defined RunningState previously, to which we now apply changes. Here is the revised version:

class RunningState extends C64State {
  RunningState({required this.image,
    required this.frameNo});

  final int frameNo;
  final ui.Image image;
  @override
  List<Object> get props => [frameNo];
}
Finally, let us modify our BlocBuilder:

...
        body: BlocBuilder<C64Bloc, C64State>(
          builder: (BuildContext context, state) {
            if (state is InitialState) {
              return const CircularProgressIndicator();
            } else if (state is DataShowState) {
              return Column(
                children: [
                  Text(getRegisterDump(state.a, state.x, state.y, state.n,
                      state.z, state.c, state.i, state.d, state.v, state.pc)),
                  Text(
                    getMemDump(state.memorySnippet),
                    style: const TextStyle(
                      fontFamily: 'RobotoMono', // Use the monospace font
                    ),
                  ),
                ],
              );
            } else if (state is RunningState) {
              return RawImage(
                  image: state.image, scale: 0.5);
            } else {
              return const CircularProgressIndicator();
            }
          },
        ),
...
So, if the state is RunningState, we return a RawImage widget, which will be displayed on the screen. We pass the image in the state to the RawImage widget. We also use a scale of 0.5, which basically doubles the displayed size. The native resolution of 320x200 of a C64 frame displays very small on a modern display, so at least with the scale, it can appear bigger.

With everything coded we can now give it a test run. The startup sequence appears to take more or less the same time as a real C64, and eventually the welcome screen appears:

We are making progress, but still, there is no flashing cursor.

Getting the cursor to flash

Let us see if we can get the cursor to flash. 

If you go down into the bowels of the C64 system, you will find that at the core of a standard C64 system that has just started up, there is a timer interrupt every 60th of a second. This interrupt does a couple of things, like checking if any key was pressed or released and updating the status of the cursor.

So, let us see if we can put a hack together, where off the bat we just force an interrupt every 60th of a second, without worrying for now about implementing emulation of the full CIA chip with a timer.

The easiest way is in the step() method of our CPU class:

  step() {
    if ((_cycles > 1000000) &&((_cycles % 16666) < 30) && (_i == 0)) {
      push(pc >> 8);
      push(pc & 0xff);
      push((_n << 7) | (_v << 6) | (2 << 4) | (_d << 3) | (_i << 2) | (_z << 1) | _c);
      _i = 1;
      pc = memory.getMem(0xfffe) | (memory.getMem(0xffff) << 8);
    }
...
  }
So, we wait for a second before triggering interrupts in 1/60 second intervals. With the change, the cursor actually flashes:

In Summary

In this post we managed to boot the C64 system with all its ROMs and managed to render screen memory in real time, showing the welcome message and the flashing cursor. 

The source code for this post is available in the following Github tag: https://github.com/ovalcode/c64_flutter/tree/c64_flutter_part12

In the next post we will add some keyboard interaction with our emulator.

Until next time!


Wednesday, 9 April 2025

A Commodore 64 Emulator in Flutter: Part 11

Foreword

In the previous post we ran the Klaus Dormann Test Suite on our emulator. In the process we found a couple of issues with our emulator. We fixed some of them, but a couple more still need to be fixed.

In this post we will look at the remaining issues. Solving them wasn't much of a deal at all, so this post will be shorter than normal.

The remaining fixes

One of the major issues I found while running the Klaus Dormann Test Suite on my emulator was incorrect values in some of the CPU data tables. This includes some instructions having the incorrect address mode or an incorrect instruction length.

The other issue I experienced, was failed test cases because decimal mode wasn't implemented. Implementing Decimal mode is fairly straightforward. We start with implementing the following methods:

  int adcDecimal(int operand) {
     int l = 0;
     int h = 0;
     int result = 0;
     l = (_a & 0x0f) + (operand & 0x0f) + _c;
     if ((l & 0xff) > 9) l += 6;
     h = (_a >> 4) + (operand >> 4) + (l > 15 ? 1 : 0);
     if ((h & 0xff) > 9) h += 6;
     result = (l & 0x0f) | (h << 4);
     result &= 0xff;
     _c = (h > 15) ? 1 : 0;
     _z = (result == 0) ? 1 : 0;
     _n = 0;
     _v = 0;
     return result;
   }
 
   int sbcDecimal(int operand) {
     int l = 0;
     int h = 0;
     int result = 0;
     l = (_a & 0x0f) - (operand & 0x0f) - (1 - _c);
     if ((l & 0x10) != 0) l -= 6;
     h = (_a >> 4) - (operand >> 4) - ((l & 0x10) != 0 ? 1 : 0);
     if ((h & 0x10) != 0) h -= 6;
     result = (l & 0x0f) | (h << 4);
     _c = ((h & 0xff) < 15) ? 1 : 0;
     _z = (result == 0) ? 1 : 0;
     _n = 0;
     _v = 0;
     return (result & 0xff);
   }
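To get a feel for the low/high digit correction, the addition can be tried outside the CPU class. The function below is a minimal standalone sketch that mirrors adcDecimal, but takes the accumulator and carry as parameters; the name bcdAdd and the choice to return the carry in bit 8 are assumptions made for illustration only.

```dart
/// Standalone variant of adcDecimal for experimenting outside the CPU class.
/// Returns the BCD sum in bits 0-7 and the carry out in bit 8.
int bcdAdd(int a, int operand, int carryIn) {
  int l = (a & 0x0f) + (operand & 0x0f) + carryIn;
  if (l > 9) l += 6; // skip the six unused codes A-F in the low digit
  int h = (a >> 4) + (operand >> 4) + (l > 15 ? 1 : 0);
  if (h > 9) h += 6; // same correction for the high digit
  final result = ((l & 0x0f) | (h << 4)) & 0xff;
  final carryOut = h > 15 ? 1 : 0;
  return result | (carryOut << 8);
}

void main() {
  print(bcdAdd(0x19, 0x01, 0).toRadixString(16)); // 20  (19 + 1 = 20)
  print(bcdAdd(0x45, 0x38, 0).toRadixString(16)); // 83  (45 + 38 = 83)
  print(bcdAdd(0x99, 0x01, 0).toRadixString(16)); // 100 (result 00, carry set)
}
```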

We modify the applicable instruction selectors:

       case 0x69:
         // Decimal flag set: use the BCD variant, otherwise the binary add.
         if (_d == 1) {
           _a = adcDecimal(arg0);
         } else {
           adc(arg0);
         }
       case 0x65:
       case 0x75:
       case 0x6D:
       case 0x7D:
       case 0x79:
       case 0x61:
       case 0x71:
         if (_d == 1) {
           _a = adcDecimal(memory.getMem(resolvedAddress));
         } else {
           adc(memory.getMem(resolvedAddress));
         }
         
       case 0xE9:
         // Decimal flag set: use the BCD variant, otherwise the binary subtract.
         if (_d == 1) {
           _a = sbcDecimal(arg0);
         } else {
           sbc(arg0);
         }
       case 0xE5:
       case 0xF5:
       case 0xED:
       case 0xFD:
       case 0xF9:
       case 0xE1:
       case 0xF1:
         if (_d == 1) {
           _a = sbcDecimal(memory.getMem(resolvedAddress));
         } else {
           sbc(memory.getMem(resolvedAddress));
         }

Test Results

With everything fixed, we can see if all the tests passed.

The Test Suite runs for about two minutes on my emulator. After the two minutes, when hitting stop, the register window will look as follows:


From this point the program counter remains at 3469. Let's have a look at the assembly listing to see what is at this address:

So, this is confirmation that our emulator passed all the tests!

In Summary

In this post we confirmed that we implemented all the CPU instructions correctly in our emulator, using Klaus Dormann's Test Suite.

Here is a link to the tag of this post's source code: https://github.com/ovalcode/c64_flutter/tree/c64_flutter_part11

In the next post we will start writing some more code to boot the C64 ROMs.

Until next time!

Saturday, 1 March 2025

A Commodore 64 Emulator in Flutter: Part 10

Foreword

In the previous post we implemented the last couple of 6502 instructions in our C64 Flutter emulator.

In this post we will be running the Klaus Dormann Test Suite on our emulator to ensure we have implemented all the instructions correctly.

Starting up the Klaus Dormann Test Suite

Let us see if we can start up the Klaus Dormann Test Suite on our emulator, although only in a single stepping fashion at the moment.

To get started, we need two files from Klaus' Github repository:

The first link is the actual binary which will execute in our emulator. This is a 64KB binary which fills the whole address space accessible by the 6502.

The second file is a listing file, containing the disassembled version of the binary we are running. The listing file is useful if you want to follow along and see what the program is doing at a certain point in time.

Firstly we dump the binary in the assets folder of our Flutter project and rename it to program.bin. This is the default binary our emulator looks for when it starts up.

Now, usually when a 6502 system starts up, it looks at the reset vector at addresses 0xFFFC and 0xFFFD for the address at which it should start executing code, something which we haven't implemented yet.

In the Klaus Test suite there is also a reset vector defined, but within the context of the Test Suite its function is to detect if an accidental reset was triggered. So, in actual fact this Test Suite doesn't use the reset vector to start execution. Rather, when using the test suite, you should just set the PC register to 0x400 and start execution. This makes our life easier, and for the moment we don't need to worry about implementing the Reset vector stuff.

So, in cpu.dart, the following change needs to be made, namely the initial value of pc:
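For reference, reading a vector such as the reset vector is just a little-endian 16-bit read. The sketch below uses a plain Uint8List as memory; the function name readVector is hypothetical, and the two bytes written in main are example values (they happen to be the reset address used by the C64 KERNAL).

```dart
import 'dart:typed_data';

/// Reads a little-endian 16-bit vector: low byte first, then high byte.
int readVector(Uint8List memory, int address) {
  return memory[address] | (memory[address + 1] << 8);
}

void main() {
  final memory = Uint8List(0x10000); // full 64KB address space
  memory[0xfffc] = 0xe2; // low byte of the start address
  memory[0xfffd] = 0xfc; // high byte of the start address
  final pc = readVector(memory, 0xfffc);
  print(pc.toRadixString(16)); // fce2
}
```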

 
...
  int _n = 0, _z = 0, _c = 0, _i = 0, _d = 0, _v = 0;
  int _sp = 0xff;
  int pc = 0x400;
...
With this we can start up our emulator and single step through the code of the Test Suite.

Unattended running

Single stepping through the Klaus Dormann Test Suite in our emulator would be a daunting task. You would probably need to click the step button thousands of times.

It would make our lives easier if we could just let the Test Suite run unattended, with us just pausing the execution once in a while, to see how far we have progressed through the tests.

We do this by adding a button right next to the title. As part of the process we need to wrap both the title and the button in a row in order for everything to align properly. All this is happening in the main.dart file:

        appBar: AppBar(
          title:  Row(
              mainAxisSize: MainAxisSize.min,
              children: [
                const Text("Emulator C64"),
                BlocBuilder<C64Bloc, C64State>(
                  builder: (BuildContext context, state) {
                    return Row(
                        mainAxisAlignment: MainAxisAlignment.end,
                        children: [
                          _getRunStopButton(state, context)
                        ]);
                  })
              ]),
        ),
We want our run button to behave like a toggle switch, toggling between a play and a pause button. To do all this fancy stuff, we need to inject some state, which we achieve by wrapping everything with a BlocBuilder. We discussed the workings of BlocBuilder in a previous post.

Now, the method _getRunStopButton() returns one of three possible buttons, depending on the state: a play button, a stop button, or a disabled play button if everything hasn't initialised yet:

  Widget _getRunStopButton(C64State state, BuildContext context) {
    if(state is DataShowState) {
      return IconButton(
        icon: Icon(Icons.play_arrow),
        onPressed:  () {
          context.read<C64Bloc>().add(RunEvent());
        }
      );
    } else if (state is RunningState) {
      return IconButton(
          icon: Icon(Icons.stop_circle),
          onPressed:  () {
            context.read<C64Bloc>().add(StopEvent());
          }
      );
    } else {
      return const IconButton(
          icon: Icon(Icons.play_arrow),
          onPressed:  null
      );
    }
  }

Here we test for different states. Firstly we show an enabled play button if we are in DataShowState. As you might remember from previous posts, with DataShowState, we display a dump of memory and registers, and we can single step from that point. This is the perfect scenario to provide a play button that will run the emulator at full speed.

Pressing the play button emits a RunEvent, which we still need to implement a listener for. We will do that in a bit.

Secondly, if our emulator is in the RunningState, we display the stop button. We still need to implement the RunningState class, which in actual fact is very simple:

class RunningState extends C64State {}
There are no values or properties we need to convey here; the state merely conveys the fact that we are running.

Finally, for any other state we just want to show a play button that is disabled. This will only happen when our application is starting up and loading the memory image, which in our case is the Test Suite.

Now, we have defined a number of events that we need to listen for in c64_bloc.dart.

Firstly, let us define the listener for RunEvent. This will be the core of our unattended running. Here we want to schedule a timer that fires every second, and on each tick we execute a second's worth of CPU instructions (aka 1 000 000 CPU cycles). We also need to emit a RunningState state so our front end can update accordingly.

Let us start with an outline:

...
  Timer? timer;
...
    on<RunEvent>((event, emit) {
      timer = Timer.periodic(const Duration(seconds: 1), (timer) {
...
      });
      emit(RunningState());
    });
...
We define the timer variable as a field of our C64Bloc class, rather than a local variable, since we want to be able to cancel the timer in another event handler.

Now, to determine when our CPU has executed 1 million cycles worth of instructions, our CPU needs to keep record of the cycles for each of the instructions it executes. This is obviously in the step method:

...
  int _cycles = 0;
...
  int getCycles() {
    return _cycles;
  }
...
  step() {
...
    _cycles = _cycles + CpuTables.instructionCycles[opCode];
    var resolvedAddress =
        calculateEffectiveAddress(CpuTables.addressModes[opCode], arg0, arg1);
    switch (opCode) {
...
    }
  }
We defined the instructionCycles array in a previous post; it specifies the number of cycles for every opcode. So, with every step we can just add the number of cycles for the opcode being executed to the _cycles variable.

With this implemented, we can add some meat to our timer callback function:

...
    on<RunEvent>((event, emit) {
      timer = Timer.periodic(const Duration(seconds: 1), (timer) {
          int targetCycles = _cpu.getCycles() + 1000000;
          do {
            _cpu.step();
          } while (_cpu.getCycles() < targetCycles);
      });
      emit(RunningState());
    });
...
So, we just add one million to our current Cpu cycle count and that will be the target at which we will stop the loop.
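The loop can also be tried in isolation. Below is a minimal sketch with a stand-in CPU where every instruction is assumed to take 3 cycles; FakeCpu and runSlice are hypothetical names and not part of the emulator. It shows that a slice can overshoot the target by a few cycles, because the loop only checks the cycle count after a whole instruction has executed.

```dart
/// Stand-in for the real CPU class; every instruction is assumed to
/// take 3 cycles so the overshoot is easy to see.
class FakeCpu {
  int _cycles = 0;
  int steps = 0;

  int getCycles() => _cycles;

  void step() {
    _cycles += 3;
    steps++;
  }
}

/// Executes instructions until at least [budget] cycles have elapsed.
/// Returns the number of cycles by which the target was overshot.
int runSlice(FakeCpu cpu, int budget) {
  final targetCycles = cpu.getCycles() + budget;
  do {
    cpu.step();
  } while (cpu.getCycles() < targetCycles);
  return cpu.getCycles() - targetCycles;
}

void main() {
  final cpu = FakeCpu();
  final overshoot = runSlice(cpu, 1000000);
  print(cpu.steps); // 333334
  print(overshoot); // 2
}
```

Because the next target is computed from the current cycle count, the overshoot is not carried over from one slice to the next, so over a long run the emulated CPU executes slightly more than one million cycles per tick.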

Finally, we need to implement the stop event:

    on<StopEvent>((event, emit) {
      timer?.cancel();
      emit(DataShowState(
          dumpNo: dumpNo++,
          memorySnippet: ByteData.sublistView(memory.getDebugSnippet(), 0, 512),
          a: _cpu.getAcc(),
          x: _cpu.getX(),
          y: _cpu.getY(),
          n: _cpu.getN() == 1,
          z: _cpu.getZ() == 1,
          c: _cpu.getC() == 1,
          i: _cpu.getI() == 1,
          d: _cpu.getD() == 1,
          v: _cpu.getV() == 1,

          pc: _cpu.pc));
    });

Here we just cancel the timer and emit a DataShowState, because after we have stopped the run, we want to display the current state of memory and the registers.

When the emulator runs unattended, we also want to hide the state display to avoid confusion and just show "running". To keep the discussion focused, I will not be going into this detail.

Running the Test Suite

Finally we are at a point where we can run Klaus Dormann's Test Suite. On startup, the screen looks like this:

As discussed, the play button to start the emulator in unattended mode is next to the title.

When clicking play, the screen changes like this:

One weird thing you might notice is that if you click run and quickly stop again, the Program counter is still at 0x400, the starting address of the test suite, as if nothing executed. The reason for this is very subtle. Our timer callback only executes once the timer interval has elapsed. So, in our case we need to wait at least 1 second before clicking the stop button to see any results.

So, if we let it run a bit longer, our result will look like this:


So, when stopped, our Program counter was at 0x9D7. Funny thing is, you can let it run for as long as you want to, but the Program counter remains stuck at 0x9D7.

What is going on here?

To find the answer we need to look at the source listing of the Test Suite and search for that address:


Here it is clear: if something went wrong with the test, it will do an endless loop at the address 09D7. So, obviously, our emulator failed a test, but which one? Looking back a couple of lines, we see the comment: The IRQ vector was never executed

Aha! We never implemented IRQs (Interrupt Requests) in our emulator. Having said that, it briefly caught me in a mystical moment, almost like when, as a kid playing on a Commodore 64, I wondered for the first time what was going on underneath the hood.

In this case I wondered where the IRQ came from. This Test Suite doesn't implement any magical peripherals, after all. After a moment I realised that this was probably caused by me not implementing the BRK instruction, and looking further back in the listing did confirm this.

This was actually a very interesting experience for me. It was the first time I encountered a problem where my first instinct was a moment of nostalgia 😂

In the following section we will implement the BRK instruction and then run the emulator again.

Implementing the BRK and RTI instructions

So, let us quickly implement the BRK and RTI instructions. There is one caveat with the BRK instruction. It is a one byte instruction, but in actual fact it behaves like a 2 byte instruction. BRK triggers the same sequence as an IRQ, and when the handler returns, execution doesn't resume at the address directly after the BRK instruction, but one address further on.

To account for this quirk of the BRK instruction, we can adjust the instruction length in the instructionLen table for the BRK instruction to 2.

With the table adjusted, we implement the BRK and RTI instruction as follows:

      /*BRK*/
      case 0x00:
        push(pc >> 8);
        push(pc & 0xff);
        push((_n << 7) | (_v << 6) | (3 << 4) | (_d << 3) | (_i << 2) | (_z << 1) | _c);
        _i = 1;
        pc = (memory.getMem(0xffff) << 8) | memory.getMem(0xfffe);

      /*RTI*/
      case 0x40:
        int temp = pull();
        _c = temp & 1;
        _z = (temp >> 1) & 1;
        _i = (temp >> 2) & 1;
        _d = (temp >> 3) & 1;
        _v = (temp >> 6) & 1;
        _n = (temp >> 7) & 1;
        pc = pull() | (pull() << 8);
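The flag packing used by BRK and the unpacking done by RTI can be checked in isolation with a small round trip. This is only a sketch: packStatus and unpackStatus are hypothetical helpers, not methods of the emulator's CPU class. The brk parameter selects bit 4, which is why BRK pushes (3 << 4) while the forced interrupt in the later post pushes (2 << 4).

```dart
/// Packs the flags into the 6502 status byte layout NV-BDIZC.
/// Bit 5 is always set; [brk] selects bit 4 (the B flag).
int packStatus({
  required int n,
  required int v,
  required int d,
  required int i,
  required int z,
  required int c,
  required bool brk,
}) {
  final b = brk ? 1 : 0;
  return (n << 7) | (v << 6) | (1 << 5) | (b << 4) |
      (d << 3) | (i << 2) | (z << 1) | c;
}

/// Extracts the individual flags again, as RTI does after pulling the byte.
Map<String, int> unpackStatus(int status) {
  return {
    'c': status & 1,
    'z': (status >> 1) & 1,
    'i': (status >> 2) & 1,
    'd': (status >> 3) & 1,
    'v': (status >> 6) & 1,
    'n': (status >> 7) & 1,
  };
}

void main() {
  final fromBrk =
      packStatus(n: 1, v: 0, d: 0, i: 0, z: 0, c: 1, brk: true);
  final fromIrq =
      packStatus(n: 1, v: 0, d: 0, i: 0, z: 0, c: 1, brk: false);
  print(fromBrk.toRadixString(16)); // b1
  print(fromIrq.toRadixString(16)); // a1
  print(unpackStatus(fromBrk)); // {c: 1, z: 0, i: 0, d: 0, v: 0, n: 1}
}
```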
Now, when we run the test suite again, we get past this failed test. However, we end up in another endless loop at address 0xdeb, which indicates another failed test.

We will investigate this failed case, as well as other potential failed cases in the next post.

In Summary

In this post we ran the Klaus Dormann Test Suite on our Emulator in unattended mode. The first failed test case we encountered was caused by the BRK/RTI instructions not being implemented.

With the BRK/RTI instructions implemented, we encountered another failed test case, which we will investigate in the next post, together with any other failed test cases that pop up.

You can find all the source code for this project as well as the binary image containing the Klaus Dormann test suite, here.

Until next time!