Tuesday, 24 March 2026

A Commodore 64 Emulator in Flutter: Part 16

Foreword

In the previous post we implemented tape loading, and emulated the process up to the point where it shows that it found a file name.

We ended the post discovering that the emulator has a serious memory leak, which we will attempt to solve in this post.

Unpacking the Memory Leak

In the previous post we saw that our emulator had a serious memory leak, where memory usage grew over 1 Gigabyte in less than half an hour.

I carefully went through my code but couldn't really find any obvious place where a memory leak was happening. 

I did, however, have a suspicion that the memory leak was related to how the frames were rendered to the screen. This is probably the process in the emulator where the most data moves back and forth.

Eventually I debated with ChatGPT and Gemini which components are best for rendering in Flutter without causing memory leaks. I tried all the suggestions, but none of them really fixed the memory leak.

Eventually Gemini came up with a suggestion to use a native HTML canvas to do the rendering. This struck me as a sensible idea, as I used an HTML canvas in one of my JavaScript emulators about 10 years ago, without any memory leak.

Also, I started to realise my current rendering implementation was perhaps on the heavy side. With a Bloc emitting a state change on every frame, part of the widget tree was being redrawn 60 times a second. This sounded very intense, so the idea of reusing an HTML canvas seemed like a way out.

I did a proof of concept: I created a small Flutter project using the HTML canvas as described, and just rendered something simple, like a moving line, redrawn 60 times a second. In this proof of concept I found that the memory usage remained within bounds.

So, in this post I will be following this approach of rewriting the emulator to make use of an HTML Canvas, in order to eliminate the memory leak.

Bringing a native HTML canvas to Flutter

Let us pause for a moment, and see how we can introduce a native HTML canvas in Flutter.

We begin with a simple class:
class EmulatorCanvas {
  late final html.CanvasElement canvas;
  late final html.CanvasRenderingContext2D ctx;

  final int width;
  final int height;
  int inc = 0;

  EmulatorCanvas(this.width, this.height) {
    canvas = html.CanvasElement(width: width, height: height);
    ctx = canvas.context2D;

    // Register with Flutter
    ui.platformViewRegistry.registerViewFactory(
      'emulator-canvas',
          (int viewId) => canvas,
    );
  }

}

In this code, html is a Dart package, and html.CanvasElement creates a native HTML canvas element for us. It is important to note that at this stage the created canvas element is not yet attached to the HTML page.

The registerViewFactory call gives the widget tree access to the created canvas element, and we associate the name emulator-canvas with it.

Let us now see where we will use this class:

import 'package:file_picker/file_picker.dart';
import 'package:flutter/cupertino.dart';
import 'package:flutter/material.dart';
import 'package:flutter/scheduler.dart';
import 'package:flutter_bloc/flutter_bloc.dart';

import 'emulator_canvas.dart';
import 'emulator_controller.dart';

class VideoScreen extends StatefulWidget {
  const VideoScreen({super.key});

  @override
  State<VideoScreen> createState() => _VideoScreenState();
}

class _VideoScreenState extends State<VideoScreen>
    with SingleTickerProviderStateMixin {

  late EmulatorCanvas emCanvas;

  @override
  void initState() {
    super.initState();

    emCanvas = EmulatorCanvas(320, 200);

  }

  @override
  void dispose() {
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      backgroundColor: const Color(0xFF4040E0),
      body: Column(
        children: [

          const SizedBox(
            width: 640,
            height: 400,
            child: HtmlElementView(viewType: 'emulator-canvas'),
          ),
        ],
      ),
    );
  }
}

Firstly, VideoScreen is a StatefulWidget, which means its State object is kept and reused, and is not destroyed with every state change.

As the name StatefulWidget implies, the widget carries mutable state. For this reason our widget is tied to the class _VideoScreenState. Something interesting about this class is that it is declared with SingleTickerProviderStateMixin. This mixin lets the state create a ticker, whose callback is synchronised to screen refreshes and is called once per screen refresh.

Here we also declare emCanvas, an instance of EmulatorCanvas. Finally, in the widget we return from the build method, we embed the canvas through an HtmlElementView with the view type emulator-canvas. Previously we registered the canvas with Flutter under that name, which is how the returned widget gets associated with it.

As an additional extra, it is interesting to inspect the HTML in the browser:


One can actually see the canvas element we defined in our code.

Obviously all this needs to be wired up all the way to the main screen in main.dart, which we will cover in the next section.

Moving towards a Controller Architecture

Up to now the centre of our C64 emulator in Flutter was a Bloc. In our Bloc we emitted a new state with every frame, which instructed the front end to create a new widget instance for displaying the new frame.

As indicated earlier in this post, this is really clunky considering we need to render 60 frames a second. To get around this, we will discard our Bloc idea and rather opt for a Controller architecture.

Let us start at the highest level, main.dart, which hosts our Flutter application:

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();
  final controller = await EmulatorController.create();
  runApp(
    MaterialApp(
      home: RepositoryProvider.value(
        value: controller,
        child: const EmulatorRoot(),
      )
    ));
}

Now, as you might have guessed, the core functionality of our emulator will live in the class EmulatorController. We will perform more or less the same things we performed in our old C64Bloc class.

You might also remember that when we were previously writing our C64Bloc class, we did some asynchronous tasks, loading the three C64 ROMs from disk, and we had to use the keyword await to wait until everything was in memory before continuing. We need to do something similar with our controller, which requires us to declare our main() method as async, in order to use await.

In our main() method, we also make use of a RepositoryProvider. This enables us to inject our controller class further down in our tree where it might be needed.

To help us get oriented, let us have a look at the implementation of EmulatorRoot:

class EmulatorRoot extends StatefulWidget {
  // final String name;
  const EmulatorRoot({super.key});

  @override
  State<EmulatorRoot> createState() => _EmulatorRootState();
}

class _EmulatorRootState extends State<EmulatorRoot> {
  int _currentIndex = 0; // 0 = debug, 1 = video

  @override
  Widget build(BuildContext context) {
    EmulatorController controller = context.read<EmulatorController>();
    return Scaffold(
      appBar: AppBar(
        title: const Text("C64 Emulator"),
        actions: [
          IconButton(
            icon: const Icon(Icons.bug_report),
            onPressed: () => setState(() => _currentIndex = 0),
          ),
          IconButton(
            icon: const Icon(Icons.tv),
            onPressed: () => setState(() => _currentIndex = 1),
          ),
        ],
      ),
      body: IndexedStack(
        index: _currentIndex,
        children: [
          // DebugScreen(),
          KeyboardListener (
            // VideoScreen(),
            focusNode: controller.focusNode,
            autofocus: true,
            onKeyEvent: (event) => {
              if (event is KeyDownEvent) {
                controller.keyboardEvent(event.logicalKey, true)
                // context.read<C64Bloc>().add(KeyC64Event(keyDown: true, key: event.logicalKey))
              } else if (event is KeyUpEvent) {
                controller.keyboardEvent(event.logicalKey, false)
                // context.read<C64Bloc>().add(KeyC64Event(keyDown: false, key: event.logicalKey))
              }
            },
            child: const VideoScreen(),

          )
        ],
      ),
    );
  }
}

This is yet another StatefulWidget with associated state. The basic idea outlined here is to have a tabbed view, showing a debug view on one tab and the screen of the running emulator on another tab. We will not show how to implement the debug tab in this series; it is just shown as a possibility of how this emulator can develop.

For the tab showing the running emulator screen, we basically show VideoScreen, which we developed earlier on. We have also wrapped this screen in a KeyboardListener for intercepting keystrokes. This is similar to what we did in previous posts.

You will also see that on key events we call controller.keyboardEvent. This interfaces our emulator with the keyboard.

Next, let us look at the internals of EmulatorController:

class EmulatorController implements KeyInfo{
  final Memory memory = Memory();
  final List<int> matrix = [0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff];
  late final Cpu _cpu = Cpu(memory: memory);
  late final Tape _tape;
  late final Alarms alarms = Alarms();
  FocusNode focusNode = FocusNode();
  bool tapeLoaded = false;

  EmulatorController._();

  static Future<EmulatorController> create() async {
    final instance = EmulatorController._();
    await instance._init();
    return instance;
  }

  Future<void> _init() async {
    final basicData = await rootBundle.load("assets/basic.bin");
    final characterData = await rootBundle.load("assets/characters.bin");
    final kernalData = await rootBundle.load("assets/kernal.bin");
    Cia1 cia1 = Cia1(alarms: alarms);
    cia1.setKeyInfo(this);
    Tape tape = Tape(alarms: alarms, interrupt: cia1);
    _tape = tape;
    memory.setCia1(cia1);
    memory.populateMem(basicData, characterData, kernalData);
    memory.setTape(tape);
    _cpu.setInterruptCallback(() => cia1.hasInterrupts());
    _cpu.reset();
  }
...
}

This is pretty much the same as what we did in our Bloc. There is, however, one extra thing we do: we hide the constructor, so to get a new instance we need to call create(), which gives us a properly initialised instance.

Implementing Screen Refreshing

So, we have just implemented the basics for a controller architecture for our Flutter C64 emulator. Let us next focus on how to render the frames.

Firstly, within video_screen.dart, we need to make this file aware of our controller within the initState() method:

  @override
  void initState() {
    super.initState();

    controller = context.read<EmulatorController>();
    emCanvas = EmulatorCanvas(320, 200);
    controller.setCanvasArray(emCanvas.getFrameBuffer());
    ...
  }

We use context.read to get the instance of the controller which was injected higher up in the tree. Once we have the controller instance, we pass it the frame buffer of the canvas, so that our controller can draw into it.

In an earlier section of this post I briefly talked about SingleTickerProviderStateMixin, which syncs frame refreshes with the refresh rate of the screen. We will now take this implementation further inside the initState() method.

  void initState() {
    super.initState();

    controller = context.read<EmulatorController>();
    emCanvas = EmulatorCanvas(320, 200);
    controller.setCanvasArray(emCanvas.getFrameBuffer());

    _ticker = createTicker((Duration elapsed) {
      if ((elapsed.inMilliseconds - lastProcessed) < 16) {
        return;
      }
      lastProcessed = elapsed.inMilliseconds;

      controller.executeChunk();
      emCanvas.renderFrame();
    });

    _ticker.start();
  }

Here we create a ticker instance, whose callback will execute with every screen refresh. With controller.executeChunk(), we tell our emulator to execute one frame's worth of cycles. This is more or less the same approach we implemented previously.

You will also see that we throttle the rendering a bit to get close to the real speed of a C64, by just exiting the ticker body if it is not yet time to display the next frame. Having said that, most displays refresh at a rate of 60Hz. So, if you take out the early return, your emulator should still run at more or less the same speed as a real C64.
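
The throttling check can be isolated into a small, pure Dart helper, which makes it easier to reason about outside of Flutter. This is only a sketch: FrameThrottle and shouldRun are hypothetical names that do not appear in the emulator source, and the 16 ms threshold simply mirrors the value used in the ticker callback above.

```dart
/// Hypothetical helper (not part of the emulator source): decides whether
/// a ticker callback should process a frame or skip this tick.
class FrameThrottle {
  FrameThrottle({this.minIntervalMs = 16});

  /// Minimum number of milliseconds between two processed frames.
  final int minIntervalMs;

  int _lastProcessedMs = 0;

  /// Returns true, and records the time, when at least [minIntervalMs]
  /// have passed since the last processed frame. Returns false otherwise.
  bool shouldRun(int elapsedMs) {
    if (elapsedMs - _lastProcessedMs < minIntervalMs) {
      return false;
    }
    _lastProcessedMs = elapsedMs;
    return true;
  }
}
```

On a 120Hz display the ticker fires roughly every 8 ms, so with this check about every second tick is skipped. On a 60Hz display the ticks are already about 16.7 ms apart, so hardly any tick is skipped.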

Next, let us look at the implementation of controller.executeChunk():

  void executeChunk() {
    int targetCycles = _cpu.getCycles() + 16666;
    do {
      _cpu.step();
      alarms.processAlarms(_cpu.getCycles());
    } while (_cpu.getCycles() < targetCycles);
    memory.renderDisplayImage();
  }

So, in this method we execute a frame's worth of CPU cycles and we render a frame to be displayed. The array we render to is the one we passed through earlier with controller.setCanvasArray().
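
To see where a constant like 16666 plausibly comes from, and what the do/while loop does with it, here is a small stand-alone sketch. It assumes a CPU clock of about 1,000,000 cycles per second and 60 frames per second, which is consistent with the constant but is an assumption on my part. StubCpu, cyclesPerFrame and runFrame are hypothetical names used for illustration only.

```dart
/// Number of CPU cycles to run per rendered frame (integer division).
/// The default clock and frame rate are assumptions, not measured values.
int cyclesPerFrame({int clockHz = 1000000, int framesPerSecond = 60}) =>
    clockHz ~/ framesPerSecond;

/// Hypothetical stand-in for the CPU: every step costs a fixed number
/// of cycles, whereas real instructions vary in length.
class StubCpu {
  StubCpu(this.cyclesPerStep);

  final int cyclesPerStep;
  int cycles = 0;

  void step() => cycles += cyclesPerStep;
}

/// Steps the CPU until at least [budget] cycles have been consumed and
/// returns the number of cycles that were actually executed.
int runFrame(StubCpu cpu, int budget) {
  final start = cpu.cycles;
  final target = start + budget;
  do {
    cpu.step();
  } while (cpu.cycles < target);
  return cpu.cycles - start;
}
```

Because an instruction cannot be cut in half, a frame usually overshoots its budget by a few cycles. With 7-cycle steps, for example, the loop above stops at 16667 cycles rather than 16666.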

Let us finally have a look at the implementation of emCanvas.renderFrame():

  void renderFrame() {
    ctx.putImageData(imageData, 0, 0);
  }

Here we are in raw HTML territory. With ctx.putImageData, we write to the actual HTML canvas element we defined earlier.
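
The buffer behind a canvas ImageData object holds four bytes per pixel, in the order red, green, blue, alpha, row by row. The sketch below shows how a pixel coordinate maps to an offset in such a buffer. The helpers createFrameBuffer and setPixel are hypothetical and not taken from the emulator source, and a plain Uint8List stands in for the canvas buffer so that the snippet runs without a browser.

```dart
import 'dart:typed_data';

/// Allocates a zeroed buffer for a width x height RGBA image.
/// (Hypothetical helper; a stand-in for the buffer behind ImageData.)
Uint8List createFrameBuffer(int width, int height) =>
    Uint8List(width * height * 4);

/// Writes one fully opaque pixel at (x, y) into an RGBA buffer that
/// belongs to an image of the given width.
void setPixel(List<int> buffer, int width, int x, int y, int r, int g, int b) {
  final offset = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
  buffer[offset] = r;
  buffer[offset + 1] = g;
  buffer[offset + 2] = b;
  buffer[offset + 3] = 255; // alpha: fully opaque
}
```

For the 320 x 200 canvas used in this post, such a buffer is 256000 bytes long, and it is allocated once and reused for every frame.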

In Summary

In this post we reworked our C64 emulator to a Controller architecture in order to fix a memory leak. With our new architecture, we don't create a new widget with every frame, but rather maintain a single HTML Canvas element throughout the life cycle of our emulator, to which we render all frames.

In the next post we will add some colors to our C64 frames, together with some borders. We will also emulate the drawing of the border in a more granular fashion, in order to accurately simulate the flashing borders while loading the game from a tape image.

As usual, you can find the source for every post on my GitHub page. For this post, you can go here

Until next time! 

Saturday, 17 January 2026

A Commodore 64 Emulator in Flutter: Part 15

Foreword

In the previous post we introduced the CIA as a separate class. Previously we mimicked the CIA's operation by just forcing a hard interrupt every 1/60th of a second, just to get our emulator to work, avoiding the complexities of implementing and scheduling timers.

Thus, in the previous post we delved deeper and implemented the CIA. This was actually needed as a precursor to this post, where we will be implementing tape loading functionality, which requires more granular operation of the CIA.

Adding Front end Interaction

The logical place to start is to add functionality to our front end for attaching a tape image. So within main.dart, which is basically our front end code, we add two buttons to the RunningState front end:

...
} else if (state is RunningState) {
              return KeyboardListener(
                focusNode: context.read<C64Bloc>().focusNode,
                autofocus: true,
                onKeyEvent: (event) => {
                  if (event is KeyDownEvent) {
                    context.read<C64Bloc>().add(KeyC64Event(keyDown: true, key: event.logicalKey))
                  } else if (event is KeyUpEvent) {
                    context.read<C64Bloc>().add(KeyC64Event(keyDown: false, key: event.logicalKey))
                  }
                },
                child: Column(
                  children: [
                    Row(
                      children: [
                        IconButton(
                            icon: Icon(Icons.folder),
                            onPressed: () async {
                              context.read<C64Bloc>().add(LoadTapeRequested());
                            }),
                        IconButton(
                            icon: Icon(Icons.play_arrow),
                            onPressed: !state.tapeLoaded ? null : () async {
                              context.read<C64Bloc>().add(PlayTapeRequested());
                            })
                      ],
                    ),
                    RawImage(
                      image: state.image, scale: 0.5),
                ],
              ));
            }
...

As usual, we will add LoadTapeRequested and PlayTapeRequested to c64_event.dart.

Now we need to listen for every event within our Bloc class:

    on<PlayTapeRequested>((event, emit) {
      _tape.playTape();
    });

    on<LoadTapeRequested>((event, emit) async {
      final result = await FilePicker.platform.pickFiles(
        withData: true,
        type: FileType.custom,
        allowedExtensions: ['tap', 't64'],
      );

      if (result == null) return;
      tapeLoaded = true;
      _tape.setTapeImage(result.files.single.bytes!);
    });

For PlayTapeRequested we simulate the press of a play button. _tape is an instance of a class Tape, which we will define later.

With the LoadTapeRequested event, we basically present a file dialogue where the user selects the tape image from the local file system, and we then pass it to the _tape instance.

The Tape Class

Let us start to implement the Tape class, which will emulate the functionality of Tape loading.

We start with a simple class:
class Tape implements TapeMemoryInterface {
  late Iterator _tapeImage;
  bool _playSelected = false;
  Alarms alarms;
  TapeInterrupt interrupt;
  Alarm? _tapeAlarm;

  Tape({required this.alarms, required this.interrupt});
}

Before we go into detail on how to implement this class, let us take a step back and think about how tape loading works on a C64.

On a physical tape, which you used back in the day to load games on a C64, you had pulses of varying lengths. It all boils down to basically two types of pulses: a short pulse or a long pulse, corresponding to either a 0 or a 1, which is a bit. The most basic element of data on a computer 😀.

Now, consider the loading of data from a physical tape on a C64. The end of a pulse is indicated when it changes polarity from positive to negative or vice versa. This change of polarity causes an interrupt on the CPU, via the FLAG pin on CIA1. The tape loading routines inside the Kernal ROM use one of the CIA timers to measure the pulse widths, and decide based on that if each bit is a zero or a one.

The tape image files of old games you can download from the Internet are a sequence of pulse widths. With all this info at hand, it is starting to become apparent what the Tape class should do. Using these pulse widths, it should schedule an alarm for each pulse, the same structure we used previously within the CIA, and trigger an interrupt when it lapses. The private fields I defined above in the Tape class also hint towards this.

Let us have a look at the variable _tapeImage. It is of type Iterator. With this data structure we can iterate through the tape image pulse width by pulse width, without worrying about a counter that needs updating every time.

At this point we are ready to implement the method setTapeImage(), which we mentioned previously:

  setTapeImage(type_data.Uint8List tapeData) {
    _tapeImage = tapeData.iterator;
    for (var i = 0; i < 21; i++) {
      _tapeImage.moveNext();
    }
    populateRemainingPulses();
  }

A Uint8List provides you with an Iterator. In a tape image the header occupies the first 20 bytes, so after 21 calls to moveNext() the iterator sits on the first byte of the actual pulse width data.

Once we are at the actual pulse width data, we need to know the width of the first pulse. This is the function of the method populateRemainingPulses() :

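  populateRemainingPulses() {
    var val = _tapeImage.current;
    if (val != 0) {
      // Normal pulse: one byte, in units of 8 CPU cycles.
      _remainingPulseTicks = val << 3;
      _tapeImage.moveNext();
    } else {
      // Long pulse: skip the zero marker first, then read the next
      // three bytes as a 24-bit cycle count, low byte first.
      _tapeImage.moveNext();
      var byte0 = _tapeImage.current;
      _tapeImage.moveNext();
      var byte1 = _tapeImage.current;
      _tapeImage.moveNext();
      var byte2 = _tapeImage.current;
      _tapeImage.moveNext();
      _remainingPulseTicks = (byte2 << 16) | (byte1 << 8) | byte0;
    }
  }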

Here we need to understand the TAP format a bit better. Usually every byte indicates one pulse width. We then need to multiply this value by 8 to get to the width in CPU clock cycles.

The exception to the rule is when the byte value is zero. Then the next three bytes indicate the pulse width as an absolute number of CPU clock cycles, least significant byte first, i.e. no multiplication by 8 is necessary.
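
To make the two cases concrete, here is a small stand-alone sketch that decodes a list of TAP data bytes into pulse widths in CPU cycles. It works on a plain list with an index instead of the Iterator used in the Tape class, and the function name decodePulses is hypothetical. It also assumes the data passed in starts at the first pulse byte, i.e. the header has already been skipped.

```dart
/// Decodes TAP pulse data into pulse widths, measured in CPU cycles.
/// [data] is assumed to start at the first pulse byte (header skipped).
List<int> decodePulses(List<int> data) {
  final pulses = <int>[];
  var i = 0;
  while (i < data.length) {
    final value = data[i++];
    if (value != 0) {
      // Normal case: one byte, in units of 8 CPU cycles.
      pulses.add(value * 8);
    } else {
      // Zero marker: the next three bytes hold the cycle count,
      // least significant byte first.
      if (i + 3 > data.length) break; // truncated data: stop decoding
      pulses.add(data[i] | (data[i + 1] << 8) | (data[i + 2] << 16));
      i += 3;
    }
  }
  return pulses;
}
```

For example, the bytes 0x30, 0x00, 0x10, 0x27, 0x00, 0x42 decode to three pulses of 384, 10000 and 528 cycles.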

You will see that I am assigning the calculated value to a private variable _remainingPulseTicks. We are following a similar approach here to the timers in the CIA, which we implemented in the previous post. It functions almost as a countdown timer, and is updated by the alarm subsystem.

At this point a key question is: what kicks off the tape loading process? The answer lies in memory location 1 of the C64 memory. This memory location is well known for switching banks of memory in and out of view. However, this memory location also hosts two bits for tape control:
  • Bit 4 - Cassette Switch Sense; reads 0 while a button such as play is pressed, 1 otherwise
  • Bit 5 - Cassette Motor Control; 0 = On, 1 = Off
The key here is bit 5, turning the Cassette motor on and off, which acts as the starting point for the tape loading process. Bit 4 tells us when the user presses the play button, which we will cover later.

With this in mind, let us create the following method in our Tape class:

  @override
  setMotor(bool on) {
    if (on == _currentMotorOn) {
      return;
    }
    _currentMotorOn = on;
    if (on) {
      setupAlarms();
    } else {
      _tapeAlarm!.unlink();
      _remainingPulseTicks = _tapeAlarm?.getRemainingTicks();
    }
  }

This method will be invoked when we write to memory via our Memory class. We will deal with this plumbing later.

If the motor is switched on, we need to set up alarms. This is similar to what we did with timers in the previous post. Before we move on to the implementation of setupAlarms(), let's have a look at what happens in the else branch, when the motor is switched off. In that case we unlink the alarm from the list of alarms, and we set _remainingPulseTicks to the remaining ticks of the pulse. This is so that when the motor resumes, we can carry on from where we left off in the pulse.

Now, let us look at setupAlarms():

  setupAlarms() {
    _tapeAlarm ??= alarms.addAlarm( (remaining) => processTapeAlarm(remaining));
    if (_tapeAlarm!.list == null) {
      alarms.reAddAlarm(_tapeAlarm!);
    }
    _tapeAlarm!.setTicks(_remainingPulseTicks);
  }

Here we see the actual use of _remainingPulseTicks, when the motor is resumed.

Let us now have a look at the method processTapeAlarm() :

  processTapeAlarm(int remaining) {
    interrupt.triggerInterrupt();
    populateRemainingPulses();
    _tapeAlarm!.setTicks(_remainingPulseTicks + remaining);
  }

This method is called when the pulse has expired. Here we trigger an interrupt and schedule the next alarm.

Finally, there is one remaining method we need to implement:

  @override
  int getCassetteSense() {
    return _playSelected ? 0 : 0x10;
  }

This basically provides bit 4 of memory location 1, which will be used by our memory class. More on this later.

Changes to the CIA class

Let us now have a look at the changes required in our CIA class.

There are quite a few changes, so I will just cover them at a high level.

First of all, we will need to implement TimerB as well. The tape loading routine in the Kernal ROM uses this timer quite extensively. All I will say here is that it is basically a copy and paste exercise from TimerA.

Next, we will look at the method hasInterrupts(), which is used by our CPU class to trigger an interrupt:

  hasInterrupts() {
    if (timerAintOccurred && timerAinterruptEnabled) {
      return true;
    } else if (timerBintOccurred && timerBinterruptEnabled) {
      return true;
    } else if (tapeInterruptOccurred && tapeInterruptEnabled) {
      return true;
    } else {
      return false;
    }
  }

You will notice that I have included timerB interrupts and tape interrupts in the check as well.

Next, let us look at the setMem() function in the CIA class:

  setMem(int address, int value) {
...
      case 0xD:
        if ((value & 0x80) != 0) {
          timerAinterruptEnabled = ((value & 1) == 1) ? true : timerAinterruptEnabled;
        } else {
          timerAinterruptEnabled = ((value & 1) == 1) ? false : timerAinterruptEnabled;
        }
        if ((value & 0x80) != 0) {
          timerBinterruptEnabled = ((value & 2) == 2) ? true : timerBinterruptEnabled;
        } else {
          timerBinterruptEnabled = ((value & 2) == 2) ? false : timerBinterruptEnabled;
        }
        if ((value & 0x80) != 0) {
          tapeInterruptEnabled = ((value & 16) == 16) ? true : tapeInterruptEnabled;
        } else {
          tapeInterruptEnabled = ((value & 16) == 16) ? false : tapeInterruptEnabled;
        }
...
  }

As you might remember, register D in the CIA is the interrupt mask register. Here we have added timerB and the tape interrupt as interrupts we can enable or mask out.
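
The write rule for this register is compact enough to express with bit operations: bit 7 of the written value says whether the selected mask bits are set or cleared, and each of the lower bits that is 1 selects one interrupt source. The sketch below keeps the whole mask in a single integer instead of separate booleans. The function name applyMaskWrite is hypothetical, and this is an alternative way of writing the same rule, not the code used in the emulator.

```dart
/// Applies a write to the CIA interrupt mask register.
/// [currentMask] holds the enabled sources in bits 0..4, and [value]
/// is the byte written by the program.
int applyMaskWrite(int currentMask, int value) {
  final selected = value & 0x1f; // bits 0..4 pick the sources to change
  return (value & 0x80) != 0
      ? currentMask | selected // bit 7 set: enable the selected sources
      : currentMask & ~selected; // bit 7 clear: disable them
}
```

Writing 0x90, for instance, enables the tape interrupt (bit 4) and leaves the other sources untouched, while writing 0x7f disables all of them.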

Finally, let us look at the getMem() method:

  int getMem(int address) {
...
  case 0xD:
        var value = 0;
        if (timerAintOccurred) {
          timerAintOccurred = false;
          value = value | 0x81;
        }
        if (timerBintOccurred) {
          timerBintOccurred = false;
          value = value | 0x82;
        }
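        if (tapeInterruptOccurred) {
          tapeInterruptOccurred = false;
          // The tape (FLAG) interrupt is bit 4, matching the mask check
          // (value & 16) in setMem(), so we report 0x10 plus bit 7.
          value = value | 0x90;
        }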
        return value;
    }
...
}

Here we are reading the same register as earlier, but reading doesn't return the masks, but the actual interrupts that occurred. Once again, we have added timerB and the tape interrupt. Once this register has been read, we also clear all occurred interrupts.
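
The read-and-clear behaviour can be sketched in isolation as follows. This is a simplified model with hypothetical names (InterruptFlags, raise, read). It follows the approach above of setting bit 7 whenever any interrupt occurred, and it assumes the usual CIA bit assignments of bit 0 for Timer A, bit 1 for Timer B and bit 4 for the FLAG pin used by the tape.

```dart
/// Simplified, hypothetical model of the CIA interrupt data register.
class InterruptFlags {
  static const int timerA = 0;
  static const int timerB = 1;
  static const int flagPin = 4; // tape pulses arrive on this pin

  int _pending = 0;

  /// Marks the interrupt source with the given bit number as occurred.
  void raise(int bit) => _pending |= 1 << bit;

  /// Returns the occurred interrupts, with bit 7 set if there were any,
  /// and clears them, so a second read returns 0.
  int read() {
    final value = _pending == 0 ? 0 : _pending | 0x80;
    _pending = 0;
    return value;
  }
}
```

If both Timer A and the tape interrupt have occurred, the first read returns 0x91 and the next read returns 0.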

Changes to the Memory Class

Let us now have a look at the changes required in our memory class, for implementing tape loading.

First change is in the setMem() method:

  setMem(int value, int address ) {
    if ((address >> 8) == 0xDC) {
      cia1.setMem(address, value);
    } else if (address == 1) {
      _ram.setInt8(address, value);
      _tape.setMotor((value & 0x20) == 0 );
    } else {
      _ram.setInt8(address, value);
    }
  }

So, as mentioned earlier, bit 5 of memory location 1 controls the tape motor. Here we implement it, so that during a memory write to this location, we call setMotor appropriately.

Next, let us change the getMem() method:

  int getMem(int address) {
    _readCount++;
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else if ((address >> 8) == 0xDC ) {
      return cia1.getMem(address);
    } else if (address == 1) {
      var value = _ram.getUint8(address) & 0xef;
      return value | _tape.getCassetteSense();
    } else {
      return _ram.getUint8(address);
    }
  }

Here we add the Cassette sense bit when reading the byte from memory location one. As mentioned previously, the cassette sense bit indicates if we pressed the play button.

The results

With everything coded, let us see what the screen looks like when we spin up our emulator. At startup, our screen looks like this:


Notice we have two new icons at the top, a folder icon and a play button. We use the folder icon to locate the tape image file from our local file system. Once we have selected a tape image, the play button becomes enabled.

The play button actually resembles the play button on a real C64 Datasette unit hooked up to a C64. So, when the screen shows "Press Play on tape" and you hit the play button, the loading process commences.

Let's do the whole sequence. With the tape image attached, type LOAD at the flashing cursor, and then hit ENTER. Your screen will now look like this:

Now press the play button next to the folder button.

With the play button pressed, the following prompts will pop up:


After a number of seconds, the screen will look like this:


This is the hooray moment. When you see FOUND DAN DARE, or whatever the file name in the tape image you used is, you know you have implemented tape loading correctly.

One thing that immediately felt off when testing the tape loading, was that it felt much longer than usual before it showed "FOUND...". So I did some comparative benchmarks.

First, to get a realistic time, I measured how long it takes to find the file in the Vice C64 emulator. It was about 17 seconds.

Then I did the measurement in my Flutter emulator. In my Emulator, it took 24 seconds. Quite a lot slower!

I did some further in-depth investigation. After lots of pain, I discovered that the speed issue was caused by not building the app in release mode. I made a subtle assumption that if I start it in IntelliJ with the Play button and not with the Debug button, everything will be optimised. Was I wrong!

Let us see how to run our project in release mode. Firstly, open a terminal window and cd into your project folder. Then run the following command:

flutter build web --release

After the build is finished, you will find the result in build/web within the project. cd into this folder. We now need a web server to serve this, and the easiest one to use is Python. So, within this folder, run the following command:

python -m http.server 8000

Now access the emulator in the browser at http://localhost:8000/

This time around our tape loading times match up.

While I was trying to figure out why my emulator was slow, I also discovered that memory usage was steadily climbing. When I fixed the speed issue with the release build, I wondered if the memory leak was also fixed. So, I left it running for about half an hour, and then hovered over the tab in Chrome to see the memory usage, and sadly the memory leak was still there:


If you leave it running longer, it will eventually go over 1G of memory usage.

In the next post we will tackle this issue.

In Summary

In this post we implemented Tape image loading. An unfortunate issue I encountered was a memory leak.

In the next post I will see if I can fix this memory leak.

Until next time!

Wednesday, 3 December 2025

A Commodore 64 Emulator in Flutter: Part 14

Foreword

In the previous post we managed to interface the keyboard to our C64 Flutter emulator. With that implemented, we were able to enter a simple Basic program into our emulator and run it.

Now, my ultimate goal for writing this emulator, is to be able to run the game Dan Dare in our emulator, loading it from a tape image.

So, to achieve this end goal, the next goal is for our emulator to be able to load a tape image. On a C64, loading from tape relies heavily on the features of a CIA (Complex Interface Adapter). The features tape loading relies on are access to the read head of the tape, timers and interrupts.

Up to now we have been mimicking some of the features of a CIA chip. The address range of the CIA chip is within DC00-DCFF. It immediately comes to mind that in the previous post we implemented two of the registers of the CIA, DC00 and DC01, for keyboard access.

We also implicitly implemented a timer and interrupts in our emulator, interrupting the CPU every 1/60 of a second, so that the cursor can flash and keyboard entry could work. However, we blindly forced these interrupts as a quick hack, just to get the cursor and keyboard to work. We didn't even consider the values set in the CIA for setting the timer.

However, to implement tape loading we would not be able to get away with a quick hack 😀 We will need to emulate the CIA properly for this purpose.

So, in this post we will implement CIA emulation bit by bit. This will include revisiting our current keyboard and timer interrupt implementation (i.e. the 1/60 second interrupt), and implementing it properly as part of the CIA implementation.

We will probably only get to tape emulation in the next post.

Enjoy!

Creating the CIA skeleton

Let's begin our journey by creating a CIA class just as a skeleton. This class will evolve over time to contain all the functionality that a CIA provides:

class Cia1 {
  setMem(int address, int value) {
    print("setMem ${address.toRadixString(16)} ${value.toRadixString(16)}");
  }

  int getMem(int address) {
    print("getMem ${address.toRadixString(16)}");
    return 0;
  }
}
Here we do something interesting. We log every write to or read from the CIA address range. With this we can see which functionality is actually used, so that we can implement just the bare minimum functionality of the CIA chip.

For this exercise we also want to disable the hard coded interrupts happening every 1/60 second, to avoid any potential side effects during our CIA journey:

  step() {
/*
    if ((_cycles > 1000000) &&((_cycles % 16666) < 30) && (_i == 0)) {
      push(pc >> 8);
      push(pc & 0xff);
      push((_n << 7) | (_v << 6) | (2 << 4) | (_d << 3) | (_i << 2) | (_z << 1) | _c);
      _i = 1;
      pc = memory.getMem(0xfffe) | (memory.getMem(0xffff) << 8);
    }
*/
    var opCode = memory.getMem(pc);
    pc++;
    var insLen = CpuTables.instructionLen[opCode];
    ...
  }
Next we need to make an instance of this class and inject it into our Memory class:

  C64Bloc() : super(InitialState()) {
    memory.setKeyInfo(this);
    on<InitEmulatorEvent>((event, emit) async {
      final basicData = await rootBundle.load("assets/basic.bin");
      final characterData = await rootBundle.load("assets/characters.bin");
      final kernalData = await rootBundle.load("assets/kernal.bin");
      Cia1 cia1 = Cia1();
      memory.setCia1(cia1);
      ...
    }
    ...
  }
We modify the actual Memory class like this:

class Memory {
...
  late final Cia1 cia1;
...
  setCia1(Cia1 cia1) {
    this.cia1 = cia1;
  }
...
  setMem(int value, int address ) {
    if ((address >> 8) == 0xDC) {
      cia1.setMem(address, value);
    } else {
      _ram.setInt8(address, value);
    }
  }

  int getMem(int address) {
    _readCount++;
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else if ((address >> 8) == 0xDC ) {
      return cia1.getMem(address);

    /*else if (address == 0xDC01) {
      return keyInfo.getKeyInfo(_ram.getUint8(0xDC00));*/
    } else {
      return _ram.getUint8(address);
    }
  }
}

So, every time an address starts with DC we send this access to the CIA instance. You will also see that I have commented out the explicit access to the DC01 register, which we added in the previous post for keyboard access. We will implement this functionality at a later stage into our CIA class.
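To make the address decoding a bit more tangible, here is a small standalone sketch of the same checks. The names Region, regionFor and romOffset are made up for this illustration and are not part of the emulator; only the address ranges and the 0x1fff mask come from the Memory class above.

```dart
// Hypothetical helper names; only the address ranges and masks come from
// the Memory class shown above.
enum Region { basicRom, kernalRom, cia1, ram }

Region regionFor(int address) {
  if (address >= 0xA000 && address <= 0xBFFF) return Region.basicRom;
  if (address >= 0xE000 && address <= 0xFFFF) return Region.kernalRom;
  if ((address >> 8) == 0xDC) return Region.cia1; // DC00-DCFF
  return Region.ram;
}

// Both ROMs are 8KB, so the low 13 bits index into the ROM image.
int romOffset(int address) => address & 0x1fff;

String hex(int value) => value.toRadixString(16).padLeft(4, '0');

void main() {
  for (final address in [0xA000, 0xBFFF, 0xDC0D, 0xDD00, 0xFFFE, 0x0400]) {
    final region = regionFor(address);
    final isRom = region == Region.basicRom || region == Region.kernalRom;
    final detail = isRom ? ' offset ${hex(romOffset(address))}' : '';
    print('${hex(address)} -> ${region.name}$detail');
  }
}
```

Shifting the address right by eight bits leaves only the page number, which is why a single comparison with 0xDC catches the whole DC00-DCFF range.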

Now, let us start the emulator and watch the log output:

setMem dc0d 7f
setMem dc00 7f
setMem dc0e 8
setMem dc0f 8
setMem dc03 0
setMem dc02 ff
setMem dc04 95
setMem dc05 42
setMem dc0d 81
getMem dc0e
setMem dc0e 11
setMem dc04 25
setMem dc05 40
setMem dc0d 81
getMem dc0e
setMem dc0e 11
So, let us quickly see what is going on here. With the write to DC0D, we disable all interrupts going to the CPU.

The write to address DC00 is for the keyboard stuff, which we don't worry about at the moment.

Next we see the value 8 written to registers DC0E and DC0F. This puts timers A and B in One shot mode. 

Let's skip a couple of memory writes and get to the writes to locations DC04 and DC05. These are the registers for setting the duration of timer A. Each count is one tick of the 1MHz clock which also drives the 6510. DC04 is the low byte of the value and DC05 the high byte. So, this is 4295 hexadecimal, which translates to 17045, which is close to the hard coded count we used previously for triggering an interrupt every 1/60 second.

Next, we do an assignment to location DC0D. This is the interrupt control register. We see in the value assigned that the least significant bit is set. This is the bit that controls interrupts from timer A. We also see that the most significant bit is set in the assigned value. If this bit is a 1, it means enable every interrupt for which the corresponding bit in the written byte is a one. So, in this case we have enabled interrupts from timer A.

Finally, we see the value 11 hexadecimal assigned to register DC0E, a sequence that appears twice in the log. With the assignment, two things are happening. Firstly, bit 4 is set, which means force load the timer value from the latch, which in our case would be the hex value 4295. The second thing that happens, with bit 0 set, is that timer A is finally started.

Something else is also happening subtly. Previously I mentioned that we set bit 3 to 1, meaning the timer was in one shot mode. Now, however, we are setting this bit to zero, which means that timer A will operate in continuous mode. This means that after the timer has elapsed it will automatically restart, which means we will get periodic interrupts every 1/60 second.
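The decoding of the logged values can be double checked with a few lines of Dart. This is only a sketch for reading the log: the names latchValue, applyMaskWrite and TimerAControl are invented for this example and do not exist in the emulator.

```dart
// Illustrative helpers for reading the log; none of these names exist in
// the emulator itself.

// Timer latch: DC04 holds the low byte, DC05 the high byte.
int latchValue(int low, int high) => (high << 8) | low;

// Write to DC0D: bit 7 decides whether the bits that are 1 in the written
// value get set (enabled) or cleared (disabled) in the interrupt mask.
int applyMaskWrite(int mask, int value) {
  final bits = value & 0x1f; // the CIA has five interrupt sources
  return (value & 0x80) != 0 ? mask | bits : mask & ~bits;
}

// Write to DC0E: the three bits discussed in the text.
class TimerAControl {
  final bool start; // bit 0
  final bool oneShot; // bit 3
  final bool forceLoad; // bit 4

  TimerAControl(int value)
      : start = (value & 0x01) != 0,
        oneShot = (value & 0x08) != 0,
        forceLoad = (value & 0x10) != 0;

  @override
  String toString() => 'start=$start oneShot=$oneShot forceLoad=$forceLoad';
}

void main() {
  print(latchValue(0x95, 0x42)); // 17045
  var mask = applyMaskWrite(0x1f, 0x7f); // first write: everything disabled
  print(mask); // 0
  mask = applyMaskWrite(mask, 0x81); // later write: timer A enabled
  print(mask); // 1
  print(TimerAControl(0x08)); // start=false oneShot=true forceLoad=false
  print(TimerAControl(0x11)); // start=true oneShot=false forceLoad=true
}
```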

Implementing the Alarm System

With the skeleton implemented for the CIA chip, we should start implementing some meat for it. We will start with timer A. 

Now, timer A is very reliant on the number of cycles the CPU executed. There are other operations that are also dependent on the number of CPU cycles executed, like tape loading, drawing pixels at the right moment on the screen and SID sound generation.

I wrote a number of C64 emulators in other programming languages. I must admit, for all these emulators, I would do all these operations that are dependent on CPU cycles on every CPU instruction executed. In the beginning, when I just added timers or tape interrupts, I didn't really see issues.

However, as I added more of these operations dependent on CPU cycles executed, I saw performance gradually worsening, especially when I added more of the VIC-II operations.

Now, what I experienced isn't really something new. There is actually a computer science term for a technique that tries to solve this issue, which is loop fission. The following Wikipedia article explains a bit more about loop fission:


Basically, when you do a lot of things in a single loop iteration, one issue that pops up is that you get more cache misses, and your CPU needs to fetch data from slower RAM more often. By splitting the loop into several separate loops, cache misses should be reduced and performance should therefore improve.

I have dug a bit into the source code of the Vice emulator, and overall they also try to break things into separate loops. They have the whole concept of alarms. For instance, every VIC-II scan line is 63 cycles. So, instead of rendering a bit of a line after every CPU instruction, they set an alarm that will trigger 63 cycles into the future. After every CPU instruction executed, the emulator checks if 63 cycles have passed. Only when the 63 cycles have passed does it execute an alarm handler that renders the full line.

Of course, during the course of the 63 cycles something might change, like the border color, in which case the line will show more than one border color. In such cases, when writing to such a register, one should keep a record of when the color changed.

Let's start to create an alarm subsystem for our emulator. We start with a brief outline:

class Alarms {
  final LinkedList<Alarm> _alarmList = LinkedList<Alarm>();

  Alarms();

  Alarm addAlarm(Function(int remainder) callback) {
    var alarm = Alarm._(this, callback);
    _alarmList.add(alarm);
    return alarm;
  }

}
So, here we have a class containing all our alarms. Internally all the alarms are stored in a linked list, which is a data structure available in Dart. We will visit this in a while.

There is also a method for adding an alarm with a callback, so when the alarm has expired you can call the callback to do some stuff. The remainder parameter indicates by how many cycles we have gone over the alarm threshold by the time a CPU instruction has finished executing.

Let's now focus a bit on the LinkedList story. So, we have a declaration LinkedList<Alarm>(). LinkedList is one of Dart's built-in classes, from the dart:collection library. It is a generic class, for which you need to specify a type when you make an instance. In this case we are saying we will have a LinkedList containing instances of Alarm.

Now, usually with generics, you can define the element type, Alarm in our case, in any way you want. However, with a LinkedList things are a bit more tricky, because every node needs to point to the next and previous node. This is just how a LinkedList is implemented.

Luckily you don't need to worry about implementing all this yourself. You can just let our Alarm class extend LinkedListEntry, then all this will happen automatically:

final class Alarm extends LinkedListEntry<Alarm> {
  late final Alarms _alarms;
  late final Function(int remainder) _callback;

  Alarm._(Alarms alarms, Function(int remainder) callback ) {
    _alarms = alarms;
    _callback = callback;
  }

}

Let us now add some more meat to our alarm class:

final class Alarm extends LinkedListEntry<Alarm> {
  var _targetClock = 0;
...
  setTicks(int ticks) {
    _targetClock = _alarms.getCurrentCpuCount() + ticks;
  }

  getRemainingTicks() {
    return _targetClock - _alarms.getCurrentCpuCount();
  }

  getTargetClock() {
    return _targetClock;
  }

  processAlarm(int remainder) {
    _callback(remainder);
  }
}
Basically I have added some methods for keeping track of how far we are from triggering an alarm. The processAlarm method will be invoked when the alarm is triggered.

Now, let us add some meat to our Alarms class:

class Alarms {
  final LinkedList<Alarm> _alarmList = LinkedList<Alarm>();
  int _cpuCount = 0;

  Alarms();

  Alarm addAlarm(Function(int remainder) callback) {
    var alarm = Alarm._(this, callback);
    _alarmList.add(alarm);
    return alarm;
  }

  reAddAlarm(Alarm alarm) {
    _alarmList.add(alarm);
  }

  int getCurrentCpuCount() {
    return _cpuCount;
  }

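  processAlarms(int cpuCycles) {
    _cpuCount = cpuCycles;
    // Walk the list by hand and fetch the next entry before running the
    // callback: a callback may unlink its own alarm (one shot timers), and
    // changing a LinkedList inside a for-in loop throws a
    // ConcurrentModificationError.
    Alarm? item = _alarmList.isEmpty ? null : _alarmList.first;
    while (item != null) {
      final next = item.next;
      if (item.getRemainingTicks() <= 0) {
        item.processAlarm(item.getRemainingTicks());
      }
      item = next;
    }
  }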
}
The key method added here is processAlarms(). This method loops through the alarms, checks which ones have expired, and calls the callback of each expired alarm.

Another interesting method is reAddAlarm(). It will often happen that we stop a timer, at which point we remove it from the alarm queue so it isn't triggered again. However, there might be a case where we want to start the timer again, at which point we use reAddAlarm() to add it back to the queue, so that it is evaluated again for expiry.
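To see the alarm idea in action outside the emulator, here is a compact, standalone sketch. It is a simplified variation of the two classes above, with shortened member names of my own choosing (add, process, cpuCount, fire), and it assumes every CPU instruction takes 4 cycles purely for the demonstration. The alarm reschedules itself every 63 cycles, the way a scan line alarm might.

```dart
import 'dart:collection';

// Simplified stand-ins for the Alarm and Alarms classes; member names are
// shortened for this sketch and are not the emulator's own.
final class Alarm extends LinkedListEntry<Alarm> {
  final Alarms _alarms;
  final void Function(int remainder) _callback;
  int _targetClock = 0;

  Alarm(this._alarms, this._callback);

  void setTicks(int ticks) {
    _targetClock = _alarms.cpuCount + ticks;
  }

  int get remainingTicks => _targetClock - _alarms.cpuCount;

  // The remainder passed on is zero or negative once the alarm is due.
  void fire() => _callback(remainingTicks);
}

class Alarms {
  final LinkedList<Alarm> _list = LinkedList<Alarm>();
  int cpuCount = 0;

  Alarm add(void Function(int remainder) callback) {
    final alarm = Alarm(this, callback);
    _list.add(alarm);
    return alarm;
  }

  void process(int cycles) {
    cpuCount = cycles;
    Alarm? item = _list.isEmpty ? null : _list.first;
    while (item != null) {
      final next = item.next; // fetch first, the callback may unlink item
      if (item.remainingTicks <= 0) item.fire();
      item = next;
    }
  }
}

// Runs a fake CPU for 200 cycles and records when the alarm fired.
List<int> runDemo() {
  final alarms = Alarms();
  final fired = <int>[];
  late final Alarm lineAlarm;
  lineAlarm = alarms.add((remainder) {
    fired.add(alarms.cpuCount);
    // Adding the (negative) remainder keeps the targets at 63, 126, 189...
    lineAlarm.setTicks(63 + remainder);
  });
  lineAlarm.setTicks(63);
  for (var cycles = 0; cycles <= 200; cycles += 4) {
    alarms.process(cycles);
  }
  return fired;
}

void main() {
  print(runDemo()); // [64, 128, 192]
}
```

The alarm can only fire on an instruction boundary, so it fires a cycle or more late each time. Because the remainder is folded into the next setTicks() call, the lateness does not accumulate.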

Wiring everything together

With all the building blocks created in the previous section, let's now put them together. In C64Bloc let us do some initialisation:

class C64Bloc extends Bloc<C64Event, C64State> implements KeyInfo {
  final Memory memory = Memory();
  final List<int> matrix = [0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff];
  final FocusNode focusNode = FocusNode();
  late final Cpu _cpu = Cpu(memory: memory);
  late final Alarms alarms = Alarms();
  type_data.ByteData image = type_data.ByteData(200*200*4);
  int dumpNo = 0;
  int frameNo = 0;
  Timer? timer;
...
  C64Bloc() : super(InitialState()) {
    on<InitEmulatorEvent>((event, emit) async {
      final basicData = await rootBundle.load("assets/basic.bin");
      final characterData = await rootBundle.load("assets/characters.bin");
      final kernalData = await rootBundle.load("assets/kernal.bin");
      Cia1 cia1 = Cia1(alarms: alarms);
      cia1.setKeyInfo(this);
      memory.setCia1(cia1);
      memory.populateMem(basicData, characterData, kernalData);
      _cpu.setInterruptCallback(() => cia1.hasInterrupts());
...
  }
...
}
I have added a field for our alarms. I am also now injecting a Cia1 instance into our memory.

In our CPU class we also now use an interrupt callback, which our CPU class will call to see if any interrupts have occurred. Our Cia1 instance will provide this info.

In our main event processing loop, we also make a small change:

    on<RunEvent>((event, emit) {
      timer = Timer.periodic(const Duration(milliseconds: 17), (timer) {
          int start = DateTime.now().millisecondsSinceEpoch;
          int targetCycles = _cpu.getCycles() + 16666;
          do {
            _cpu.step();
            alarms.processAlarms(_cpu.getCycles());
          } while (_cpu.getCycles() < targetCycles);
...
});
    });

After every CPU instruction we process the alarms with the current CPU cycle count.

Expanding the CIA1 class

Earlier we created a skeleton for the CIA1 class. We will now expand this class further.

As usual we start with some initialisation:


class Cia1 {
  int timerAlatchLow = 0xff;
  int timerAlatchHigh = 0xff;
  int timerAvalue = 0xffff;
  Alarms alarms;
  Alarm? timerAalarm;
  bool timerAstarted = false;
  bool timerAoneshot = false;
  int registerE = 0;
  int register0 = 0;
  bool timerAinterruptEnabled = false;
  bool timerAintOccurred = false;
  late final KeyInfo keyInfo;


  Cia1({required this.alarms});

  setKeyInfo(KeyInfo keyInfo) {
    this.keyInfo = keyInfo;
  }
 ...
}
The meaning of these fields will become clear in a bit.

Next, let us implement the following method:

  updateTimerA() {
    if (!timerAstarted) {
       return;
    }
    if (timerAalarm != null) {
      timerAvalue = timerAalarm!.getRemainingTicks();
    }

  }

timerAvalue is the value of the countdown timer A in the CIA. To increase locality, we don't update this value with the execution of every CPU instruction. Instead, we wrote this method, which updates the value only when the CPU actually accesses the CIA registers.

Next we add these methods:

  hasInterrupts() {
    if (timerAintOccurred && timerAinterruptEnabled) {
      return true;
    } else {
      return false;
    }
  }

  processTimerAalarm(int remaining) {
    // Do interrupt
    timerAintOccurred = true;
    if (timerAoneshot) {
      timerAalarm?.unlink();
      timerAstarted = false;
      return;
    }
    timerAalarm!.setTicks((timerAlatchLow | (timerAlatchHigh << 8)) + remaining);
  }

Here we deal with the moment the timer expires, at which point we flag the interrupt. If the timer is in one shot mode, we remove it from the alarm list. Otherwise we schedule the timer to run again.

Finally, let us add methods for reading and writing to the CIA registers:

  setMem(int address, int value) {
    print("setMem ${address.toRadixString(16)} ${value.toRadixString(16)}");
    value = value & 0xff;
    address = address & 0xf;
    switch (address) {
      case 0x0:
        register0 = value;
      case 0x4:
        timerAlatchLow = value;
      case 0x5:
        timerAlatchHigh = value;
      case 0xD:
        if ((value & 0x80) != 0) {
          timerAinterruptEnabled = ((value & 1) == 1) ? true : timerAinterruptEnabled;
        } else {
          timerAinterruptEnabled = ((value & 1) == 1) ? false : timerAinterruptEnabled;
        }
      case 0xE:
        var startTimerA = ((value & 1) == 1) ? true : false;
        var forceTimerA = ((value & 16) != 0) ? true : false;
        updateTimerA();
        if (forceTimerA) {
          timerAvalue = timerAlatchLow | (timerAlatchHigh << 8);
        }
        var startingTimerA = startTimerA & !timerAstarted;
        var stoppingTimerA = !startTimerA & timerAstarted;
        var alreadyRunningTimerA = startTimerA && timerAstarted;
        if (startingTimerA || (alreadyRunningTimerA && forceTimerA)) {
          // schedule timer on alarm
          timerAalarm ??= alarms.addAlarm( (remaining) => processTimerAalarm(remaining));
          if (timerAalarm!.list == null) {
            alarms.reAddAlarm(timerAalarm!);
          }
          timerAalarm!.setTicks(timerAvalue);
          // set timer as started
        } else if (stoppingTimerA) {
          //unschedule timer A
          timerAalarm!.unlink();
        }
        timerAoneshot = (value & 8) != 0;
        timerAstarted = startTimerA;
        registerE = value;
      default:
        // throw "Not implemented";
    }

  }

  int getMem(int address) {
    print("getMem ${address.toRadixString(16)}");
    updateTimerA();
    address = address & 0xf;
    switch (address) {
      case 0x0:
        return register0;
      case 0x1:
        return keyInfo.getKeyInfo(register0);
      case 0x4:
        return timerAvalue & 0xff;
      case 0x5:
        return timerAvalue >> 8;
      case 0xD:
        if (timerAintOccurred) {
          timerAintOccurred = false;
          return 0x81;
        } else {
          return 0;
        }
      case 0xE:
        var result = registerE & 0x06;
        result = result | (timerAstarted ? 1 : 0);
        result = result | (timerAoneshot ? 8 : 0);
        return result;
    }
    return 255;
  }

You will see that each time we read from the CIA1 we update the timer. In the write function we also adjust the alarms accordingly if we change the state of the timers.

Changes to the CPU class

Finally, there is just a small change we need to make to our CPU. Previously we hardwired an interrupt in our CPU that happened every 1/60th of a second. However, now that we have implemented a CIA class, we need to change how interrupts work.

Here are the changes:


class Cpu {
...
  late final Function() _interruptCallback;
...
  setInterruptCallback(Function() callback) {
    _interruptCallback = callback;
  }
...
  step() {
    if (_interruptCallback() & (_i == 0)) {
      push(pc >> 8);
      push(pc & 0xff);
      push((_n << 7) | (_v << 6) | (2 << 4) | (_d << 3) | (_i << 2) | (_z << 1) | _c);
      _i = 1;
      pc = memory.getMem(0xfffe) | (memory.getMem(0xffff) << 8);
    }
...
  }
...
}
Now we call the interrupt callback, which basically ties back to the CIA1 class we created. Also, we only invoke an interrupt when the interrupt disable flag is not set.
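The interrupt entry sequence can also be tried out in isolation. The following is a minimal sketch with a made-up MiniCpu class that holds just enough state for the demonstration; it is not the emulator's Cpu class, and the vector value stored at FFFE/FFFF is simply an example.

```dart
import 'dart:typed_data';

// A stripped down, hypothetical CPU with just enough state to show the
// interrupt entry sequence. It is not the emulator's Cpu class.
class MiniCpu {
  final Uint8List mem = Uint8List(0x10000);
  int pc = 0;
  int sp = 0xff;
  int n = 0, v = 0, d = 0, i = 0, z = 0, c = 0;

  void push(int value) {
    mem[0x100 | sp] = value & 0xff;
    sp = (sp - 1) & 0xff;
  }

  // Status byte as pushed for a hardware interrupt: bit 5 is always set and
  // the break flag (bit 4) stays clear.
  int get status =>
      (n << 7) | (v << 6) | (1 << 5) | (d << 3) | (i << 2) | (z << 1) | c;

  // Returns true if the interrupt was taken.
  bool serviceIrq(bool irqPending) {
    if (!irqPending || i != 0) return false;
    push(pc >> 8);
    push(pc & 0xff);
    push(status);
    i = 1; // mask further interrupts while the handler runs
    pc = mem[0xfffe] | (mem[0xffff] << 8);
    return true;
  }
}

MiniCpu demoCpu() {
  final cpu = MiniCpu()
    ..pc = 0x1234
    ..c = 1;
  cpu.mem[0xfffe] = 0x48; // example vector, low byte
  cpu.mem[0xffff] = 0xff; // example vector, high byte
  return cpu;
}

void main() {
  final cpu = demoCpu();
  print(cpu.serviceIrq(true)); // true
  print(cpu.pc.toRadixString(16)); // ff48
  print(cpu.mem[0x1fd].toRadixString(16)); // 21
  print(cpu.serviceIrq(true)); // false, the I flag is now set
}
```

The second call returns false because the first one set the interrupt disable flag, which is the same guard as the (_i == 0) check in step().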

In Summary

In this post we introduced the CIA as a separate class. We also removed the hardcoded mechanism which triggered an interrupt every 1/60th of a second, and rather let the CIA schedule the interrupts as programmed by machine language.

In the next post we will start to implement tape loading from a raw tape image.

Until next time!

Thursday, 22 May 2025

A Commodore 64 Emulator in Flutter: Part 13

Foreword

In the previous post we managed to boot the C64 system with a screen showing the contents of screen memory in real time. It booted with the welcome message and a flashing cursor.

In this post we will provide some keyboard interfacing for our C64 emulator. We will approach this in a very experimental fashion, first exploring how Flutter itself works with keyboard interfacing in an app. Then we will see if we can get keyboard interfacing to work in our app, and finally see if our emulator can work with the keyboard.

Enjoy!

KeyboardListener in Flutter

What we want for our emulator is basically to tell when a key is held down, and when it is released. Flutter provides this for us via a KeyboardListener. From the Flutter documentation it is not so straightforward on how to use this, so I looked around for a worked example on the Internet and found the following:

https://medium.com/@wartelski/how-to-flutter-keyboard-events-keyboard-listener-in-flutter-web-0c36ab9654a9

The following snippet is the core of the example:

With this example we can basically catch it when a key is down. Now all is well in this example, except that it keeps _focusNode in a final variable. This is, however, only something we can do with a StatefulWidget. In our case we are within a StatelessWidget, where we cannot do such things.

In our case we will place the focusNode in our Bloc. Probably not the best place if one thinks about separation of concerns, but for now it is the best place if we want to keep a single instance of FocusNode alive. So, we make the following changes:

class C64Bloc extends Bloc<C64Event, C64State> {
  final Memory memory = Memory();
  final FocusNode focusNode = FocusNode();
...
}
And now we go further and wrap our RawImage in a KeyboardListener:

...
           } else if (state is RunningState) {
              return KeyboardListener(
                focusNode: context.read<C64Bloc>().focusNode,
                autofocus: true,
                onKeyEvent: (event) {
                  if (event is KeyDownEvent) {
                    if (event.logicalKey == LogicalKeyboardKey.keyM) {
                      print("The m key is pressed!!");
                    }
                  } else if (event is KeyUpEvent) {
                    if (event.logicalKey == LogicalKeyboardKey.keyM) {
                      print("The m key is released!!");
                    }
                  }
                },
                child: RawImage(
                    image: state.image, scale: 0.5),
              );
            } else {
...
So, here we listen for the "M" key and write out to the console when this key is pressed and released.

Simulating a key press in our emulator

Next, let us see if we can simulate a key press in our emulator. To figure out how, let us dig a bit into how the keyboard is implemented in hardware.

Firstly, a keyboard is arranged as a matrix of rows and columns, and where a row and a column meet there is a key switch. If the switch is pushed while its column is energised, it will short the row to ground. Seeing if a switch is pressed is a two step process: you need to energise each column in turn and then see which rows are shorted to ground.

Firstly, to get an idea how the matrix of a C64 is arranged, the following diagram is helpful:

Now, the big question is which memory locations do we need to manipulate and read to see which key was pressed.

The following web link provides us with a memory map which will aid in finding these memory locations:

Scrolling down, we eventually find the section that deals with the keyboard:


As you can see, both these ports are used by the joystick ports and the keyboard. The first piece of info that is useful for us is the following at memory location DC00:

  • Bit #x: 0 = Select keyboard matrix column #x.

So, this is actually where we energise one or more columns. In the matrix diagram, this is actually the parts labeled A - H. Each of these are assigned a bit number (0 - 7) in the byte we write to this port.

The next piece of useful info is at memory location DC01:

  • Bit #x: 0 = A key is currently being pressed in keyboard matrix row #x, in the column selected at memory address $DC00.

So, we select one or more columns via location DC00, and we can then read via location DC01 which rows in the selected columns have a key pressed.

Let us now see how we can emulate a keypress in our emulator. At this point we are able to catch keys from the keyboard with a KeyboardListener. In our KeyboardListener we can basically trigger events for which we listen for in our Bloc.

First let us define an event class which we will trigger:

class KeyC64Event extends C64Event {
  final bool keyDown;
  KeyC64Event({required this.keyDown});
}
So, we will either trigger an event with keyDown = true, when a key is pressed, or an event with keyDown = false, when a key is released.

With this in mind, let us modify our KeyboardListener:

            } else if (state is RunningState) {
              return KeyboardListener(
                focusNode: context.read<C64Bloc>().focusNode,
                autofocus: true,
                onKeyEvent: (event) {
                  if (event is KeyDownEvent) {
                    if (event.logicalKey == LogicalKeyboardKey.keyM) {
                      context.read<C64Bloc>().add(KeyC64Event(keyDown: true));
                    }
                  } else if (event is KeyUpEvent) {
                    if (event.logicalKey == LogicalKeyboardKey.keyM) {
                      context.read<C64Bloc>().add(KeyC64Event(keyDown: false));
                    }
                  }
                },
                child: RawImage(
                    image: state.image, scale: 0.5),
              );
            } else {
Next, let us listen for these events in our Bloc:

class C64Bloc extends Bloc<C64Event, C64State> {
...
  bool keyDown = false;
...
  C64Bloc() : super(InitialState()) {
...
    on<KeyC64Event>((event, emit) {
      keyDown = event.keyDown;
    });
...
  }
...
}
So, within our Bloc, keyDown is a variable keeping track of whether the key is up or down, which in this case is the state of the M key on our keyboard. We will make use of this variable to simulate a key stroke in our emulator.

Now, the actual simulation of a key press should happen in our Memory class. When a read is done from address DC01, we should consider which column is enabled via address DC00, see whether one of the keys in the enabled column is indeed held down, and send back a value that reflects this.

So, we have a situation here where Memory wants some info from the Bloc class in which it lives, but we don't want to provide Memory with all the state of the Bloc class. To achieve this we need to create an interface with methods returning the info that Memory needs.

Here is the interface:

abstract class KeyInfo {
  int getKeyInfo(int column);
}
And now let us implement the interface in our Bloc:

class C64Bloc extends Bloc<C64Event, C64State> implements KeyInfo {
...
  @override
  int getKeyInfo(int column) {
    // Placeholder until we add the real logic: 0xff means no key is pressed
    return 0xff;
  }
...
}
So, given the columns energised, we return the rows. Now, as an exercise, let's say that if we press the M key on the keyboard, which we currently check for in our KeyboardListener, we want our C64 emulator to also show an M.

So, let us look at the keyboard matrix diagram again to see where the M key is located. The M key is located at column E and row 4. With the bit counting starting at zero for column A, the bit number of column E is 4. So we are interested in column bit 4 and row bit 4.

With this in mind, let us give getKeyInfo() some meat:

  @override
  int getKeyInfo(int column ) {
    if (!keyDown) {
      return 0xff;
    }
    if ((column & 0x10) == 0) {
      return 0xef;
    } else {
      return 0xff;
    }
  }
One thing to remember here is that when working with the keyboard matrix, we don't work with the default assumption that one means active, but the other way around. So a zero in the column byte means that a certain column is energised, and a zero in the row byte means that the switch for that bit position is held down.

With all this written, let us make our Memory class make use of it:

class Memory {
...
  late final KeyInfo keyInfo;
...
  setKeyInfo(KeyInfo keyInfo) {
    this.keyInfo = keyInfo;
  }
...
}
So, we can pass our keyInfo object to our Memory class. We assign the keyInfo when our Bloc class is instantiated:

class C64Bloc extends Bloc<C64Event, C64State> implements KeyInfo {
...
  C64Bloc() : super(InitialState()) {
    memory.setKeyInfo(this);
...
  }
...
}
Finally, let us use keyInfo in our Memory class:

...
  int getMem(int address) {
    _readCount++;
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else if (address == 0xDC01) {
      return keyInfo.getKeyInfo(_ram.getUint8(0xDC00));
    } else {
      return _ram.getUint8(address);
    }
  }
...
So, when address DC01 is read from our Memory, we invoke getKeyInfo and pass it the contents of memory location DC00. At the moment we fetch location DC00 from RAM.

Now, when we build and run, and press the M key a couple of times, the screen looks as follows:

We managed to implement a simple key press!

Implementing the full keyboard

Let us now look at implementing a full keyboard, or at least sufficient keys, like the alphabet, digits and some symbols, just to type a simple basic program within our emulator.

Up to now we only kept track of whether a single key is down, via keyDown, but now we need to keep track of whether several keys are held down. So, we need a kind of boolean matrix, or to put it more plainly, an array of eight bytes. Each column is a byte:

  final List<int> matrix = [0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff];
I mentioned earlier that in a real C64, a zero means the key is selected. So, this array filled with the value 0xff means no key is held down at the moment.

Previously in our Main class, we just looked for the M key being pressed and released, and then passed this event to our Bloc class. Obviously we now need to remove this explicit check for the M key and pass all key events to our Bloc class. This requires us to modify our KeyC64Event class to say which key was pressed, not just whether a key was pressed:

class KeyC64Event extends C64Event {  
  final bool keyDown;
  final LogicalKeyboardKey key;
  KeyC64Event({required this.keyDown, required this.key});
}
With this in place our Bloc class will indeed receive a key code, but one that only makes sense in the Flutter world. We need a kind of lookup table or map to convert a Flutter key code to a C64 keyboard scan code. So for this purpose we create the following map, preferably in a separate file:

Map<LogicalKeyboardKey, int> keyMap = Map.unmodifiable({
  LogicalKeyboardKey.keyA : 0x0A,
  LogicalKeyboardKey.keyB : 0x1C,
  LogicalKeyboardKey.keyC : 0x14,
  LogicalKeyboardKey.keyD : 0x12,
...
  LogicalKeyboardKey.digit0 : 0x23,
  LogicalKeyboardKey.digit1 : 0x38,
  LogicalKeyboardKey.digit2 : 0x3B,
  LogicalKeyboardKey.digit3 : 0x08,
  LogicalKeyboardKey.digit4 : 0x0B,
  LogicalKeyboardKey.digit5 : 0x10,
  LogicalKeyboardKey.digit6 : 0x13,
  LogicalKeyboardKey.digit7 : 0x18,
  LogicalKeyboardKey.digit8 : 0x1B,
  LogicalKeyboardKey.digit9 : 0x20,
...
  LogicalKeyboardKey.space : 0x3c,
  LogicalKeyboardKey.shiftLeft : 0x0F,
  LogicalKeyboardKey.enter : 0x01,
...
});
With this map created, we can now modify our listener a bit for the event KeyC64Event:

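    on<KeyC64Event>((event, emit) {
      final c64KeyCode = keyMap[event.key];
      if (c64KeyCode == null) {
        // Ignore keys we have no mapping for. Falling back to 0 would press
        // the key at scan code 0 in the matrix.
        return;
      }
      int col = c64KeyCode >> 3;
      int row = 1 << (c64KeyCode & 7);
      if (!event.keyDown) {
        matrix[col] |= row;
      } else {
        matrix[col] &= ~row;
      }
    });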
We start off by looking up the C64 scan code, given the Flutter key code. Bits 5-3 of the scan code are the column and bits 2-0 are the row.

In the if statement, if the key is released, we OR the bit position in, which sets the bit back to one. If the key is pressed, we mask off the bit position, which clears the bit to zero.
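As a quick worked example of this bookkeeping, the sketch below applies the same shifts and masks to a standalone matrix. The helper name setKey is made up for this illustration, and I am assuming scan code 0x24 for the M key, which matches the column E, row 4 position we found on the matrix diagram earlier.

```dart
// Sketch of the matrix bookkeeping; setKey is a made-up helper name.
void setKey(List<int> matrix, int scanCode, {required bool down}) {
  final col = scanCode >> 3; // bits 5-3
  final rowMask = 1 << (scanCode & 7); // bits 2-0
  if (down) {
    matrix[col] &= ~rowMask; // active low: clear the bit
  } else {
    matrix[col] |= rowMask; // released: set the bit again
  }
}

String hex(int value) => value.toRadixString(16).padLeft(2, '0');

void main() {
  final matrix = List<int>.filled(8, 0xff);
  setKey(matrix, 0x24, down: true); // M (assumed code): column 4, row 4
  setKey(matrix, 0x0A, down: true); // A: column 1, row 2
  print(matrix.map(hex).join(' ')); // ff fb ff ff ef ff ff ff
  setKey(matrix, 0x24, down: false);
  print(matrix.map(hex).join(' ')); // ff fb ff ff ff ff ff ff
}
```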

Now we need to modify the method getKeyInfo, which is the method our Memory class calls when reading address DC01. When calling this method, we tell the method which columns need to be considered. Potentially two or more columns can be selected, in which case we need to combine the selected columns into a single value. Because a zero means a key is pressed, this combining is done with a bitwise AND.

We can express this reduction in a simple for loop:

  @override
  int getKeyInfo(int column ) {
    int result = 0xff; // Accumulator for the AND'ed columns (0 = key pressed)

    for (var row in matrix) {
      if ((column & 1) == 0) {
        result &= row; 
      }

      column = column >> 1;
    }

    return result;
  }
We shift the column value right on every iteration, checking each time if the lowest bit is zero. If it is zero, we know the column is selected. We AND all the selected columns together. If, for any row position in a selected column, there is a zero, then the final value for that bit position will be zero. A zero means there were one or more keys pressed in that bit position for the selected columns.
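To get a feel for how the column select value changes the result, here is a small standalone sketch. readRows is a hypothetical free function that mirrors the logic of getKeyInfo, using an explicit bit test per column instead of shifting. The matrix is set up by hand with two keys down, one in column 1 and one in column 4.

```dart
// Hypothetical free-function version of the column scan.
int readRows(List<int> matrix, int columnSelect) {
  var result = 0xff;
  for (var col = 0; col < 8; col++) {
    // A zero bit in columnSelect means this column is selected.
    if ((columnSelect & (1 << col)) == 0) {
      result &= matrix[col];
    }
  }
  return result;
}

String hex(int value) => value.toRadixString(16).padLeft(2, '0');

void main() {
  // Column 1 has row 2 pressed (0xfb), column 4 has row 4 pressed (0xef).
  final matrix = [0xff, 0xfb, 0xff, 0xff, 0xef, 0xff, 0xff, 0xff];
  print(hex(readRows(matrix, 0x00))); // eb, all columns selected
  print(hex(readRows(matrix, 0xef))); // ef, only column 4 selected
  print(hex(readRows(matrix, 0xfd))); // fb, only column 1 selected
  print(hex(readRows(matrix, 0xfe))); // ff, only column 0, nothing pressed
}
```

With all columns selected the result only tells us that something is pressed in rows 2 and 4. Selecting one column at a time is what pins down the exact key.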

Now, let us see if we can write a simple program, with the keyboard input enabled:

Next, let us run the program:
We have a working program!

In Summary

In this post we implemented keyboard input and wrote a small test program.

In the next post we will start implementing tape loading, from a tape image.

Until next time!



Monday, 28 April 2025

A Commodore 64 Emulator in Flutter: Part 12

Foreword

In the previous post we successfully ran the Klaus Dormann Test Suite.

In this post we will be trying to boot the C64 system with its ROMs.

Enjoy!

Inserting the ROMS

Inserting the ROM's... Now that sounds like plugging and unplugging game cartridges 😂. In our case, this means loading the C64 ROM images from files into memory, and making sure our emulated CPU can access the contents.

We start by dumping the ROM images into the asset folder:


The C64 ROMs you can download from the internet usually have version numbers in their file names. In my case, I just gave them simple names. Also, notice that I have removed the file program.bin we used in previous posts.

  C64Bloc() : super(InitialState()) {
    on<InitEmulatorEvent>((event, emit) async {
      final basicData = await rootBundle.load("assets/basic.bin");
      final characterData = await rootBundle.load("assets/characters.bin");
      final kernalData = await rootBundle.load("assets/kernal.bin");
      memory.populateMem(basicData, characterData, kernalData);
...
So, we load the different ROMs one at a time, waiting for each file to finish loading before moving on to the next.

Now, you might notice that, compared to previous posts, we now pass more ROMs to memory.populateMem. So let us delve a bit deeper into our Memory class to see what changes are required:

...
  late type_data.ByteData _basic;
  late type_data.ByteData _character;
  late type_data.ByteData _kernal;
...
  final type_data.ByteData _ram = type_data.ByteData(64*1024);
...
  populateMem(type_data.ByteData basicData, type_data.ByteData characterData,
      type_data.ByteData kernalData) {
    _basic = basicData;
    _character = characterData;
    _kernal = kernalData;
  }
...
Fairly straightforward. Each ROM that is passed through, we store in a variable.

Something else we do is define a 64KB array that will act as our RAM, the defining characteristic of the C64.

So, next, let us add some address mapping:

  setMem(int value, int address ) {
    _ram.setUint8(address, value); // bytes are unsigned, 0-255
  }

  int getMem(int address) {
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else {
      return _ram.getUint8(address);
    }
  }

For memory writes, we write straight to the ram array. For reads, we do it the usual C64 setup:

  • Addresses A000-BFFF: We read from basic ROM
  • Addresses E000-FFFF: We read from Kernal ROM
  • All other addresses we read from RAM
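The masking with 0x1fff in getMem turns a CPU address into an offset within an 8 KB ROM image. Here is a minimal sketch of that arithmetic; the helper name romOffset is an assumption for illustration.

```dart
// Hypothetical helper: keeps the low 13 bits, i.e. an offset into an 8 KB ROM.
int romOffset(int address) => address & 0x1fff;

void main() {
  print(romOffset(0xA000).toRadixString(16)); // 0    (first byte of BASIC)
  print(romOffset(0xBFFF).toRadixString(16)); // 1fff (last byte of BASIC)
  print(romOffset(0xE000).toRadixString(16)); // 0    (first byte of Kernal)
  print(romOffset(0xFFFC).toRadixString(16)); // 1ffc (low byte of reset vector)
}
```

This works because both ROM windows start on an 8 KB boundary, so no subtraction of the base address is needed.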

Booting the C64 System

We are now close to booting the C64 system with all its ROMs.

First things first. Our periodic timer currently runs once every second, executing 1 000 000 cycles worth of CPU instructions. However, we want to reduce this to a 60th of a second, so that later on we can draw a frame every time our timer executes, yielding 60 frames a second, which is roughly the frame rate of a native C64:

    on<RunEvent>((event, emit) {
      timer = Timer.periodic(const Duration(milliseconds: 17), (timer) {
          int targetCycles = _cpu.getCycles() + 16666;
          do {
            _cpu.step();
          } while (_cpu.getCycles() < targetCycles);
      });
    });

Each time the timer fires, we execute 16666 cycles, which is the number of CPU cycles in 1/60th of a second at 1 000 000 cycles per second.
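The numbers can be checked with a few lines of arithmetic. This is only a rough sketch: the constant and function names are made up for illustration, and it assumes the timer fires exactly every 17 ms, which a real timer will not guarantee.

```dart
// Assumed figures, taken from the text: a nominal 1 000 000 cycles per second
// and a 17 ms periodic timer. Real timers may fire late, so this is an estimate.
const int cpuHz = 1000000;
const int framesPerSecond = 60;
const int timerPeriodMs = 17;

int cyclesPerFrame() => cpuHz ~/ framesPerSecond;

// Emulated cycles per wall-clock second if the timer fires exactly on time.
int emulatedCyclesPerSecond() => cyclesPerFrame() * 1000 ~/ timerPeriodMs;

void main() {
  print(cyclesPerFrame()); // 16666
  print(emulatedCyclesPerSecond()); // 980352
}
```

Because 17 ms is slightly longer than 1/60th of a second, the emulated CPU ends up running about 2% slower than the nominal clock under these assumptions.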

Booting the C64 ROMs is actually fairly straightforward. You basically set the program counter to the value of the reset vector. For this we just create the following method:

  reset() {
    pc = memory.getMem(0xfffc) | (memory.getMem(0xfffd) << 8);
  }
So, here we populate the program counter with the reset vector at addresses FFFC and FFFD.
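The vector is stored low byte first, so the byte at FFFD ends up in the upper half of the program counter. The sketch below shows this with a made-up 8 KB buffer standing in for the Kernal ROM; the function name readVector and the two byte values are assumptions chosen for the example.

```dart
import 'dart:typed_data';

// Hypothetical helper: combines two consecutive ROM bytes, low byte first.
int readVector(ByteData rom, int address) {
  final int lo = rom.getUint8(address & 0x1fff);
  final int hi = rom.getUint8((address + 1) & 0x1fff);
  return lo | (hi << 8);
}

void main() {
  // Stand-in for an 8 KB Kernal image holding only example vector bytes.
  final rom = ByteData(0x2000);
  rom.setUint8(0xfffc & 0x1fff, 0xe2); // low byte (example value)
  rom.setUint8(0xfffd & 0x1fff, 0xfc); // high byte (example value)

  print(readVector(rom, 0xfffc).toRadixString(16)); // fce2
}
```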

We still need to call this method. We do this just after we have loaded all the ROMs:

  C64Bloc() : super(InitialState()) {
    on<InitEmulatorEvent>((event, emit) async {
      final basicData = await rootBundle.load("assets/basic.bin");
      final characterData = await rootBundle.load("assets/characters.bin");
      final kernalData = await rootBundle.load("assets/kernal.bin");
      memory.populateMem(basicData, characterData, kernalData);
      _cpu.reset();
...
Now, we can finally boot the C64 System. We wait for a minute, and then hit stop to view the registers:


We see the program stabilise at address FF61. Let us have a look at the Kernal disassembly listing to see what is going on at this address:
 
As seen here, we get stuck in a loop with the memory address D012 not changing. We can expect such a thing to happen at the moment, because with our current emulator setup reads and writes to that address go straight to raw RAM, and thus nothing will happen.

In reality D012 maps to the VIC-II display registers and provides info on which raster line on the screen we are currently at. It is quite an undertaking to implement such a raster counter, so for now, let us see if we can quickly hack something together so that address D012 changes from time to time, just to get past that loop. Here is my quick hack:

  int getMem(int address) {
    _readCount++;
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else {
      return _ram.getUint8(address);
    }
  }

So, the hack is simply a counter that keeps count of the number of reads, and we look at bit 10 of the counter (the mask 1024). If it is clear, we return a 1, otherwise a zero. In effect we will have a 1 for about a thousand reads, and then a zero for another thousand reads.
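To see the resulting pattern, here is the same expression pulled out into a standalone function. The name fakeRaster is an assumption for illustration; the mask 1024 is the one used in getMem above.

```dart
// Hypothetical standalone copy of the D012 hack, for illustration only.
int fakeRaster(int readCount) => (readCount & 1024) == 0 ? 1 : 0;

void main() {
  for (final n in [0, 1023, 1024, 2047, 2048]) {
    print('$n -> ${fakeRaster(n)}');
  }
  // 0 -> 1
  // 1023 -> 1
  // 1024 -> 0
  // 2047 -> 0
  // 2048 -> 1
}
```

The value flips every 1024 reads, which is all the Kernal loop needs in order to see the register change.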

Let us now see where our program counter lands. This time it lands at E5D4. Let's look again at the disassembly listing for this address:


Here it seems we are in a waiting loop, waiting for the enter key to be pressed on the keyboard. I think this is a pretty decent place for our emulator to be and probably means that all initialisation has been completed, and we should have the welcome message in screen memory.

We want to check if the welcome message is in screen memory, but our debug dump currently just shows the first two pages of memory. We could, however, inspect the RAM array in debug mode in IntelliJ.

So, start up the emulator in debug mode and press the play button to let the C64 system run at full speed. Wait for about a minute and then put a breakpoint on the first line of the getMem method in our Memory class. With the system running at full speed, that breakpoint will be hit almost instantaneously.

Open up an evaluate window and enter the following:


Here we inspect address 1024, which is the first byte of screen memory. In this case the value is 32, which is a space. Inspecting addresses further into screen memory will reveal the welcome message.

In the next section we will render the contents of the screen in real time.

Rendering screen memory

We will now try and render screen memory in real time, showing a display similar to the C64 in text mode.

We ultimately need a mechanism that allows us to work efficiently with image data at pixel level. Flutter provides this via the RawImage widget, together with ui.Image, which lets you work with an array of RGBA values.

Let us unpack this a bit. Let us start working with the raw array of RGBA values, where we will produce the frame for display, based on the screen memory and the character ROM.

Both the character ROM and screen memory are present in our Memory class, so for now we will do the frame rendering in that class.

Firstly, let us define the byte buffer we are going to use over and over again:

class Memory {
...
    final type_data.ByteData image = type_data.ByteData(320*200*4);
...
}
So, as we can see, we have a resolution of 320x200, which is the resolution of a real C64 screen. We multiply the end result by 4, because each pixel takes four bytes in our buffer, one byte each for red, green, blue and the alpha channel.
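The buffer size and the position of a pixel within it follow directly from these numbers. A minimal sketch, in which the constant names and the helper pixelOffset are assumptions for illustration:

```dart
import 'dart:typed_data';

const int width = 320;
const int height = 200;
const int bytesPerPixel = 4;

// Hypothetical helper: byte offset of pixel (x, y) in a row-major buffer.
int pixelOffset(int x, int y) => (y * width + x) * bytesPerPixel;

void main() {
  final image = ByteData(width * height * bytesPerPixel);
  print(image.lengthInBytes); // 256000
  print(pixelOffset(0, 1)); // 1280, the start of the second pixel row
  print(pixelOffset(319, 199)); // 255996, the last pixel in the buffer
}
```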

Next, let us write a method for rendering a screen to the byte array:

  type_data.ByteData getDisplayImage() {
    const rowSpan = 320 * 4; // bytes per pixel row of the frame
    for (int i = 0; i < 1000; i++) {
      // Screen memory starts at address 1024; each byte is a character code.
      var charCode = _ram.getUint8(i + 1024);
      var charAddress = charCode << 3; // 8 bytes per character in ROM
      var charBitmapRow = (i ~/ 40) << 3; // top pixel row of this cell
      var charBitmapCol = (i % 40) << 3; // left pixel column of this cell
      int rawPixelPos = charBitmapRow * rowSpan + charBitmapCol * 4;
      for (int row = 0; row < 8; row++) {
        int bitmapRow = _character.getUint8(row + charAddress);
        int currentRowAddress = rawPixelPos + row * rowSpan;
        for (int pixel = 0; pixel < 8; pixel++) {
          // Test the leftmost bit: set bits are drawn black, clear bits white.
          if ((bitmapRow & 0x80) != 0) {
            image.setUint32(currentRowAddress + (pixel << 2), 0x000000ff);
          } else {
            image.setUint32(currentRowAddress + (pixel << 2), 0xffffffff);
          }
          bitmapRow = bitmapRow << 1;
        }
      }
    }
    return image;
  }

So, here we loop through all thousand character codes in screen memory, rendering each one. Each character code is actually an index into character ROM, where every character has its own 8x8 pixel bitmap.
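The inner loop can be tried out in isolation by rendering bitmap rows as text instead of pixels. In this sketch the function name renderRow and the eight glyph bytes are assumptions for illustration; the bytes are simply an example bitmap, not read from an actual ROM file.

```dart
// Hypothetical helper: turns one bitmap row into text, leftmost bit first.
String renderRow(int bitmapRow) {
  final sb = StringBuffer();
  for (int pixel = 0; pixel < 8; pixel++) {
    sb.write((bitmapRow & 0x80) != 0 ? '#' : '.');
    bitmapRow = (bitmapRow << 1) & 0xff;
  }
  return sb.toString();
}

void main() {
  // Example 8x8 bitmap (eight bytes, one per pixel row).
  const glyph = [0x18, 0x3c, 0x66, 0x7e, 0x66, 0x66, 0x66, 0x00];
  for (final row in glyph) {
    print(renderRow(row));
  }
  // First four rows printed:
  // ...##...
  // ..####..
  // .##..##.
  // .######.

  // A character code of 1 starts at byte 1 << 3 = 8 of the character data.
  print(1 << 3); // 8
}
```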

Now, this method is invoked every time our periodic timer runs:

...
import 'dart:ui' as ui;
...
   on<RunEvent>((event, emit) {
      timer = Timer.periodic(const Duration(milliseconds: 17), (timer) {
          int start = DateTime.now().millisecondsSinceEpoch;
          int targetCycles = _cpu.getCycles() + 16666;
          do {
            _cpu.step();
          } while (_cpu.getCycles() < targetCycles);
          ui.decodeImageFromPixels(memory.getDisplayImage().buffer.asUint8List(), 
             320, 200, ui.PixelFormat.bgra8888, setImg);
      });
    });
ui.decodeImageFromPixels is a method within the dart:ui library of Flutter. It will create an Image object from a pixel buffer, which in this case is the rendered screen buffer.

We also pass ui.PixelFormat.bgra8888 as a parameter, indicating that our buffer has one byte each for blue, green, red and alpha, in that order.
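It helps to look at the actual bytes that setUint32 places in the buffer for the two colour values used in getDisplayImage. The sketch below uses only dart:typed_data; the variable name pixel is an assumption for illustration.

```dart
import 'dart:typed_data';

void main() {
  final pixel = ByteData(4);

  // ByteData.setUint32 writes big-endian unless told otherwise,
  // so the most significant byte lands first in the buffer.
  pixel.setUint32(0, 0x000000ff);
  print(pixel.buffer.asUint8List()); // [0, 0, 0, 255]

  pixel.setUint32(0, 0xffffffff);
  print(pixel.buffer.asUint8List()); // [255, 255, 255, 255]
}
```

Read as bgra8888, the first value is opaque black and the second is opaque white. Since both colours have equal blue, green and red components, the channel order makes no visible difference here; it would start to matter once real C64 colours are rendered.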

We also pass a callback method, setImg in this case, which will be called once we have the generated Image object.

So, let us implement this callback method:

    void setImg(ui.Image data) {
      emit(RunningState(image: data, frameNo: frameNo++));
    }
Here you can see we are emitting the image in a state object, so our BlocBuilder can pick up the change and render the image. You will also notice that we have a frameNo property that we modify with each new image, so our BlocBuilder can easily pick up the change.

You will recall from previous posts that we already defined RunningState, which we have now changed. Here is the revised version:

class RunningState extends C64State {
  RunningState({required this.image,
    required this.frameNo});

  final int frameNo;
  final ui.Image image;
  @override
  List<Object> get props => [frameNo];
}
Finally, let us modify our BlocBuilder:

...
        body: BlocBuilder<C64Bloc, C64State>(
          builder: (BuildContext context, state) {
            if (state is InitialState) {
              return const CircularProgressIndicator();
            } else if (state is DataShowState) {
              return Column(
                children: [
                  Text(getRegisterDump(state.a, state.x, state.y, state.n,
                      state.z, state.c, state.i, state.d, state.v, state.pc)),
                  Text(
                    getMemDump(state.memorySnippet),
                    style: const TextStyle(
                      fontFamily: 'RobotoMono', // Use the monospace font
                    ),
                  ),
                ],
              );
            } else if (state is RunningState) {
              return RawImage(
                  image: state.image, scale: 0.5);
            } else {
              return const CircularProgressIndicator();
            }
          },
        ),
...
So, if the state is RunningState, we return a RawImage widget, which will be displayed on the screen. We pass the image in the state to the RawImage widget. We also use a scale of 0.5, which basically doubles the displayed size. The native 320x200 resolution of a C64 frame displays very small on a modern display, so with the scale it at least appears bigger.

With everything coded we can now give it a test run. The startup sequence appears to take more or less the same time as on a real C64, and eventually the welcome screen appears:

We are making progress, but still, there is no flashing cursor.

Getting the cursor to flash

Let us see if we can get the cursor to flash. 

If you go down into the bowels of the C64 system, you will find that at the core of a standard C64 system that has just started up, there is a timer interrupt every 60th of a second. This interrupt does a couple of things, like checking if any key was pressed or released and updating the status of the cursor.

So, let us see if we can put a hack together that simply forces an interrupt every 60th of a second, without worrying for now about emulating the full CIA chip with a timer.

The easiest way is in the step() method of our CPU class:

  step() {
    if ((_cycles > 1000000) &&((_cycles % 16666) < 30) && (_i == 0)) {
      push(pc >> 8);
      push(pc & 0xff);
      push((_n << 7) | (_v << 6) | (2 << 4) | (_d << 3) | (_i << 2) | (_z << 1) | _c);
      _i = 1;
      pc = memory.getMem(0xfffe) | (memory.getMem(0xffff) << 8);
    }
...
  }
So, we wait for a second before triggering interrupts in 1/60 second intervals. With the change, the cursor actually flashes:
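The two pieces of arithmetic in this hack, the trigger condition and the status byte pushed to the stack, can be checked on their own. In the sketch below the function names inIrqWindow and packStatus are assumptions for illustration; the expressions themselves are copied from the step() method above.

```dart
// Hypothetical helper: true in a 30-cycle window at the start of each
// 16666-cycle period, once a million cycles have passed.
bool inIrqWindow(int cycles) => cycles > 1000000 && (cycles % 16666) < 30;

// Hypothetical helper: packs the flags the same way as the push in step().
// (2 << 4) sets bit 5 and leaves bit 4 clear.
int packStatus({
  required int n,
  required int v,
  required int d,
  required int i,
  required int z,
  required int c,
}) {
  return (n << 7) | (v << 6) | (2 << 4) | (d << 3) | (i << 2) | (z << 1) | c;
}

void main() {
  print(inIrqWindow(999960)); // false: multiple of 16666, but too early
  print(inIrqWindow(1016630)); // true: 4 cycles into a period
  print(inIrqWindow(1016700)); // false: 74 cycles into a period

  final status = packStatus(n: 1, v: 0, d: 0, i: 0, z: 1, c: 1);
  print(status.toRadixString(2).padLeft(8, '0')); // 10100011
}
```

The window is 30 cycles wide rather than a single cycle because step() only gets to test the condition between instructions, and the check on the interrupt flag keeps it from firing again while the handler is still running.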

In Summary

In this post we managed to boot the C64 system with all its ROMs and managed to render screen memory in real time, showing the welcome message and the flashing cursor. 

The source code for this post is available in the following GitHub tag: https://github.com/ovalcode/c64_flutter/tree/c64_flutter_part12

In the next post we will add some keyboard interaction with our emulator.

Until next time!