Sunday, 10 May 2026

A Commodore 64 Emulator in Flutter: Part 17

Foreword

In the previous post, we fixed a memory leak in our emulator. We fixed it by switching the emulator to an HTML canvas and reusing the same canvas instance for every frame.

In this post we will implement proper border rendering, so that we can emulate the flashing borders that appear while the game Dan Dare loads from a tape image.

To achieve this, we will start implementing the VIC-II and its registers in our emulator.

Implementing the VICII class

Let's start our discussion by creating an outline for our VICII class:

class Vicii {
  final type_data.ByteData _regs = type_data.ByteData(0x50);
 
  int getReg(int address) {
    return _regs.getUint8(address & 0x3f);
  }

  void setReg(int address, int value) {
    // Registers hold unsigned bytes (0-255), so use setUint8 rather than setInt8.
    _regs.setUint8(address & 0x3f, value);
  }

}
We have declared 80 local registers for our VICII class. We have also created getReg() and setReg() methods so our Memory class can read and alter the contents of the registers.

We will receive the full address externally, but internally we only look at the lower 6 bits, by ANDing the address with 0x3f. Because of this mask, only the first 64 entries of the register array are ever addressed.
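
As a quick illustration of the masking, here is a minimal sketch. The helper name vicRegisterIndex is hypothetical and only exists for this example; it is not part of the emulator source.

```dart
/// Maps a full CPU address to a VIC-II register index by keeping
/// only the lower 6 bits (hypothetical helper, for illustration only).
int vicRegisterIndex(int address) => address & 0x3f;
```

With this mask, 0xD020 (the border color register) maps to index 0x20, and so would 0xD060, since both share the same lower 6 bits.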

Inside the Memory class we will map the VICII in our memory space as follows:

  int getMem(int address) {
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else if ((address >> 8) == 0xDC ) {
      return cia1.getMem(address);
    } else if ((address >> 8) == 0xD0) {
      return vic.getReg(address);
    } else if (address == 1) {
      var value = _ram.getUint8(address) & 0xef;
      return value | _tape.getCassetteSense();
    } else {
      return _ram.getUint8(address);
    }
  }

  setMem(int value, int address ) {
    if ((address >> 8) == 0xDC) {
      cia1.setMem(address, value);
    } else if ((address >> 8) == 0xD0) {
      vic.setReg(address, value);
    } else if (address == 1) {
      _ram.setInt8(address, value);
      _tape.setMotor((value & 0x20) == 0 );
    } else {
      _ram.setInt8(address, value);
    }
  }

Now, let us see how the VICII class fits within our emulator:

class EmulatorController implements KeyInfo{
  final Memory memory = Memory();
  final Vicii vic = Vicii();
...
  Future<void> _init() async {
    final basicData = await rootBundle.load("assets/basic.bin");
    final characterData = await rootBundle.load("assets/characters.bin");
    final kernalData = await rootBundle.load("assets/kernal.bin");
    Cia1 cia1 = Cia1(alarms: alarms);
    cia1.setKeyInfo(this);
    Tape tape = Tape(alarms: alarms, interrupt: cia1);
    _tape = tape;
    memory.setCia1(cia1);
    memory.populateMem(basicData, characterData, kernalData);
    memory.setTape(tape);
    vic.memory = memory;
    memory.vic = vic;
    _cpu.setInterruptCallback(() => cia1.hasInterrupts());
    _cpu.reset();
  }
...
}
So, basically the VICII instance and the memory instance have a reference to each other. The VIC instance needs a reference to memory because it needs access to screen memory and bitmapped graphics.

It should be remembered that the VIC-II's view of memory differs from that of the CPU. The VIC-II accesses memory with an address bus that is only 14 bits wide, compared to the 16-bit address bus of the CPU.

The VIC-II can therefore only see 16KB of memory at a time. In the CIA-2 chip, there is a register that selects which 16KB block of the 64KB address space is visible to the VIC-II at any point in time.

With the above setup we will give the VIC-II banked access to the RAM. However, we also need a way to give the VIC-II access to character ROM, so it can draw the characters of screen RAM when in text mode. To enable this, two of the four banks also have the character ROM mapped into their address space.

The first bank that has the character ROM mapped in is the one in the range 0x0000 - 0x3FFF, and in this range the character ROM is mapped at addresses 0x1000 - 0x1FFF. This bank is the default bank the VIC-II uses when the C64 powers up.

So, with all this in mind, let me write a method inside our Memory class for VIC-II memory access:
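
The emulator in this post does not implement CIA-2 yet, so the following is only a sketch of how bank selection could look later on. It assumes the bank is taken from bits 0-1 of CIA-2 port A (address 0xDD00), which are inverted, and that the character ROM is visible in banks 0 and 2. The function names are hypothetical.

```dart
/// Returns the start address of the 16KB bank the VIC-II sees, given the
/// value of CIA-2 port A (0xDD00). Bits 0-1 are inverted: %11 selects bank 0.
/// Hypothetical helper; CIA-2 is not part of the emulator at this stage.
int vicBankBase(int cia2PortA) => ((~cia2PortA) & 0x03) << 14;

/// Tells whether a 14-bit VIC-II address hits character ROM.
/// The ROM is only mapped in banks 0 and 2, at offsets 0x1000-0x1FFF.
bool seesCharRom(int bankBase, int vicAddress) {
  final bool romBank = bankBase == 0x0000 || bankBase == 0x8000;
  final int offset = vicAddress & 0x3fff;
  return romBank && offset >= 0x1000 && offset < 0x2000;
}
```

At power-up the two port bits are set, which gives bank base 0x0000, the default bank described above.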

  int readVic(int address) {
    if (address >= 0x1000 && address < 0x2000) {
      return _character.getUint8(address & 0xfff);
    }
    return _ram.getUint8(address);
  }

Now, you will remember that previously, in the Memory class, we had a method renderDisplayImage(), where we rendered the contents of a C64 frame to a byte buffer, which was written to an HTML canvas. We now need to move this method to our Vicii class, which will handle all the frame rendering:

class Vicii {
  final type_data.ByteData _regs = type_data.ByteData(0x50);
  late final Memory memory;
  late type_data.Uint32List image;
...

  void renderDisplayImage() {
    const rowSpan = 320;
    for (int i = 0; i < 1000; i++) {
      var charCode = memory.readVic(i + 1024);
      var charAddress = charCode << 3;
      var charBitmapRow = (i ~/ 40) << 3;
      var charBitmapCol = (i % 40) << 3;
      int rawPixelPos = charBitmapRow * rowSpan + charBitmapCol;
      for (int row = 0; row < 8; row++) {
        int bitmapRow = memory.readVic((row + charAddress) | 0x1000);
        int currentRowAddress = rawPixelPos + row * rowSpan;
        for (int pixel = 0; pixel < 8; pixel++) {
          if ((bitmapRow & 0x80) != 0) {
            image[currentRowAddress + (pixel)] = 0x000000ff;
          } else {
            image[currentRowAddress + (pixel)] = 0xffffffff;
          }
          bitmapRow = bitmapRow << 1;
        }
      }
    }
  }

}
You will also note that to get the bitmap data, we OR the address with 0x1000, so that our Memory class knows to get the data from the character ROM.

Obviously we will need to rewire parts of our emulator to make use of the Vicii instance instead of the Memory instance, which includes passing the canvas byte array instance. To keep the discussion focused, I will not be going into this detail.
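
To make the address arithmetic concrete, here is a small sketch that mirrors the expression used in renderDisplayImage(). The helper name charRowAddress is hypothetical.

```dart
/// VIC-II address of one 8-pixel row of a character bitmap.
/// Each character occupies 8 bytes, and OR-ing with 0x1000 moves the
/// address into the range where readVic() serves character ROM.
/// Hypothetical helper name, for illustration only.
int charRowAddress(int charCode, int row) =>
    ((charCode << 3) + row) | 0x1000;
```

For screen code 1 (the letter A), row 0 lives at 0x1008, and the last row of screen code 255 lives at 0x17FF.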

Working with scan lines

We mentioned at the beginning of this post that we will be emulating the flashing borders while loading the game "Dan Dare". These flashing borders are alternating horizontal lines that change constantly, as the border color is adjusted a number of times per frame.

Now, the caveat is that with our current setup, we always render a whole frame at the end of emulating a whole frame worth of CPU cycles. This means that we render every frame with only a single border color, so we will never get that flashing border effect with our current setup.

To get closer to emulating flashing borders, we need to render after emulating a scanline worth of cycles every time, instead of waiting for a whole frame of CPU cycles.

To aid us in this closer emulation, we will look at this write-up from Christian Bauer on the VIC-II: https://www.zimmers.net/cbmpics/cbm/c64/vic-ii.txt This is a golden resource that many people have used for writing emulators. Even if you look at the source code of the VICE emulator, you will find references to Christian Bauer's work.

Let us start by looking at key figures in Bauer's write up:

The model we are interested in is the PAL-B model. The first figures that are handy are the Visible lines and Visible pixels/line. Many seasoned C64 programmers will say off the bat that the resolution of the C64 screen is 320x200. However, when asked for the total resolution including the border, which you need to know when showing a screen during emulation, few of us will have an answer at hand.

This table provides us the answer, which is 403x284. We can now define a screen buffer for this in our VicII class:

class Vicii {
...
  final type_data.Uint8List c64Buffer = type_data.Uint8List(400*284);
...
}
I have rounded off the horizontal resolution to 400, just to keep things simple. You will also see that I use a buffer of bytes instead of 32-bit integers. Each byte holds a 4-bit color value, which is an index into a color palette. In rendering each scanline there are multiple writes to the same pixel, like writing the background, then the foreground, and potentially drawing sprites as well. Writing one-byte palette indices instead of 32-bit entries reduces the volume of data moved by these writes.

It is only once we have a full frame buffer ready for display that we will convert it to a 32-bit integer buffer.

From the table above we see that 63 CPU cycles are executed on every scanline. This gives us a clue that every 63 clock cycles we should render a scanline with the current state of the VIC-II registers. We can do this by adding another alarm to the alarm system that we previously developed for triggering tape pulses during tape loading emulation. Here is the code for doing that:

class Vicii {
...
  Vicii(Alarms alarms) {
    _alarms = alarms;
    setupAlarms();
  }
...
  setupAlarms() {
    _vicAlarm ??= _alarms.addAlarm( (remaining) => processVicAlarm(remaining));
    _vicAlarm!.setTicks(63);
  }
...
  processVicAlarm(int remaining) {
    _vicAlarm!.setTicks(63 + remaining);
  }
...
}
We added a constructor for our Vicii class where we pass in the alarms structure, to which the Vicii class adds an extra alarm of its own.

We have set the alarm to trigger after 63 cycles. Once it triggers, we re-arm it to trigger after another 63 cycles.

Let us extend the method processVicAlarm a bit more:
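
The reason for adding the remaining value when re-arming is to avoid timing drift: CPU instructions take several cycles, so the alarm usually fires a few cycles late. The sketch below shows the idea in isolation. It does not use the emulator's Alarms class; ScanlineTimer is a hypothetical stand-in, and it assumes the overshoot is carried over as a zero or negative tick count.

```dart
/// Stand-alone sketch of a 63-cycle scanline timer (hypothetical class,
/// not the emulator's Alarms implementation).
class ScanlineTimer {
  static const int cyclesPerLine = 63;

  int _ticks = cyclesPerLine; // cycles left until the next scanline
  int linesFired = 0;         // number of scanlines triggered so far

  /// Advances the timer by the cycle count of one CPU instruction.
  void advance(int cycles) {
    _ticks -= cycles;
    while (_ticks <= 0) {
      linesFired++;
      // Re-arm with the overshoot included, so late firing does not
      // accumulate into drift over many scanlines.
      _ticks += cyclesPerLine;
    }
  }
}
```

Feeding this timer 126 instructions of 5 cycles each (630 cycles in total) triggers exactly 10 scanlines, even though 63 is not a multiple of 5.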

  processVicAlarm(int remaining) {
    _vicAlarm!.setTicks(63 + remaining);

    if (yReg >= 17 && yReg <= 300) {
      drawScanLine();
    }

    yReg++;
    if (yReg == 312) {
      yReg = 0;
    }
  }

yReg is a raster counter we have implemented. As described in Bauer's document, it counts through 312 lines, from 0 to 311, and it keeps counting during the vertical blanking period. The visible lines lie between counts 17 and 300, and this is where we call drawScanLine.

Let's end off this section by starting to implement drawScanLine, with only the parts that draw the border and the background. We will cover drawing the characters in a scan line fashion in the next section.

Let us start by looking at a C64 screen, so we can visualise what border areas need to be drawn:


Firstly, you get a border section right at the top. Every scanline in this section, is fully drawn with the border color.

We then move down to the area where the characters are drawn. Here, on every scan line, a small section of the border is drawn on the far left and on the far right. In the character area itself we draw a solid line of the background color and set pixels to the foreground color where the character bitmaps dictate.

Finally, at the bottom of the screen, after the character area, every scanline is drawn in the border color in full.

Now, let us start drawing the top part of the border:

  void drawScanLine() {
    int borderColor = _regs.getInt8(0x20);
    int backgroundColor = _regs.getInt8(0x21);
    // process full border
    if (yReg < 51) {
      c64Buffer.fillRange(currentPosStartLine, currentPosStartLine + 400, borderColor);
    }
    
    currentPosStartLine = currentPosStartLine + 400;
  }
We start by getting the border color and the background color. Then, if the raster counter is less than 51, we draw a full line in the border color. Raster line 50 is the last line of the top border region, so with this if statement we draw the complete top border region.

We have also introduced another variable, currentPosStartLine, which always points to the start of the current scanline.

Finally, let us draw the rest of the borders:

 void drawScanLine() {
    int borderColor = _regs.getInt8(0x20);
    int backgroundColor = _regs.getInt8(0x21);
    // process full border
    if (yReg < 51) {
      c64Buffer.fillRange(currentPosStartLine, currentPosStartLine + 400, borderColor);
    }

    visibleVerticalRegion = yReg < 251 && yReg >= 51;
    var displayEnabled = (_regs.getUint8(0x11) & 0x10) != 0;
    if (visibleVerticalRegion && displayEnabled) {
      c64Buffer.fillRange(currentPosStartLine, currentPosStartLine + 40, borderColor);
      c64Buffer.fillRange(currentPosStartLine + 40, currentPosStartLine + 40 + 320, backgroundColor);
      c64Buffer.fillRange(currentPosStartLine + 40 + 320, currentPosStartLine + 40 + 320 + 40, borderColor);

    } else {
      // process full border
      c64Buffer.fillRange(currentPosStartLine, currentPosStartLine + 400, borderColor);
    }
    
    currentPosStartLine = currentPosStartLine + 400;
  }
So, basically, when we are in the region where the characters are drawn, we just draw a 40 pixel border on the left and on the right, and fill the middle part with the background color. If, however, the display is blanked, we will fill the whole scanline in the border color.
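
The raster line boundaries used so far (17, 51, 251 and 300) can be summarised in one place. The following is only a sketch with hypothetical names, using the constants from the code above.

```dart
/// Vertical regions of a PAL frame, as used in this post.
enum RasterRegion { vblank, topBorder, display, bottomBorder }

/// Classifies a raster line using the boundaries from drawScanLine()
/// and processVicAlarm(). Hypothetical helper, for illustration only.
RasterRegion regionFor(int yReg) {
  if (yReg < 17 || yReg > 300) return RasterRegion.vblank;
  if (yReg < 51) return RasterRegion.topBorder;
  if (yReg < 251) return RasterRegion.display;
  return RasterRegion.bottomBorder;
}

/// Row in the 284-line frame buffer for a visible raster line.
int bufferRowFor(int yReg) => yReg - 17;
```

Lines 17 to 300 map to buffer rows 0 to 283, which matches the 284 visible lines of the frame buffer.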

Drawing the characters on the scan line

Let us now focus on drawing the characters on the screen. We will also do this in a scan line fashion.

To draw the bitmap of a character, we need two memory accesses: the character code from screen memory, and a line of pixels from character ROM for that character code. In our current rendering we draw each character to the screen in one go. Changing this to a scan line fashion forces us to rethink how we pull the info for drawing from our memory structures.

Christian Bauer describes in his write-up on the VIC-II how the actual VIC-II chip sequences its memory accesses to get the raw pixel data to draw. This description is also quite handy for us in finding a software implementation for getting the data. He states that on the first scan line of every row of characters, called a bad line, the VIC-II fetches all the character codes to be drawn for that row from screen memory into a 40-byte buffer resident on the VIC-II itself.

That line is called a bad line because the VIC-II has to halt the processor's access to the memory bus for a large part of that line: the VIC-II needs the extra memory bandwidth to fetch the character codes as well as the bitmap line segments from character ROM.

For the rest of the scan lines of the character row, however, the character codes are already in the 40-byte character buffer on the VIC-II, so no extra memory bandwidth is needed for them.

This also sounds like a nice approach for our emulation. For the first line of every character row, we fetch the character codes from screen memory and store them in a 40-byte array buffer. Drawing the scan line is then straightforward: it fetches the character codes from that array.

Let us start by implementing this idea of prefetching the screen codes:

    if (visibleVerticalRegion && displayEnabled) {
      // process visible screen line
      if (charLine == 0) {
        _charCodeBuffer = memory.readVicRange(videoMatrixPos | 1024, 40);
      }
      ...
    } else {
      // process full border
      c64Buffer.fillRange(currentPosStartLine, currentPosStartLine + 400, borderColor);
    }

    if (visibleVerticalRegion) {
      charLine++;
      charLine = charLine & 7;
      if (charLine == 0) {
        videoMatrixPos = videoMatrixPos + 40;
      }
    }

Firstly, we have a variable charLine, which we increment with every scanline in the character region. This indicates which line of a character (lines 0 to 7) we are busy with. With this variable we also control another variable, videoMatrixPos, which tells us at any point in time which row of screen code memory we are in.

When charLine reaches zero, it signals the beginning of the next character row. This triggers two things: the loading of the 40 character codes from screen memory, and the advancing of the pointer in screen memory to the next row.

To read an array of 40 characters from screen memory, we introduced a new method in our Memory class called readVicRange(), which looks as follows:
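
The two counters can also be expressed directly in terms of the raster line, which may help when checking the incremental version. This sketch assumes that both counters start at zero on raster line 51 of every frame and that vertical scrolling is ignored. The function names are hypothetical.

```dart
/// Line within the current character (0 to 7) for a raster line in the
/// display region (51 to 250). Hypothetical helper, ignores scrolling.
int charLineFor(int yReg) => (yReg - 51) & 7;

/// Offset into screen memory of the first character of the current row.
int videoMatrixPosFor(int yReg) => ((yReg - 51) >> 3) * 40;

/// True on the first line of a character row, where the 40 character
/// codes are fetched.
bool isFirstLineOfRow(int yReg) => charLineFor(yReg) == 0;
```

Raster line 51 gives charLine 0 and offset 0, line 59 starts the second row at offset 40, and line 250 is the last line of the last row at offset 960.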

  type_data.Uint8List readVicRange(int address, int count) {
    return storage.buffer.asUint8List(address, count);
  }
Here Dart helps us out a bit by providing the method asUint8List on a ByteBuffer, where we specify an offset address and the number of bytes to return from that address. So, we can now read 40 characters of screen memory at the beginning of each character row.

Next, let us focus on drawing a line of bitmap data of a character. The idea that comes to mind is to read a byte of pixel data from character ROM, shift it left one bit at a time, and draw the pixel in the foreground color whenever the current MSB is set.
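
One thing worth keeping in mind is that asUint8List returns a view onto the same underlying bytes, not a copy. The sketch below contrasts the two; viewRange and copyRange are hypothetical helper names, and the ByteData passed in stands for the emulator's RAM.

```dart
import 'dart:typed_data';

/// Returns a live view onto [count] bytes of [ram] starting at [address].
/// Later writes to [ram] are visible through the returned list.
Uint8List viewRange(ByteData ram, int address, int count) =>
    ram.buffer.asUint8List(ram.offsetInBytes + address, count);

/// Returns an independent snapshot of the same bytes.
Uint8List copyRange(ByteData ram, int address, int count) =>
    Uint8List.fromList(viewRange(ram, address, count));
```

With a view, a program that rewrites screen memory in the middle of a character row would change the codes used for the remaining lines of that row. If the buffered behaviour of the real chip matters, taking a copy would be the safer choice.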

However, out of pure curiosity I had a look at the VICE source code, to see how they handle this, and I was quite surprised. They use this macro to do the drawing of character pixels:


This looks like a loop unrolling exercise, where instead of looping and updating variables with each iteration, you duplicate each loop iteration in its own snippet of code. There is some performance benefit in doing this, in that you cut out the overhead of updating variables with each iteration.

This actually motivated me to do the same in my emulation code. I know that modern compilers do loop unrolling, but I wouldn't know how efficiently they would unroll a loop where we do bit shifting and testing. So, here is my code for doing it in a loop-unrolled fashion 😀:



    if (visibleVerticalRegion && displayEnabled) {
      // process visible screen line
      if (charLine == 0) {
        _charCodeBuffer = memory.readVicRange(videoMatrixPos | 1024, 40);
      }
      c64Buffer.fillRange(currentPosStartLine, currentPosStartLine + 40, borderColor);
      c64Buffer.fillRange(currentPosStartLine + 40, currentPosStartLine + 40 + 320, backgroundColor);
      c64Buffer.fillRange(currentPosStartLine + 40 + 320, currentPosStartLine + 40 + 320 + 40, borderColor);
      var charDrawPointer = currentPosStartLine + 40;
      for (var charCode in _charCodeBuffer) {
        var bitmapRow = memory.readVic((charCode << 3) | charLine | 0x1000);
        if (bitmapRow & 0x80 != 0) {
          c64Buffer[charDrawPointer] = 14;
        }
        if (bitmapRow & 0x40 != 0) {
          c64Buffer[charDrawPointer + 1] = 14;
        }
        if (bitmapRow & 0x20 != 0) {
          c64Buffer[charDrawPointer + 2] = 14;
        }
        if (bitmapRow & 0x10 != 0) {
          c64Buffer[charDrawPointer + 3] = 14;
        }
        if (bitmapRow & 0x08 != 0) {
          c64Buffer[charDrawPointer + 4] = 14;
        }
        if (bitmapRow & 0x04 != 0) {
          c64Buffer[charDrawPointer + 5] = 14;
        }
        if (bitmapRow & 0x02 != 0) {
          c64Buffer[charDrawPointer + 6] = 14;
        }
        if (bitmapRow & 0x01 != 0) {
          c64Buffer[charDrawPointer + 7] = 14;
        }
        charDrawPointer = charDrawPointer + 8;
      }

    } else {
      // process full border
      c64Buffer.fillRange(currentPosStartLine, currentPosStartLine + 400, borderColor);
    }

You will see that I have introduced a variable charDrawPointer. This points to the beginning of the character area on the line, and not the beginning of the border area.

Also, for the time being, I am assigning a hard-coded palette entry to the pixels that get set. At a later stage I will pull this value from color memory.
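
For comparison, here is what the same drawing step looks like as a plain loop. This is only a sketch for reference, not the code used in the emulator; the function name and parameters are hypothetical.

```dart
import 'dart:typed_data';

/// Loop version of the unrolled pixel drawing: writes [color] into
/// [buffer] for every set bit of [bitmapRow], starting at [pos] with
/// the most significant bit as the leftmost pixel.
void drawCharRowLooped(
    Uint8List buffer, int pos, int bitmapRow, int color) {
  for (int pixel = 0; pixel < 8; pixel++) {
    if ((bitmapRow & (0x80 >> pixel)) != 0) {
      buffer[pos + pixel] = color;
    }
  }
}
```

Both versions should produce identical output, so the loop version can serve as a reference when measuring whether the unrolled code actually gains anything.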

Testing everything

Let us now get ready to test our setup. One final thing we need to do is to convert the 4-bit bitmap to a 32-bit rgba bitmap. 

First thing is to define the color palette in our VIC-II class:

  static const List<int> c64Colors = [
  0xFF000000, // Black
  0xFFFFFFFF, // White
  0xFF000088, // Red
  0xFFEEFFAA, // Cyan
  0xFFCC44CC, // Purple
  0xFF55CC00, // Green
  0xFFAA0000, // Blue
  0xFF77EEEE, // Yellow
  0xFF5588DD, // Orange
  0xFF004466, // Brown
  0xFF7777FF, // Light Red
  0xFF333333, // Dark Grey
  0xFF777777, // Grey
  0xFF66FFAA, // Light Green
  0xFFFF8800, // Light Blue
  0xFFBBBBBB, // Light Grey
   ];

And next, the method for creating the 32-bit bitmap:

  void renderDisplayImage() {
    for (int i = 0; i < 400 * 284; i++) {
      image[i] = c64Colors[c64Buffer[i] & 0xf];
    }
  }
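
The palette values may look odd at first, with red written as 0xFF000088. The sketch below shows one way to arrive at such values, assuming that image is a Uint32List view over the RGBA bytes of the canvas on a little-endian host, so that the red component ends up in the lowest byte. The helper name packAbgr is hypothetical.

```dart
/// Packs 8-bit R, G, B (and optional alpha) components into the 32-bit
/// layout assumed here: alpha in the top byte, red in the lowest byte.
/// Hypothetical helper, for illustration only.
int packAbgr(int r, int g, int b, [int a = 0xff]) =>
    (a << 24) | (b << 16) | (g << 8) | r;
```

For example, packAbgr(0x88, 0x00, 0x00) yields the red entry of the palette, and packAbgr(0x00, 0x00, 0xAA) yields the blue entry.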

There are some other minor changes, like not letting the main emulator loop decide when to render a frame, but letting the VIC-II call the shots for this. However, to keep the discussion simple, I will not cover this here, but you can have a look at the source link I will provide at the end of this post to get an idea of the finer details.

Starting our emulator up, I immediately feel at home:

I have added a frames-per-second counter at the top, and it alternates between 59.3 and 60 FPS. So far so good.

Let us now see how the screen looks when loading the game Dan Dare:


We have flashing borders! However, if you look closely, the flashing borders look a bit unnatural compared to how they looked back in the day. In fact, to summarise it, these bars look too parallel!

Let's compare it to the loading screen in VICE itself:


Here we can clearly see that the bars look ragged within the scan lines. The reason ours don't look like this is that we only use a single border color per scan line.

In the next post we will see if we can get closer to this look.

In Summary

In this post we have implemented border rendering in a scan line fashion.

We also managed to implement the flashing borders shown when loading the game Dan Dare. However, the bars look kind of artificial because we only render each scan line with one border color.

In the next post we will see if we can deal with changing border colors on a scan line and render it correctly.

The source for this post can be found here

Until next time!

Tuesday, 24 March 2026

A Commodore 64 Emulator in Flutter: Part 16

Foreword

In the previous post we managed to implement tape loading, and managed to emulate this process until it showed that it had found a file name.

We ended the post discovering that the emulator has a serious memory leak, which we will attempt to solve in this post.

Unpacking the Memory Leak

In the previous post we saw that our emulator had a serious memory leak, where memory usage grew to over 1 gigabyte in less than half an hour.

I carefully went through my code but couldn't really find any obvious place where a memory leak was happening.

I did, however, have a suspicion that the cause of the memory leak was related to how the frames were rendered to the screen. This is probably the process in the emulator where the most data moves back and forth.

Eventually I debated with ChatGPT and Gemini which components are best for rendering in Flutter without causing memory leaks. I tried all the suggestions, but none of them fixed the memory leak.

Eventually Gemini came up with the suggestion to use a native HTML canvas for the rendering. This struck me as a sensible idea, as I had used an HTML canvas in one of my JavaScript emulators that I wrote about 10 years ago, without any memory leak.

Also, I started to realise my current rendering implementation was perhaps on the heavy side. With a Bloc emitting a state change on every frame, part of the widget tree was being redrawn 60 times a second. This sounded very intense, so the idea of reusing an HTML canvas seemed like a way out.

I did a proof of concept and created a small Flutter project, using the HTML canvas as described, which just rendered something simple, like a moving line, being redrawn 60 times a second. In this proof of concept I found that the memory usage remained within bounds.

So, in this post I will be following this approach of rewriting the emulator to make use of an HTML Canvas, in order to eliminate the memory leak.

Bringing a native HTML canvas to Flutter

Let us pause for a moment and see how we can introduce a native HTML canvas in Flutter.

We begin with a simple class:
class EmulatorCanvas {
  late final html.CanvasElement canvas;
  late final html.CanvasRenderingContext2D ctx;

  final int width;
  final int height;
  int inc = 0;

  EmulatorCanvas(this.width, this.height) {
    canvas = html.CanvasElement(width: width, height: height);
    ctx = canvas.context2D;

    // Register with Flutter
    ui.platformViewRegistry.registerViewFactory(
      'emulator-canvas',
          (int viewId) => canvas,
    );
  }

}

In this code, html is a Dart package, and html.CanvasElement creates a native HTML canvas element for us. It is important to note that at this stage the created canvas element is not yet attached to the HTML page.

The call to registerViewFactory gives the widget tree access to the created canvas element, and we associate the name emulator-canvas with it.

Let us now see where we will use this class:

import 'package:file_picker/file_picker.dart';
import 'package:flutter/cupertino.dart';
import 'package:flutter/material.dart';
import 'package:flutter/scheduler.dart';
import 'package:flutter_bloc/flutter_bloc.dart';

import 'emulator_canvas.dart';
import 'emulator_controller.dart';

class VideoScreen extends StatefulWidget {
  const VideoScreen({super.key});

  @override
  State<VideoScreen> createState() => _VideoScreenState();
}

class _VideoScreenState extends State<VideoScreen>
    with SingleTickerProviderStateMixin {

  late EmulatorCanvas emCanvas;

  @override
  void initState() {
    super.initState();

    emCanvas = EmulatorCanvas(320, 200);

  }

  @override
  void dispose() {
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      backgroundColor: const Color(0xFF4040E0),
      body: Column(
        children: [

          const SizedBox(
            width: 640,
            height: 400,
            child: HtmlElementView(viewType: 'emulator-canvas'),
          ),
        ],
      ),
    );
  }
}
Firstly, we have VideoScreen as a StatefulWidget, which means that its state object is kept and reused, and is not destroyed with every rebuild.

As the name StatefulWidget implies, the widget has state that is mutable. For this reason our widget is tied to the class _VideoScreenState. Something interesting about this class is that it is declared with SingleTickerProviderStateMixin. This lets the state create a ticker whose callback is synchronised to screen refreshes and is called once per screen refresh.

Here we also declare emCanvas, an instance of EmulatorCanvas. Finally, the widget we return from the build method embeds this canvas through the label emulator-canvas. Previously we registered the canvas with Flutter under that name, which is how the returned widget finds it.

As an additional extra, it is interesting to inspect the HTML in the browser:


One can actually see the canvas element of what we defined in our code.

Obviously all this needs to be wired up all the way until the main screen in main.dart, which we will cover in the next section.

Moving towards a Controller Architecture

Up to now the centre of our C64 emulator in flutter was a BloC. In our BloC we emitted a new state with every frame, which instructed the front end to create a new widget instance for displaying the new frame.

As indicated earlier in this post, this is really clunky considering we need to render 60 frames a second. To get around this, we will discard our BloC idea and rather opt for a Controller architecture.

Let us start at the highest level, main.dart, which hosts our flutter application:

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();
  final controller = await EmulatorController.create();
  runApp(
    MaterialApp(
      home: RepositoryProvider.value(
        value: controller,
        child: const EmulatorRoot(),
      )
    ));
}
Now, as you might have guessed, the core functionality of our emulator will live in the class EmulatorController. We will perform more or less the same things we performed in our old C64BloC class.

You might also remember that when we were previously writing our C64BloC class, we did some asynchronous tasks, loading the three C64 ROMs from disk, and we had to use the keyword await to wait until everything was in memory before we continued. We need to do something similar with our controller, which requires us to declare our main() method as async in order to use the await functionality.

In our main() method, we also make use of a RepositoryProvider. This enables us to inject our controller class further down in our tree where it might be needed.

To help us orientate everything let us have a look at the implementation of EmulatorRoot:

class EmulatorRoot extends StatefulWidget {
  // final String name;
  const EmulatorRoot({super.key});

  @override
  State<EmulatorRoot> createState() => _EmulatorRootState();
}

class _EmulatorRootState extends State<EmulatorRoot> {
  int _currentIndex = 0; // 0 = debug, 1 = video

  @override
  Widget build(BuildContext context) {
    EmulatorController controller = context.read<EmulatorController>();
    return Scaffold(
      appBar: AppBar(
        title: const Text("C64 Emulator"),
        actions: [
          IconButton(
            icon: const Icon(Icons.bug_report),
            onPressed: () => setState(() => _currentIndex = 0),
          ),
          IconButton(
            icon: const Icon(Icons.tv),
            onPressed: () => setState(() => _currentIndex = 1),
          ),
        ],
      ),
      body: IndexedStack(
        index: _currentIndex,
        children: [
          // DebugScreen(),
          KeyboardListener (
            // VideoScreen(),
            focusNode: controller.focusNode,
            autofocus: true,
            // Use a block body: "=> { ... }" would build a set literal instead.
            onKeyEvent: (event) {
              if (event is KeyDownEvent) {
                controller.keyboardEvent(event.logicalKey, true);
                // context.read<C64Bloc>().add(KeyC64Event(keyDown: true, key: event.logicalKey))
              } else if (event is KeyUpEvent) {
                controller.keyboardEvent(event.logicalKey, false);
                // context.read<C64Bloc>().add(KeyC64Event(keyDown: false, key: event.logicalKey))
              }
            },
            child: const VideoScreen(),

          )
        ],
      ),
    );
  }
}
This is yet another StatefulWidget with associated state. The basic idea outlined here is to have a tabbed view, showing a debug view on one tab and the screen of the running emulator on another. We will not implement the debug tab in this series; it is just shown as a possibility for how this emulator can develop.

For the tab showing the running emulator screen, we basically show VideoScreen, which we developed earlier on. We have also wrapped this screen in a KeyboardListener for intercepting keystrokes. This is similar to what we did in previous posts.

You will also see that on key events we call controller.keyboardEvent. This will interface our emulator with a keyboard.

Let us next look at the internals of EmulatorController:

class EmulatorController implements KeyInfo{
  final Memory memory = Memory();
  final List<int> matrix = [0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff];
  late final Cpu _cpu = Cpu(memory: memory);
  late final Tape _tape;
  late final Alarms alarms = Alarms();
  FocusNode focusNode = FocusNode();
  bool tapeLoaded = false;

  EmulatorController._();

  static Future<EmulatorController> create() async {
    final instance = EmulatorController._();
    await instance._init();
    return instance;
  }

  Future<void> _init() async {
    final basicData = await rootBundle.load("assets/basic.bin");
    final characterData = await rootBundle.load("assets/characters.bin");
    final kernalData = await rootBundle.load("assets/kernal.bin");
    Cia1 cia1 = Cia1(alarms: alarms);
    cia1.setKeyInfo(this);
    Tape tape = Tape(alarms: alarms, interrupt: cia1);
    _tape = tape;
    memory.setCia1(cia1);
    memory.populateMem(basicData, characterData, kernalData);
    memory.setTape(tape);
    _cpu.setInterruptCallback(() => cia1.hasInterrupts());
    _cpu.reset();
  }
...
}
This is pretty much the same as what we did in our Bloc. There are, however, a couple of extra things we do. We hide the constructor, so to get a new instance we need to call create(), which gives us a properly initialised instance.
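The hidden constructor plus create() combination can be sketched in isolation as follows. This is only a minimal illustration: AsyncResource and its rom field are hypothetical stand-ins, and the delayed Future takes the place of the rootBundle.load() calls.

```dart
// Hypothetical stand-in for EmulatorController. Only the construction
// pattern is shown, not the real emulator wiring.
class AsyncResource {
  late final List<int> rom;

  // Private named constructor: code outside this library cannot call it.
  AsyncResource._();

  // The only public way to obtain an instance. The returned Future
  // completes once the asynchronous initialisation has finished.
  static Future<AsyncResource> create() async {
    final instance = AsyncResource._();
    await instance._init();
    return instance;
  }

  Future<void> _init() async {
    // Stands in for rootBundle.load(...); we just simulate a short delay.
    rom = await Future<List<int>>.delayed(
      const Duration(milliseconds: 1),
      () => [0xA9, 0x00],
    );
  }
}
```

The benefit is that callers can never get hold of a half-initialised object: a Dart constructor cannot be async, so the awaiting has to happen in a static method.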

Implementing Screen Refreshing

So, we have just implemented the basics for a controller architecture for our Flutter C64 emulator. Let us next focus on how to render the frames.

Firstly, within video_screen.dart, we need to make this file aware of our controller within the initState() method:

  @override
  void initState() {
    super.initState();

    controller = context.read<EmulatorController>();
    emCanvas = EmulatorCanvas(320, 200);
    controller.setCanvasArray(emCanvas.getFrameBuffer());
    ...
  }
We use context.read to get the instance of the controller that was injected higher in the tree. Once we have the controller instance, we pass it the framebuffer of the canvas, so that our controller can do some drawing if required.

In an earlier section of this post I briefly talked about the use of SingleTickerProviderStateMixin, which syncs frame refreshes with the refresh rate of the screen. We will now take this implementation further inside the initState() method.

  void initState() {
    super.initState();

    controller = context.read<EmulatorController>();
    emCanvas = EmulatorCanvas(320, 200);
    controller.setCanvasArray(emCanvas.getFrameBuffer());

    _ticker = createTicker((Duration elapsed) {
      if ((elapsed.inMilliseconds - lastProcessed) < 16) {
        return;
      }
      lastProcessed = elapsed.inMilliseconds;

      controller.executeChunk();
      emCanvas.renderFrame();
    });

    _ticker.start();
  }

Here we create a ticker instance, which will execute with every screen refresh. With controller.executeChunk(), we tell our emulator to execute one frame's worth of cycles. This is more or less the same approach we implemented previously.

You will also see that we throttle the rendering a bit to get close to the real speed of a C64, by just exiting the ticker body if it is not yet time to display the next frame. Having said that, most displays refresh at a rate of 60Hz. So, if you take out the early return, your emulator should still run at more or less the same speed as a real C64.
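To see what the throttle does on a faster display, the check can be pulled out into a small helper. FrameThrottle is a hypothetical name used only for this sketch; the 16 ms threshold is the one from the ticker callback.

```dart
// Hypothetical helper mirroring the check inside the ticker callback.
class FrameThrottle {
  final int minFrameMs;
  int _lastProcessedMs = 0;

  FrameThrottle({this.minFrameMs = 16});

  // Returns true when enough time has passed to emulate the next frame,
  // and remembers the time of that frame.
  bool shouldRun(int elapsedMs) {
    if (elapsedMs - _lastProcessedMs < minFrameMs) {
      return false;
    }
    _lastProcessedMs = elapsedMs;
    return true;
  }
}
```

On a display that ticks roughly every 8 ms, calling shouldRun() with 8, 16, 24 and 33 lets only the second and fourth tick through, so the emulator still runs at about 60 frames per second.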

Next, let us look at the implementation of controller.executeChunk():

  void executeChunk() {
    int targetCycles = _cpu.getCycles() + 16666;
    do {
      _cpu.step();
      alarms.processAlarms(_cpu.getCycles());
    } while (_cpu.getCycles() < targetCycles);
    memory.renderDisplayImage();
  }

So, in this method we execute a frame's worth of CPU cycles and render a frame to be displayed. The array we render to is the one we passed through earlier with controller.setCanvasArray().
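The constant 16666 is presumably a clock of roughly 1 MHz divided by 60 frames per second. The sketch below works through that arithmetic and the loop shape, using a hypothetical FakeCpu in which every instruction takes 3 cycles.

```dart
// Roughly 1 MHz divided by 60 frames per second (integer division).
const int cyclesPerFrame = 1000000 ~/ 60;

// Hypothetical CPU stand-in: every instruction costs 3 cycles.
class FakeCpu {
  int cycles = 0;

  void step() {
    cycles += 3;
  }
}

// Same loop shape as executeChunk(): keep stepping until the cycle
// counter has reached the target for this frame.
int runFrame(FakeCpu cpu) {
  final target = cpu.cycles + cyclesPerFrame;
  do {
    cpu.step();
  } while (cpu.cycles < target);
  return cpu.cycles;
}
```

Because an instruction cannot be split, a frame can overshoot the target by a few cycles. In this sketch the first frame ends at 16668 cycles rather than 16666.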

Let us finally have a look at the implementation of emCanvas.renderFrame():

  void renderFrame() {
    ctx.putImageData(imageData, 0, 0);
  }

Here we are in raw HTML territory. With ctx.putImageData, we write to the actual HTML Canvas element we defined earlier.

In Summary

In this post we reworked our C64 emulator to a Controller architecture in order to fix a memory leak. With our new architecture, we don't create a new widget with every frame, but rather maintain a single HTML Canvas element throughout the life cycle of our emulator, to which we render all frames.

In the next post we will add some colors to our C64 frames, together with some borders. We will also emulate the drawing of the border in a more granular fashion, in order to accurately simulate the flashing borders while loading the game from a tape image.

As usual, you can find the source for every post on my GitHub page. For this post, you can go here

Until next time! 

Saturday, 17 January 2026

A Commodore 64 Emulator in Flutter: Part 15

Foreword

In the previous post we introduced the CIA as a separate class. Before that, we mimicked the CIA's operation by just forcing a hard interrupt every 1/60th of a second, just to get our emulator to work, avoiding the complexities of implementing and scheduling timers.

Thus, in the previous post we delved deeper and implemented the CIA. This was actually needed as a precursor to this post, where we will be implementing tape loading functionality, which requires more granular operation of the CIA.

Adding Front end Interaction

The logical place to start is to add functionality to our front end for attaching a tape image. So within main.dart, which is basically our front end code, we add two buttons to the RunningState front end:

...
} else if (state is RunningState) {
              return KeyboardListener(
                focusNode: context.read<C64Bloc>().focusNode,
                autofocus: true,
                onKeyEvent: (event) => {
                  if (event is KeyDownEvent) {
                    context.read<C64Bloc>().add(KeyC64Event(keyDown: true, key: event.logicalKey))
                  } else if (event is KeyUpEvent) {
                    context.read<C64Bloc>().add(KeyC64Event(keyDown: false, key: event.logicalKey))
                  }
                },
                child: Column(
                  children: [
                    Row(
                      children: [
                        IconButton(
                            icon: Icon(Icons.folder),
                            onPressed: () async {
                              context.read<C64Bloc>().add(LoadTapeRequested());
                            }),
                        IconButton(
                            icon: Icon(Icons.play_arrow),
                            onPressed: !state.tapeLoaded ? null : () async {
                              context.read<C64Bloc>().add(PlayTapeRequested());
                            })
                      ],
                    ),
                    RawImage(
                      image: state.image, scale: 0.5),
                ],
              ));
            }
...
As usual, we will add LoadTapeRequested and PlayTapeRequested in c64_event.dart.

Now we need to listen for every event within our Bloc class:

    on<PlayTapeRequested>((event, emit) {
      _tape.playTape();
    });

    on<LoadTapeRequested>((event, emit) async {
      final result = await FilePicker.platform.pickFiles(
        withData: true,
        type: FileType.custom,
        allowedExtensions: ['tap', 't64'],
      );

      if (result == null) return;
      tapeLoaded = true;
      _tape.setTapeImage(result.files.single.bytes!);
    });

For PlayTapeRequested we simulate the press of a play button. _tape is an instance of a class Tape, which we will define later.

With the LoadTapeRequested event, we basically present a file dialog where the user selects the tape image from the local file system, and we then pass it to the _tape instance.

The Tape Class

Let us start to implement the Tape class, which will emulate the functionality of Tape loading.

We start with a simple class:
class Tape implements TapeMemoryInterface {
  late Iterator _tapeImage;
  bool _playSelected = false;
  Alarms alarms;
  TapeInterrupt interrupt;
  Alarm? _tapeAlarm;

  Tape({required this.alarms, required this.interrupt});
}
Before we go into detail on how to implement this class, let us take a step back and think about how Tape loading works on a C64.

On the physical tapes you used back in the day to load games on a C64, data was stored as pulses of varying lengths. It all boils down to basically two types of pulses: a short pulse or a long pulse, corresponding to either a 0 or a 1, which is a bit, the most basic element of data on a computer 😀.

Now consider the loading of data from a physical tape on a C64. The end of a pulse is indicated when the signal changes polarity from positive to negative or vice versa. This change of polarity causes an interrupt on the CPU, via the FLAG pin on CIA1. The tape loading routines inside the Kernal ROM use one of the CIA timers to measure the pulse widths, and decide based on that whether each bit is a zero or a one.

The tape image files of old games that you can download from the Internet are a sequence of pulse widths. With all this info at hand, it starts to become apparent what the Tape class should do. Using these pulse widths, it should schedule an alarm for each pulse, the same structure we used previously within the CIA, and trigger an interrupt when it lapses. The private fields I defined above in the Tape class also hint at this.

Let us have a look at the variable _tapeImage. It is of type Iterator. With this data structure we can basically iterate through the tape image pulse width by pulse width, without worrying about a counter that needs to be updated every time.

At this point we are ready to implement the method setTapeImage(), which we mentioned previously:

  setTapeImage(type_data.Uint8List tapeData) {
    _tapeImage = tapeData.iterator;
    for (var i = 0; i < 21; i++) {
      _tapeImage.moveNext();
    }
    populateRemainingPulses();
  }

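A Uint8List provides you with an Iterator. In a TAP image, a 20-byte header precedes the pulse width data, so the data starts at the 21st byte. Calling moveNext() 21 times leaves the iterator positioned on that first pulse byte.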
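A Dart Iterator only has a valid current after the first call to moveNext(), which is why the loop count is one more than the number of bytes being skipped. Here is a small sketch of the positioning logic, assuming the standard TAP layout with a 20-byte header; pulseIterator is a hypothetical name.

```dart
import 'dart:typed_data';

// Returns an iterator positioned on the first pulse byte of a TAP image.
// Assumes a 20-byte header, so the pulse data begins at index 20.
Iterator<int> pulseIterator(Uint8List tapData) {
  final it = tapData.iterator;
  // The first moveNext() lands on index 0, so the 21st lands on index 20.
  for (var i = 0; i < 21; i++) {
    it.moveNext();
  }
  return it;
}
```

With a 22-byte test image, current yields the byte at index 20, one further moveNext() yields the byte at index 21, and the next moveNext() returns false.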

Once we are at the actual pulse width data, we need to know the width of the first pulse. This is the function of the method populateRemainingPulses() :

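  populateRemainingPulses() {
    var val = _tapeImage.current;
    if (val != 0) {
      // Normal case: one byte holds the pulse width in units of 8 CPU cycles.
      _remainingPulseTicks = val << 3;
      _tapeImage.moveNext();
    } else {
      // A zero byte is only a marker. Step past it so that the next three
      // bytes (least significant first) are read as the absolute width.
      _tapeImage.moveNext();
      var byte0 = _tapeImage.current;
      _tapeImage.moveNext();
      var byte1 = _tapeImage.current;
      _tapeImage.moveNext();
      var byte2 = _tapeImage.current;
      _tapeImage.moveNext();
      _remainingPulseTicks = (byte2 << 16) | (byte1 << 8) | byte0;
    }
  }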

Here we need to understand the TAP format a bit better. Usually every byte indicates one pulse width. We then need to multiply this value by 8 to get the width in CPU clock cycles.

The exception to the rule is when the byte value is zero. Then the next three bytes indicate the pulse width as an absolute value in CPU clock cycles, least significant byte first, i.e. no multiplication by 8 is necessary.

You will see that I am assigning the calculated value to a private variable _remainingPulseTicks. We are following a similar approach here to the timers in the CIA, which we implemented in the previous post. It functions almost as a countdown timer, and is updated by the alarm subsystem.

At this point a key question is: What kicks off the tape loading process? The answer lies in memory location 1 of the C64 memory. This memory location is well known for switching banks of memory in and out of view. However, this memory location also hosts two bits for tape control:
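To make the two encodings concrete, here is a small stand-alone decoder that turns raw pulse bytes (header already stripped) into cycle counts. It is only a sketch: decodePulses is a hypothetical helper, and it uses a plain index instead of the iterator used in the Tape class.

```dart
// Decodes TAP pulse bytes into pulse widths in CPU clock cycles.
// A non-zero byte is a width in units of 8 cycles. A zero byte is a marker
// followed by three bytes holding the width directly, least significant first.
List<int> decodePulses(List<int> data) {
  final pulses = <int>[];
  var i = 0;
  while (i < data.length) {
    final value = data[i++];
    if (value != 0) {
      pulses.add(value * 8);
    } else if (i + 3 <= data.length) {
      pulses.add(data[i] | (data[i + 1] << 8) | (data[i + 2] << 16));
      i += 3;
    } else {
      break; // Truncated long pulse at the end of the data: stop decoding.
    }
  }
  return pulses;
}
```

For example, the bytes 0x30, 0x00, 0x10, 0x27, 0x00, 0x42 decode to the widths 384, 10000 and 528: the first and last are single bytes multiplied by 8, while the middle one is the three-byte value 0x002710.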
  • Bit 4 - Cassette Switch Sense; 1 = Switch Closed
  • Bit 5 - Cassette Motor Control; 0 = On, 1 = Off
The key here is bit 5, turning the Cassette motor on and off, which acts as the starting point for the tape loading process. Bit 4 tells us when the user presses the play button, which we will cover later.
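Decoding the motor bit is a one-liner, but the inverted logic (0 means on) is easy to get wrong, so here is a minimal sketch. The names cassetteMotorBit, motorOn and motorChanged are hypothetical and not part of the emulator source.

```dart
const int cassetteMotorBit = 0x20; // Bit 5 of memory location 1.

// The motor runs while bit 5 is clear.
bool motorOn(int portValue) => (portValue & cassetteMotorBit) == 0;

// True when a write flips the motor state, i.e. when the tape should
// start or stop delivering pulses.
bool motorChanged(int oldValue, int newValue) =>
    motorOn(oldValue) != motorOn(newValue);
```

Writing 0x17 over 0x37 clears bit 5 and so switches the motor on, whereas a write that only touches the lower bank switching bits leaves the motor state unchanged.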

With this in mind, let us create the following method in our Tape class:

  @override
  setMotor(bool on) {
    if (on == _currentMotorOn) {
      return;
    }
    _currentMotorOn = on;
    if (on) {
      setupAlarms();
    } else {
      _tapeAlarm!.unlink();
      _remainingPulseTicks = _tapeAlarm?.getRemainingTicks();
    }
  }
This method will be invoked when we write to memory via our Memory class. We will deal with this plumbing later.

If the motor is switched on, we need to set up alarms. This is similar to what we did with timers in the previous post. Before we move on to the implementation of setupAlarms(), let's have a look at what happens in the else branch, when the motor is switched off. In that case we unlink the alarm from the list of alarms, and we set _remainingPulseTicks to the remaining ticks of the pulse. This ensures that when the motor resumes, we can carry on from where we left off in the pulse.

Now, let us look at setupAlarms():

  setupAlarms() {
    _tapeAlarm ??= alarms.addAlarm( (remaining) => processTapeAlarm(remaining));
    if (_tapeAlarm!.list == null) {
      alarms.reAddAlarm(_tapeAlarm!);
    }
    _tapeAlarm!.setTicks(_remainingPulseTicks);
  }

Here we see the actual use of _remainingPulseTicks, when the motor is resumed.

Let us now have a look at the method processTapeAlarm() :

  processTapeAlarm(int remaining) {
    interrupt.triggerInterrupt();
    populateRemainingPulses();
    _tapeAlarm!.setTicks(_remainingPulseTicks + remaining);
  }
This method is called when the pulse has expired. During this we trigger an interrupt and reschedule the next alarm.
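Adding remaining to the next pulse width looks like a way of keeping the timing exact when an alarm is serviced a few cycles late. The sketch below illustrates that idea under the assumption that remaining is zero or negative at expiry; the internals of the Alarm class are not shown in this post, so PulseCountdown and nextTicks are hypothetical stand-ins.

```dart
// Hypothetical countdown standing in for an alarm.
class PulseCountdown {
  int ticksLeft;

  PulseCountdown(this.ticksLeft);

  // Advances by [cycles]. Returns null while the pulse is still running,
  // or the zero-or-negative leftover once it has expired.
  int? advance(int cycles) {
    ticksLeft -= cycles;
    return ticksLeft <= 0 ? ticksLeft : null;
  }
}

// The next pulse is shortened by however late the previous one was serviced.
int nextTicks(int nextPulseWidth, int leftover) => nextPulseWidth + leftover;
```

If a 384-cycle pulse is serviced 3 cycles late, the leftover is -3 and a following 528-cycle pulse is scheduled for 525 cycles, so the error does not accumulate from pulse to pulse.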

Finally, there is one remaining method we need to implement:

  @override
  int getCassetteSense() {
    return _playSelected ? 0 : 0x10;
  }

This basically provides bit 4 of memory location 1, which will be used by our memory class. More on this later.

Changes to the CIA class

Let us now have a look at the changes required in our CIA class.

There are quite a few changes, so I will just cover them at a high level.

First of all, we will need to implement TimerB as well. The tape loading routine in the Kernal ROM uses this timer quite extensively. All I will say here is that it is basically a copy-and-paste exercise from TimerA.

Next, we will look at the method hasInterrupts(), which is used by our CPU class to trigger an interrupt:

  hasInterrupts() {
    if (timerAintOccurred && timerAinterruptEnabled) {
      return true;
    } else if (timerBintOccurred && timerBinterruptEnabled) {
      return true;
    } else if (tapeInterruptOccurred && tapeInterruptEnabled) {
      return true;
    } else {
      return false;
    }
  }

You will notice that I have included timerB interrupts and tape interrupts in the check as well.

Next, let us look at the setMem() function in the CIA class:

  setMem(int address, int value) {
...
      case 0xD:
        if ((value & 0x80) != 0) {
          timerAinterruptEnabled = ((value & 1) == 1) ? true : timerAinterruptEnabled;
        } else {
          timerAinterruptEnabled = ((value & 1) == 1) ? false : timerAinterruptEnabled;
        }
        if ((value & 0x80) != 0) {
          timerBinterruptEnabled = ((value & 2) == 2) ? true : timerBinterruptEnabled;
        } else {
          timerBinterruptEnabled = ((value & 2) == 2) ? false : timerBinterruptEnabled;
        }
        if ((value & 0x80) != 0) {
          tapeInterruptEnabled = ((value & 16) == 16) ? true : tapeInterruptEnabled;
        } else {
          tapeInterruptEnabled = ((value & 16) == 16) ? false : tapeInterruptEnabled;
        }
...
  }

As you might remember, register D in the CIA is the interrupt mask register. Here we have added timerB and the tape interrupt as interrupts we can enable or mask out.
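The three if/else pairs all follow the same rule: bit 7 of the written value chooses between setting and clearing, and the low bits choose which mask bits are affected. If the mask were kept as a single integer instead of separate booleans, the rule could be sketched like this (applyMaskWrite is a hypothetical helper, not part of the emulator source):

```dart
// Applies a write to the interrupt mask. Bit 7 of [value] selects set (1)
// or clear (0) for every mask bit that is 1 in the low five bits of [value].
// Mask bits that are 0 in [value] are left untouched.
int applyMaskWrite(int mask, int value) {
  final bits = value & 0x1f;
  return (value & 0x80) != 0 ? (mask | bits) : (mask & ~bits & 0x1f);
}
```

Starting from an empty mask, writing 0x81 enables timer A, writing 0x90 then also enables the tape interrupt (mask 0x11), and writing 0x01 disables timer A again while leaving the tape interrupt enabled (mask 0x10).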

Finally, let us look at the getMem() method:

  int getMem(int address) {
...
  case 0xD:
        var value = 0;
        if (timerAintOccurred) {
          timerAintOccurred = false;
          value = value | 0x81;
        }
        if (timerBintOccurred) {
          timerBintOccurred = false;
          value = value | 0x82;
        }
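        if (tapeInterruptOccurred) {
          tapeInterruptOccurred = false;
          // The tape signal arrives on the FLAG pin, which is bit 4 (0x10),
          // the same bit used for the mask in setMem().
          value = value | 0x90;
        }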
        return value;
    }
...
}

Here we are reading the same register from earlier, but reading doesn't return the masks, but the actual interrupts that occurred. Once again, we have added timerB and the tape interrupt. Once this register has been read, we also clear all occurred interrupts.
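The read-and-clear behaviour can be sketched on its own with a single integer of pending flags. InterruptFlags is a hypothetical class for illustration. It uses bit 0 for timer A, bit 1 for timer B and bit 4 for the FLAG pin (matching the mask bits in setMem()), and, like the code above, it sets bit 7 whenever anything is pending.

```dart
// Hypothetical holder for the "interrupt occurred" flags.
class InterruptFlags {
  int _pending = 0;

  // Marks an interrupt source as having fired, e.g. raise(0x10) for FLAG.
  void raise(int bit) {
    _pending |= bit;
  }

  // Returns the pending flags, with bit 7 set if any source fired,
  // and clears them so that the next read returns 0.
  int read() {
    final value = _pending == 0 ? 0 : (_pending | 0x80);
    _pending = 0;
    return value;
  }
}
```

After raising timer A and FLAG, the first read returns 0x91 and the second read returns 0, which is why an interrupt handler has to keep the value it read if it wants to test more than one bit.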

Changes to the Memory Class

Let us now have a look at the changes required in our memory class, for implementing tape loading.

First change is in the setMem() method:

  setMem(int value, int address ) {
    if ((address >> 8) == 0xDC) {
      cia1.setMem(address, value);
    } else if (address == 1) {
      _ram.setInt8(address, value);
      _tape.setMotor((value & 0x20) == 0 );
    } else {
      _ram.setInt8(address, value);
    }
  }
So, as mentioned earlier, bit 5 of memory location 1 controls the tape motor. Here we implement it, so that during a memory write to this location, we call setMotor appropriately.

Next, let us change the getMem() method:

  int getMem(int address) {
    _readCount++;
    if (address >= 0xA000 && address <= 0xBFFF) {
      return _basic.getUint8(address & 0x1fff);
    } else if (address >= 0xE000 && address <= 0xFFFF) {
      return _kernal.getUint8(address & 0x1fff);
    } else if (address == 0xD012) {
      return (_readCount & 1024) == 0 ? 1 : 0;
    } else if ((address >> 8) == 0xDC ) {
      return cia1.getMem(address);
    } else if (address == 1) {
      var value = _ram.getUint8(address) & 0xef;
      return value | _tape.getCassetteSense();
    } else {
      return _ram.getUint8(address);
    }
  }

Here we add the Cassette sense bit when reading the byte from memory location one. As mentioned previously, the cassette sense bit indicates if we pressed the play button.
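The masking with 0xef followed by the OR can be isolated into a tiny function to make the bit manipulation easier to follow. readPort is a hypothetical name; the sense values are the ones returned by getCassetteSense(), namely 0 while play is pressed and 0x10 otherwise.

```dart
// Combines the stored value of memory location 1 with the live sense bit.
int readPort(int stored, bool playPressed) {
  final sense = playPressed ? 0 : 0x10;
  // Drop whatever was stored in bit 4 and substitute the live value.
  return (stored & 0xef) | sense;
}
```

With a stored value of 0x37, the read yields 0x27 while play is pressed and 0x37 when it is not, regardless of what was last written to bit 4.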

The results

With everything coded, let us see what the screen looks like when we spin up our emulator. At startup, our screen looks like this:


Notice we have two new icons at the top, a folder icon and a play button. We use the folder icon to locate the tape image file from our local file system. Once we have selected a tape image, the play button becomes enabled.

The play button actually resembles the play button on a real Datasette unit hooked up to a C64. So, when the screen shows "Press Play on tape" and you hit the play button, the loading process commences.

Let's do the whole sequence. With the tape image attached, type LOAD at the flashing cursor, and then hit ENTER. Your screen will now look like this:

Now press the play button next to the folder button.

With the play button pressed, the following prompts will pop up:


After a number of seconds, the screen will look like this:


This is the hooray moment. When you see FOUND DAN DARE, or whatever the file name of the tape image you used is, you know you have implemented the tape loading correctly.

One thing that immediately felt off when testing the tape loading, was that it felt much longer than usual before it showed "FOUND...". So I did some comparative benchmarks.

First, to get a realistic time, I measured how long it takes to find the file in the VICE C64 emulator. It was about 17 seconds.

Then I did the measurement in my Flutter emulator. In my Emulator, it took 24 seconds. Quite a lot slower!

I investigated further. After lots of pain, I discovered that the speed issue was caused by not building the app in release mode. I had made the subtle assumption that if I start it in IntelliJ with the Play button and not with the Debug button, everything would be optimised. Was I wrong!

Let us see how to run our project in release mode. Firstly, open a terminal window and cd into your project folder. Then run the following command:

flutter build web --release
After the build is finished, you will find the result in build/web within the project. cd into this folder. We now need a web server to serve it, and the easiest one to use is the one built into Python. So, within the build/web folder, run the following command:

python -m http.server 8000
Now access the emulator in the browser with http://localhost:8000/

This time around our times match up with tape loading.

While I was trying to figure out why my emulator was slow, I also discovered that memory usage was steadily climbing. When I fixed the speed issue with the release build, I wondered if the memory leak was also fixed. So, I left it running for about half an hour, and then hovered over the tab in Chrome to see the memory usage, and sadly the memory leak was still there:


If you leave it running longer, it will eventually go over 1 GB of memory usage.

In the next post we will tackle this issue.

In Summary

In this post we implemented Tape image loading. An unfortunate issue I encountered was a memory leak.

In the next post I will see if I can fix this memory leak.

Until next time!