Monday, 14 October 2019

Implementing Sprites: Part 3

Foreword

In the previous post we implemented the capability for our sprite to expand in both the X and Y directions. We also implemented sprite multicolor mode.

Up to now our VIC-II has only supported a single sprite, Sprite 0. So, in this post we will be connecting the remaining seven sprites.

With our VIC-II module able to display all eight sprites, we would be able to fully play the game Dan Dare with our emulator.

This would indeed be a very nostalgic moment for me, but it also raised a bit of a concern. If one is going to play for extended periods on the Zybo with our emulator, wouldn't the Zynq SoC eventually get very hot?

My concern was driven by the fact that these days you find quite a number of videos on the Internet concerning cooling solutions for single board computers. For the Zybo board, however, you cannot really find any information regarding what kind of temperatures to expect during general use of the board.

So, I will end off this post by sharing what I have found by experimentation regarding the temperature of the Zynq when running our emulator for half an hour or so.

Hooking up the remaining sprites

Currently we only have a single instance of sprite_generator for sprite 0. Let us start by adding instances for the remaining sprites. For simplicity, I am only showing the declarations for the first three:

sprite_generator sprite_0(
  .clk_in(clk_in),
  .raster_y_pos(y_pos - 5),
  .raster_x_pos(x_pos - 16),
  .sprite_x_pos({sprite_msb_x[0],sprite_0_xpos}),
  .sprite_y_pos(sprite_0_ypos),
  .store_byte(store_sprite_pixel_byte && sprite_data_region_offset[6:4] == 0),

  .x_expand(x_expand[0]),
  .y_expand(y_expand[0]),
  .multi_color_mode(multi_color_mode[0]),
  .sprite_multi_0(sprite_multi_color_0),
  .sprite_multi_1(sprite_multi_color_1),
  .primary_color(sprite_primary_color_0),

  .data(data_in[7:0]),
  .sprite_enabled(sprite_enabled[0]),
  .show_pixel(show_pixel_sprite_0),
  .output_pixel(out_pixel_sprite_0),
  .request_data(),
  .request_line_offset(sprite_0_offset)
    );

sprite_generator sprite_1(
  .clk_in(clk_in),
  .raster_y_pos(y_pos - 5),
  .raster_x_pos(x_pos - 16),
  .sprite_x_pos({sprite_msb_x[1],sprite_1_xpos}),
  .sprite_y_pos(sprite_1_ypos),
  .store_byte(store_sprite_pixel_byte && sprite_data_region_offset[6:4] == 1),

  .x_expand(x_expand[1]),
  .y_expand(y_expand[1]),
  .multi_color_mode(multi_color_mode[1]),
  .sprite_multi_0(sprite_multi_color_0),
  .sprite_multi_1(sprite_multi_color_1),
  .primary_color(sprite_primary_color_1),

  .data(data_in[7:0]),
  .sprite_enabled(sprite_enabled[1]),
  .show_pixel(show_pixel_sprite_1),
  .output_pixel(out_pixel_sprite_1),
  .request_data(),
  .request_line_offset(sprite_1_offset)
    );

sprite_generator sprite_2(
  .clk_in(clk_in),
  .raster_y_pos(y_pos - 5),
  .raster_x_pos(x_pos - 16),
  .sprite_x_pos({sprite_msb_x[2],sprite_2_xpos}),
  .sprite_y_pos(sprite_2_ypos),
  .store_byte(store_sprite_pixel_byte && sprite_data_region_offset[6:4] == 2),

  .x_expand(x_expand[2]),
  .y_expand(y_expand[2]),
  .multi_color_mode(multi_color_mode[2]),
  .sprite_multi_0(sprite_multi_color_0),
  .sprite_multi_1(sprite_multi_color_1),
  .primary_color(sprite_primary_color_2),

  .data(data_in[7:0]),
  .sprite_enabled(sprite_enabled[2]),
  .show_pixel(show_pixel_sprite_2),
  .output_pixel(out_pixel_sprite_2),
  .request_data(),
  .request_line_offset(sprite_2_offset)
    );


This is a typical copy and paste exercise. However, some of the ports are specific to the sprite itself. Here is a list of these ports:

  • sprite_x_pos/ sprite_y_pos
  • store_byte
  • x_expand/y_expand
  • multi_color_mode
  • primary_color
  • sprite_enabled
  • show_pixel/output_pixel
  • request_line_offset

Some of these ports get their value from a particular bit position in a shared register, whereas the other ports in this list have their own dedicated registers.
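As a small illustration of the difference between the two kinds of ports, here is a minimal software sketch in C. The helper names sprite_flag and sprite_x_pos are hypothetical and not part of our design; they merely mirror what the Verilog expressions x_expand[n] and {sprite_msb_x[n], sprite_n_xpos} do.

```c
#include <stdint.h>

/* Hypothetical helper: pick the bit belonging to sprite n out of a
   shared register, like x_expand[n] or sprite_enabled[n]. */
unsigned sprite_flag(uint8_t reg, unsigned n)
{
    return (reg >> n) & 1u;
}

/* Hypothetical helper: build the 9-bit X position of sprite n from its
   dedicated 8-bit X register and its bit in the shared MSB register,
   like {sprite_msb_x[n], sprite_n_xpos}. */
uint16_t sprite_x_pos(uint8_t sprite_msb_x, uint8_t xpos, unsigned n)
{
    return (uint16_t)((sprite_flag(sprite_msb_x, n) << 8) | xpos);
}
```

For example, with an MSB register value of 0x04 and an X register of 0x10, sprite 2 ends up at X position 0x110, whereas the same X register for sprite 1 gives 0x010.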


Let us have a look at the output ports. The first port is request_line_offset, which we have wired to sprite_0_offset through sprite_7_offset. We use these signals as follows:

always @*
  case (sprite_data_region_offset[6:4])
    3'd0: sprite_offset = sprite_0_offset;
    3'd1: sprite_offset = sprite_1_offset;
    3'd2: sprite_offset = sprite_2_offset;
    3'd3: sprite_offset = sprite_3_offset;
    3'd4: sprite_offset = sprite_4_offset;
    3'd5: sprite_offset = sprite_5_offset;
    3'd6: sprite_offset = sprite_6_offset;
    3'd7: sprite_offset = sprite_7_offset;
  endcase

always @*
  if (!sprite_data_region && (clk_counter == 6 || clk_counter == 7))
    addr = bit_data_pointer;
  else if (sprite_data_region && (sprite_data_region_offset[3:0] < 3))
    addr = {mem_pointers[7:4], 7'h7f, sprite_data_region_offset[6:4]};
  else if (sprite_data_region)
    addr = {sprite_data_location, (sprite_offset + sprite_byte_num)};
  else
    addr = {mem_pointers[7:4], screen_mem_pos};

So, we use the applicable sprite_offset when it is the data cycle for a particular sprite.
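To see what the concatenation for the sprite pointer fetch works out to, here is a minimal sketch in C that models it. The function names are hypothetical and only exist for this illustration. I am also assuming that the lower four bits of sprite_data_region_offset count the cycles within the slot of a sprite, which is how the code above appears to use them.

```c
#include <stdint.h>

/* Hypothetical helper: sprite_data_region_offset[6:4], the sprite
   whose data cycle it currently is. */
unsigned sprite_index(uint8_t region_offset)
{
    return (region_offset >> 4) & 0x7u;
}

/* Hypothetical helper: sprite_data_region_offset[3:0], assumed to be
   the cycle within that sprite's slot. */
unsigned slot_cycle(uint8_t region_offset)
{
    return region_offset & 0xFu;
}

/* Models {mem_pointers[7:4], 7'h7f, sprite}: 4 + 7 + 3 = 14 bits.
   This is the screen memory base plus 0x3F8 plus the sprite number. */
uint16_t sprite_pointer_addr(uint8_t mem_pointers, unsigned sprite)
{
    uint16_t screen_base = (uint16_t)(((mem_pointers >> 4) & 0xFu) << 10);
    return (uint16_t)(screen_base | (0x7Fu << 3) | (sprite & 0x7u));
}
```

With the upper nibble of mem_pointers set to 1, screen memory starts at 0x0400, and the sketch gives 0x07F8 for sprite 0 and 0x07FF for sprite 7. These are the familiar sprite pointer locations at the end of the default screen memory.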

We now sit with a show_pixel/output_pixel pair for each sprite. We combine these as follows:

always @*
   if (show_pixel_sprite_0)
     color_for_bit_with_sprite = out_pixel_sprite_0;
   else if (show_pixel_sprite_1)
     color_for_bit_with_sprite = out_pixel_sprite_1;
   else if (show_pixel_sprite_2)
     color_for_bit_with_sprite = out_pixel_sprite_2;
   else if (show_pixel_sprite_3)
     color_for_bit_with_sprite = out_pixel_sprite_3;
   else if (show_pixel_sprite_4)
     color_for_bit_with_sprite = out_pixel_sprite_4;
   else if (show_pixel_sprite_5)
     color_for_bit_with_sprite = out_pixel_sprite_5;
   else if (show_pixel_sprite_6)
     color_for_bit_with_sprite = out_pixel_sprite_6;
   else if (show_pixel_sprite_7)
     color_for_bit_with_sprite = out_pixel_sprite_7;
   else
     color_for_bit_with_sprite = color_for_bit;

   assign color_for_bit = multicolor_data ? multi_color :    
            (pixel_shift_reg[7] == 1 ? char_buffer_out_delayed[11:8] : background_color);
   assign final_color = (visible_vert & visible_horiz & screen_enabled) ? color_for_bit_with_sprite : border_color;


For now we simply assume that all sprites are in front of the main graphics, and we implement only the hardcoded priority between sprites, where the lower the sprite number, the higher the priority.

When we run our implementation on the Zybo board with these changes, it looks very promising: our characters have finally appeared!

One small thing doesn't look right though. Our characters are always in front of everything! They appear in front of rocks. Also, when we walk underwater, using a reed as a snorkel, only the reed should be visible. This is not the case with our emulator in its current state:


We see Dan Dare, his pet, and the Snorkel!

OK, I agree, this shouldn't come as a surprise, since we implemented sprites to always be visible in front of the background graphics.

Fine tuning Sprite display priority

There are a couple of sprite priority features that should be implemented before our game screen can render correctly.

The first is the priority set by the Sprite priority register at address D01B. Firstly we need to implement this register in our VIC-II so that it can be written to or read by the 6502. This is similar to the other registers we have implemented.

We use this register as follows:

always @*
   if (show_pixel_sprite_0 && !sprite_priority[0])
     color_for_bit_with_sprite = out_pixel_sprite_0;
   else if (show_pixel_sprite_1 && !sprite_priority[1])
     color_for_bit_with_sprite = out_pixel_sprite_1;
   else if (show_pixel_sprite_2 && !sprite_priority[2])
     color_for_bit_with_sprite = out_pixel_sprite_2;
   else if (show_pixel_sprite_3 && !sprite_priority[3])
     color_for_bit_with_sprite = out_pixel_sprite_3;
   else if (show_pixel_sprite_4 && !sprite_priority[4])
     color_for_bit_with_sprite = out_pixel_sprite_4;
   else if (show_pixel_sprite_5 && !sprite_priority[5])
     color_for_bit_with_sprite = out_pixel_sprite_5;
   else if (show_pixel_sprite_6 && !sprite_priority[6])
     color_for_bit_with_sprite = out_pixel_sprite_6;
   else if (show_pixel_sprite_7 && !sprite_priority[7])
     color_for_bit_with_sprite = out_pixel_sprite_7;
   else if (pixel_shift_reg[7])
     color_for_bit_with_sprite = color_for_bit;
   else if (show_pixel_sprite_0)
     color_for_bit_with_sprite = out_pixel_sprite_0;
   else if (show_pixel_sprite_1)
     color_for_bit_with_sprite = out_pixel_sprite_1;
   else if (show_pixel_sprite_2)
     color_for_bit_with_sprite = out_pixel_sprite_2;
   else if (show_pixel_sprite_3)
     color_for_bit_with_sprite = out_pixel_sprite_3;
   else if (show_pixel_sprite_4)
     color_for_bit_with_sprite = out_pixel_sprite_4;
   else if (show_pixel_sprite_5)
     color_for_bit_with_sprite = out_pixel_sprite_5;
   else if (show_pixel_sprite_6)
     color_for_bit_with_sprite = out_pixel_sprite_6;
   else if (show_pixel_sprite_7)
     color_for_bit_with_sprite = out_pixel_sprite_7;
   else
     color_for_bit_with_sprite = color_for_bit;

Within this snippet of code, you will spot another implied priority by means of the check for pixel_shift_reg[7].

So, if this background pixel has a bit value of zero, it is actually transparent, allowing the sprites with background priorities to show through.

Obviously, if there is no visible sprite pixel at all, with either back or front priority, we will show the applicable background color.
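The three priority levels of the snippet above can also be written down compactly in software. What follows is only a minimal sketch in C that mirrors the if/else chain; the struct pixel_inputs and the functions lowest_set_bit and resolve_pixel are hypothetical names for this illustration and do not exist in our design.

```c
#include <stdint.h>

/* Hypothetical bundle of the signals used by the Verilog priority chain. */
typedef struct {
    uint8_t show_mask;       /* bit n = show_pixel_sprite_n */
    uint8_t priority_reg;    /* bit n set = sprite n sits behind foreground */
    uint8_t sprite_color[8]; /* out_pixel_sprite_n */
    uint8_t foreground;      /* pixel_shift_reg[7] */
    uint8_t color_for_bit;   /* color of the main graphics */
} pixel_inputs;

/* The lower the sprite number, the higher the priority. */
static int lowest_set_bit(uint8_t mask)
{
    for (int n = 0; n < 8; n++)
        if (mask & (1u << n))
            return n;
    return -1;
}

uint8_t resolve_pixel(const pixel_inputs *in)
{
    /* 1. Sprites with front priority win outright. */
    int n = lowest_set_bit(in->show_mask & (uint8_t)~in->priority_reg);
    if (n >= 0)
        return in->sprite_color[n];

    /* 2. A foreground pixel hides the sprites that are left. */
    if (in->foreground)
        return in->color_for_bit;

    /* 3. Otherwise a sprite with back priority shows through. */
    n = lowest_set_bit(in->show_mask);
    if (n >= 0)
        return in->sprite_color[n];

    /* 4. No sprite pixel at all. */
    return in->color_for_bit;
}
```

Writing it this way makes it easier to see that the long chain is really two passes of the same lowest-number-wins selection, with the foreground check sitting in between.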

There is a very interesting scenario when our main graphics is in multicolor mode. In multicolor mode the high order bit of each two-bit pixel indicates whether it is a foreground pixel or a background pixel. This means that we can have two possible background colors in multicolor mode, pixel values 00 and 01.

Having two background colors enables us to have a sprite that is sometimes hidden behind some objects and in front of others.
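The rule for deciding whether a pixel of the main graphics counts as foreground can be sketched in a few lines of C. The function is_foreground is a hypothetical name used purely for illustration, and pixel_bits is assumed to hold either the single pixel bit (hires) or the two-bit pixel value (multicolor).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does this pixel of the main graphics hide a
   sprite that has back priority? */
bool is_foreground(bool multicolor, uint8_t pixel_bits)
{
    if (multicolor)
        return (pixel_bits & 0x2u) != 0; /* 10 and 11 are foreground */
    return (pixel_bits & 0x1u) != 0;     /* a set bit is foreground */
}
```

So in multicolor mode, pixel values 00 and 01 both let a back-priority sprite show through, even though they are drawn in different colors.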

The following video shows how the game screen now looks with the recent round of changes:


This time around our emulator renders the scene more realistically. We go behind the rocks and are not visible when we go underwater.

This is actually the great nostalgic moment that this whole series of Blog posts was about!

Temperatures on the Zynq SoC

I mentioned in the beginning of this post that I have a bit of a concern about the temperature of the Zynq SoC when you are using it for extended periods of time.

One of the key reasons for this concern is our USB stack program that runs on one of the ARM cores, which catches keystrokes from the USB keyboard and sends them to our emulator hosted within the FPGA. To get an overall context, here is the main method of our USB stack program:

int main()
{
    Xil_DCacheDisable();
    init_platform();
    initint();
    initUsb();
    status = 0;
    state_machine();
    usleep(100000000);
    cleanup_platform();
    return 0;
}

We do some initialisation, and then we sleep for a long period of time (which in this case is 100 seconds). This sleep is necessary so that main does not return straight away and end our program.

The code that does the actual work is the method state_machine, which is invoked every 10 milliseconds by a timer interrupt.

It should be noted that this program is running in standalone mode, and the usleep library call is implemented using busy waiting.

With busy waiting your CPU runs at full speed, checking in a loop for something to happen, which in this case is for 100 seconds to pass.

As we know with busy waiting, your CPU is effectively running at 100% utilisation all the time, which uses more energy and produces more heat.
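To make the idea concrete, here is a minimal host-side sketch of a busy-wait delay in plain C. This is not how the Xilinx usleep is actually implemented; busy_wait_ms is a hypothetical helper built on the standard clock() call, and it simply counts how many times the loop spins while waiting.

```c
#include <time.h>

/* Hypothetical helper: spin until roughly `ms` milliseconds of CPU
   time have passed, and return how many loop iterations that took. */
unsigned long busy_wait_ms(unsigned ms)
{
    const clock_t ticks = (clock_t)ms * CLOCKS_PER_SEC / 1000;
    const clock_t start = clock();
    unsigned long spins = 0;

    while (clock() - start < ticks)
        spins++; /* no useful work, yet the core never idles */

    return spins;
}
```

Even a delay of a few milliseconds will typically take a large number of iterations, all of which the core spends doing nothing useful.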

So, how much heat will be produced by the above program when we run it for about half an hour?

Vivado provides some tools for us to answer this question. On the Hardware dashboard, temperature is one of the probes you can add.

When I started the emulator on the Zybo board, the temperature was around 51°C. Within minutes the temperature had risen to about 54°C.

After about half an hour, the temperature settled at about 58°C.

This wasn't as bad as I had expected. For interest's sake, I was wondering whether you could do some overclocking on the Zybo.

After some fiddling with the settings in the Vivado Block design, it doesn't really look like there are any real overclocking options. The only option that I could see was to set the frequency of an ARM core to somewhere between 50MHz and 667MHz.

So, in short, it doesn't look like using the Zybo board for long periods would cause any kinds of overheating.

Also, busy waiting didn't appear to be a big issue after all. However, I was still wondering what kind of temperature difference it would make if we could avoid busy waiting.

On ARM processors, the instruction WFI (wait for interrupt) is provided for this purpose. As per the documentation on ARM's web site:

WFI (Wait For Interrupt) makes the processor suspend execution (Clock is stopped) until one of the following events take place:


  • An IRQ interrupt
  • An FIQ interrupt
  • A Debug Entry request made to the processor.

So, in our case when we call WFI, our CPU would freeze until our timer interrupt fires externally:

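int main()
{
    Xil_DCacheDisable();
    init_platform();
    initint();
    initUsb();
    status = 0;
    state_machine();
    /* Sleep until an interrupt fires. Once it has been serviced,
       loop back and sleep again. */
    for (;;) {
        __asm__ volatile ("wfi");
    }
    cleanup_platform(); /* never reached */
    return 0;
}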

Here we have added some inline assembly for invoking wfi. It should be remembered that once an external interrupt has occurred and has been serviced, code execution will continue just after the wfi instruction.

It is therefore important that we loop back to the wfi instruction. If we don't, our main method will run to completion.

When we monitor the temperature while using the WFI method, the Zynq definitely runs cooler. During this run I saw a temperature between 52°C and 53°C. About a 5 degree difference!

In Summary

In this post we implemented all eight sprites within our VIC-II module. We also implemented the different priorities between Sprites and the Background.

This indeed brought us to the point where we could fully play the game Dan Dare within our emulator.

With this we are nearing the end of this Blog Series. There is, however, one more thing I would like to do, and this is to see if it is possible to add sound to the emulator.

So, in the next post we will start to implement sound.

Till next time!

2 comments:

  1. Once again, very interesting; also the part "CPU-heating while waiting";-)
    I've got Terrasic DE0-nano (cyclone 4) board, although I need to check, I expect no ARM SW-IP on it. Will see how far we come in porting your code to this FPGA.

    Looking forward seeing the progress on your last task!

    Replies
    1. Good luck with your endeavour.

      Would be nice to read a write-up on your progress :-)
