ZalaXa 13 — You Gotta Roll With It

In the last tutorial I looked at creating a randomized starfield for the main menu. Since then I’ve been playing around with techniques to scroll it vertically, like the starfield on the Galaga arcade machine.

I found a bug in the code I wrote. Instead of using a pair of random(ish) numbers for the X and Y coordinates of each star, I was actually using the Y coordinate of the previous star for the X coordinate of the next one. This was causing unnecessary vertically-repeating patterns.
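To make that bug concrete, here is a rough Python model (Python rather than Z80, purely so it can be run on a desktop). It assumes the part 12 loop read two bytes per star but only moved its read pointer along by one byte each time, which is what that listing appears to do. The byte values and the read_stars helper are made up for illustration.

```python
def read_stars(table, count, stride):
    """Read `count` (first, second) byte pairs, moving `stride` bytes per star."""
    return [(table[i * stride], table[i * stride + 1]) for i in range(count)]

# Hypothetical stand-in for a run of ROM bytes.
table = [0x3E, 0xA7, 0x19, 0xC4, 0x52, 0x0B, 0xF0, 0x6D]

overlapping = read_stars(table, 4, stride=1)   # the buggy behaviour
independent = read_stars(table, 4, stride=2)   # two fresh bytes per star

# With a stride of 1, each star's second byte is reused as the next star's first byte.
for this_star, next_star in zip(overlapping, overlapping[1:]):
    assert this_star[1] == next_star[0]
```

With a stride of 2 the pairs no longer share a byte, so one star's position stops leaking into the next.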

After fiddling around with this for a while, I decided to use a third randomish number to represent the star’s position within the character. This leaves me with some spare bits I can use for attribute colours without introducing more repeating patterns.

I also added a seed randomizer in my test routine—instead of advancing the ROM seed sequence by one every time you press a key, I completely randomized it to any address between $0000 and $3FFF. I used Patrik Rak’s Complementary Multiply With Carry pseudo-random number generator again for this.

Same deal as last time, except this time it scrolls vertically:

Most of the code is pretty similar to last time. The section that draws the stars is duplicated. In the first iteration, the part that creates the set n, a opcode now creates a res n, a instruction instead, which undraws the previous star.

; stars.asm
 
                        ld a, (hl)                      ; Read the byte representing the star pixel,
                        or %10000111                    ;      turn it into a "res n, a" instruction ($CB 10nnn111) by setting
                        and %10111111                   ;      and clearing bits,
                        ld (ResetBit), a                ; SMC> then write this into the pixel-drawing code.
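As a sanity check on that bit-twiddling, here is a small Python model of the or/and pair. It is only a sketch of the arithmetic, not the Z80 code itself, and res_opcode is a name invented for the example. The second byte of res n, a is %10nnn111, which is $87 plus eight times the bit number.

```python
def res_opcode(star_byte):
    """Second byte of `res n, a`: %10nnn111, with nnn kept from bits 3..5."""
    forced_on = star_byte | 0b10000111      # or %10000111
    return forced_on & 0b10111111           # and %10111111 (clear bit 6)

# Every possible input byte ends up as one of the eight `res n, a` opcodes.
for star_byte in range(256):
    n = (star_byte >> 3) & 7
    assert res_opcode(star_byte) == 0x87 + 8 * n
```

The only difference from the set n, a form is bit 6, which is why the same star byte can drive both the draw and the undraw.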

In the second iteration, we keep the set n, a instruction as it was in part 12. But before we draw the star pixel, we manipulate the Y coordinate by including the ScrollStar macro:

; macros.asm
 
ScrollStar              macro()
                        ld a, d                         ; Retrieve the byte containing bits 0..2 of the Y coordinate.
                        and %00000111                   ; Calculate Y mod 8 (0..7).
                        cp %00000111                    ; If 7 then
                        jp z, CharScroll                ;   do a character scroll.
                        ld a, d                         ; Otherwise do
                        inc a                           ;   a pixel
                        ld d, a                         ;   scroll.
                        inc (hl)                        ; Save px-scrolled offset back to StarField.Table.
                        jp Continue
CharScroll:
                        ld a, e                         ; Retrieve the byte containing bits 3..5 of the Y coordinate.
                        add 32                          ; Increment those three bits by one
                        ld e, a                         ;   and keep the result in e. This moves into the next char row.
                        dec l                           ; Step back to the previous table byte
                        ld (hl), a                      ;   and save the new char row there.
                        inc l                           ; Return to the byte containing bits 0..2 of the Y coordinate,
                        xor a                           ;   and reset it to zero, ensuring we start at the top of that char row.
                        ld d, a                         ; Save that back to d, and also
                        ld (hl), a                      ;   save the px-scrolled offset back to StarField.Table.
Continue:
mend
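If it helps to see the same decision outside of assembly, here is a Python sketch of what the macro does to the two offset bytes. It assumes d only ever holds the three low Y bits (the offset is constrained to 0..2047) and ignores the table write-back. The function names are invented for the example.

```python
def scroll_star(d, e):
    """Move a star one pixel row down within a screen third.

    d holds Y bits 0..2 (pixel row inside the character row);
    e holds Y bits 3..5 in its top three bits and the column in the rest.
    """
    if (d & 0b111) == 0b111:        # last pixel row of this character row
        e = (e + 32) & 0xFF         # next character row; the carry is dropped
        d = 0                       # start at the top of that row
    else:
        d += 1                      # plain pixel scroll
    return d, e

def y_in_third(d, e):
    """Decode the 0..63 Y position within the third from the two bytes."""
    return ((e >> 5) << 3) | (d & 0b111)

d, e = 0, 0b00001010                # column 10, top row of the third
for expected_y in range(1, 65):
    d, e = scroll_star(d, e)
    assert y_in_third(d, e) == expected_y % 64
    assert (e & 0b11111) == 10      # the column never changes
```

After 64 steps the star is back where it started, which is the wrap-around within a third that the scroll relies on.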

Doing this as a macro isn’t really necessary at all, but when reading the code, it does emphasise that the pixel-setting half is almost the same as the pixel-clearing half. It achieves this by abstracting the extra code into a different place—without any of the call/ret/jp overhead a separate routine would entail.

The stuff in the macro decides whether we’re scrolling within a character boundary, or scrolling between character boundaries. Then it sets the appropriate bits of the Y coordinate accordingly. The specific maths here is needed because of how the Y coordinate is distributed within the screen address:

Image from the L Break Into Program blog, with thanks to Dean Belfield.

In the Spectrum’s distinctive display file, the bits labelled Y0, Y1 and Y2 govern the position (0..7) within a character row. We set these bits in the first half of the macro.

The bits labelled Y3, Y4 and Y5 govern the position of the character row (0..7) within the current screen third. We set these bits in the second half of the macro.

The bits labelled Y6 and Y7 govern which screen third the character row appears in. Because we’re repeating the same stars in all three thirds, we set these bits in the SetupStars routine:

; stars.asm
 
                        ld b, 8                         ; Fast way of doing ld bc, $0800 (size of a screen third).
                        add hl, bc                      ; hl is now a pixel address in the middle screen third.
                        ld (hl), a                      ; Draw the same single pixel star here, too.
                        add hl, bc                      ; hl is now a pixel address in the bottom screen third.
                        ld (hl), a                      ; Draw the same single pixel star here, too.

Actually, adding $0800 to the screen address using the add hl, bc instruction is an inefficient way of doing this. All we really need is to add 8 to h. Or even better, flip one or two bits of h. I might optimise this part another time.
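Here is a quick Python check of that claim, as a sketch rather than the eventual optimisation. It assumes h starts somewhere in $40..$47, which is where the high byte of a top-third address sits. The helper name is invented for the example.

```python
def next_third_add(h):
    """Add $0800 to the address by adding 8 to its high byte."""
    return (h + 8) & 0xFF

TOP = 0x40                                # high byte where the top third starts

for row in range(8):                      # h can be $40..$47 in the top third
    top = TOP + row
    middle = top ^ 0b00001000             # flip one bit: top -> middle
    bottom = middle ^ 0b00011000          # flip two bits: middle -> bottom
    assert middle == next_third_add(top)
    assert bottom == next_third_add(middle)
    assert (top << 8) + 0x0800 == middle << 8
```

So the first hop is a single bit set, and the second hop is one bit cleared and one bit set, with no 16-bit add needed.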

I made a list of 32 of my favourite starfields using this new version of the randomizer test routine. I particularly like how it tends towards drifts of stars that look like constellations.

This is my favourite starfield (seed address $26B1), complete with the scroll effect:

Next time, I’ll explore setting attribute colours to make the stars twinkle. Cheers!

ZalaXa 12 — Towards a Scrolling Starfield

The technology in Namco’s arcade version of Galaga (1981) is from the same era as the ZX Spectrum, but is actually a little beefier. The main board has three Z80 processors clocked at 3MHz, each one running at just under the speed of the Spectrum’s single processor! Sadly, I think that rules out being able to code an authentic arcade version of Galaga in this tutorial series. In particular, if you recall, NIRVANA+’s multicolour raster-chasing technique uses about 78% of the available processor T-states we normally have when writing Spectrum programs. That said, this should still be a fun exercise in pushing the machine to its limits 🙂

Arcade Galaga has a rather nice winking multicolour starfield scrolling down the screen during gameplay. This isn’t really necessary for the game, and won’t demonstrate any NIRVANA+ techniques, but I am quite attached to the effect, so I thought I’d take a short detour and try and get this working on the start menu.

I did some rough calculations, and decided we need about 100 stars, give or take. My first thought was to write a short BASIC program to prototype this, using RND and PLOT in a loop, but I decided to do it in machine code, which I find a bit easier.

My first attempt used Patrik Rak’s 8-bit Complementary-Multiply-With-Carry (CMWC) random number generator to generate a table of random bytes. This proved the concept nicely, but then I remembered people often use the ZX Spectrum ROM as a source of weakly-random numbers. Surely this would be good enough for my purposes, as it’s only a few stars. The trick would be to pick a short sequence of numbers from somewhere in the ROM that looks natural enough. We have 16KB to play with (more on the later models), but let’s try to limit ourselves to something that will work and give the same results on all Spectrum models.

Thinking further, I realised the vertical scrolling will be the trickiest part of the problem. The Spectrum screen is laid out in vertical thirds, and having the pixels jump between third-boundaries will complicate things. For our purposes, though, we should be able to duplicate the same stars across all three. If each third wraps around, then individual stars will seem to scroll continuously into the next third at the same time as wrapping round. Simply brilliant!!

Better still, we might be able to mitigate any obvious repetitive patterns, by using different attribute colours for each third. The entire starfield will appear to be shimmering because of the colour effects, and that should provide some distraction.

I came up with a routine that does this, noting the “seed” ROM address on the debug screen for reference. It waits till you press a key, then adds one byte to the seed address and does it all over again. Like this:

Excellent! There’s a bit of flicker in the middle third, but that’s just a side effect of the way I coded the keypress and screen re-clear loop.

It turns out that most of the seed values are not very random at all. But, persevering, I found a few seed values that gave fairly uniform results, without any ugly double pixels or tight bunchings. This was the most promising one, $03F3:

Not perfect, but good enough.

I’ll develop this a bit further in the next tutorial, but let’s finish up by looking at my code:

3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
; stars.asm
 
SetupStars              proc
                        ld a, DimWhiteBlackP            ; Custom colour for ClsAttr.
                        call ClsAttr.WithCustomColour   ; Alternate entry point without setting colour.
RandomLoop:
                        call ClsPixels                  ; Clear all the pixels.
Seed equ $+1:           ld hl, $0000                    ; Starting point of ROM "pseudo-random" table (2 bytes per star). Try $03F3!
                        zeusdatabreakpoint 1, "zeusprinthex(1, hl)", $ ; Log current value of ROM table to debug window.
                        ld a, NumberOfStars             ; Loop through this many times, once for each star.
DrawLoop:
                        ex af, af'                      ; Save number of stars (loop counter).
                        ld e, (hl)                      ; Read 2 bytes,
                        inc hl                          ;   for this star,
                        ld a, (hl)                      ;   into de.
                        and %00000111                   ; Constrain de between 0..2047
                        ld d, a                         ;   (size of top screen third).
                        ld a, e                         ; Mask out a bit number (0..7) from the X coordinate,
                        or %11000111                    ;      turn it into a "set n, a" instruction ($CB 11nnn111),
                        ld (SetBit), a                  ; SMC> then write this into the pixel-drawing code.
                        ex de, hl                       ; Swap the reading addr into de, and the writing addr into hl.
                        ld b, high(PixelAddress)        ; c stays 0, so this is a fast way of doing ld bc, $4000.
                        add hl, bc                      ; Calculate a pixel address in the top screen third.
                        xor a                           ; Draw a single pixel star,
SetBit equ $+1:         set SMC, a                      ; <SMC  by setting that pseudo-random
                        ld (hl), a                      ;       bit (0..7) from earlier.
                        ld b, 8                         ; Fast way of doing ld bc, $0800 (size of a screen third).
                        add hl, bc                      ; hl is now a pixel address in the middle screen third.
                        ld (hl), a                      ; Draw the same single pixel star here, too.
                        add hl, bc                      ; hl is now a pixel address in the bottom screen third.
                        ld (hl), a                      ; Draw the same single pixel star here, too.
                        ex de, hl                       ; Swap the writing addr back into de, and reading addr into hl.
                        ex af, af'                      ; Retrieve the number of stars (loop counter),
                        dec a                           ;   decrease it,
                        jp nz, DrawLoop                 ;   and do all over again if there are any stars left.
 
                        call WaitForAnyKeyPress         ; Spin until any key is pressed
 
                        ld hl, (Seed)                   ; Increase starting point of ROM "pseudo-random" table
                        inc hl                          ;      by one byte,
                        ld (Seed), hl                   ; SMC> save it into the routine,
                        jp RandomLoop                   ;      and rerun the routine again.
 
NumberOfStars           equ 32                          ; Constant declared locally to keep it handy.
pend

Once again, apologies for the syntax highlighter I’m using. My pair of ex af, af' opcodes apparently turned a whole bunch of the midsection into one long string. Fortunately the Z80 is not fooled by such shenanigans…

Anyway, this rather dense chunk of code clears the screen at the beginning, then on line 10 sets the ROM address seed value ($0000 initially).

Line 11 is a sweet Zeus feature that can break, or print values, based on expressions you write. Here, I’m writing an expression in slot 1, and setting it to be active on line $—the next opcode, which in this case is ld a, NumberOfStars. It doesn’t matter too much where you set it—in this case, the important thing is that it’s after hl has been set, but before it changes again.

The expression itself is zeusprinthex(1, hl). The first 1 means “always evaluate the expression”. Everything else inside the brackets is a comma-separated list of things to print (in this case, just hl). Because I used zeusprinthex, the arguments are printed as hex values; as you might expect, zeusprint(1, hl) would print the decimal value of hl.

Referring back to the video, this is indeed what happens.

The next section (lines 12-19) sets up a loop counter, stores it away in the a' alternative register, then reads a couple of bytes from the ROM. These bytes, taken together, are effectively a random number between 0 and 65535. The size of our screen thirds is 2KB, or 2048 bytes. It turns out we can zeroise the leftmost five bits of the high byte, to turn it into a random number between 0 and 2047. Once again, powers of two are the bomb!

10
11
12
13
14
15
16
17
18
19
; stars.asm
 
                        ld a, NumberOfStars             ; Loop through this many times, once for each star.
DrawLoop:
                        ex af, af'                      ; Save number of stars (loop counter).
                        ld e, (hl)                      ; Read 2 bytes,
                        inc hl                          ;   for this star,
                        ld a, (hl)                      ;   into de.
                        and %00000111                   ; Constrain de between 0..2047
                        ld d, a                         ;   (size of top screen third).
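The masking step is easy to check exhaustively in Python. This is only a sketch of the arithmetic, with third_offset as a made-up name. Keeping the bottom three bits of the high byte leaves 11 bits in total, and 2 to the power 11 is 2048.

```python
def third_offset(low, high):
    """Combine two table bytes into an offset within one screen third."""
    return ((high & 0b00000111) << 8) | low

offsets = [third_offset(low, high) for low in range(256) for high in range(256)]
assert min(offsets) == 0 and max(offsets) == 2047

# Adding the display file base gives an address in the top third.
assert 0x4000 + third_offset(0xFF, 0xFF) == 0x47FF
```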

The following section (lines 20-22) grabs three of the bits from the part of the screen address that governs the X coordinate (meaning it won’t change as it vertically scrolls), and picks one of the 8 pixels within the pixel address to plonk a star into.

18
19
20
21
22
; stars.asm
 
                        ld a, e                         ; Mask out a bit number (0..7) from the X coordinate,
                        or %11000111                    ;      turn it into a "set n, a" instruction ($CB 11nnn111),
                        ld (SetBit), a                  ; SMC> then write this into the pixel-drawing code.

This is really fancy stuff. The Z80 processor is wired up internally in a very logical and consistent way, which means there are many relationships between similar opcodes. The opcodes for setting a bit all follow this rule, or formula: SET b, r is a two-byte instruction, the first of which is always $CB. The second is $C0+(8*b)+r, where b is a bit number between 0 and 7, and r is a register chosen from one of these values:

Register    Value
A           7
B           0
C           1
D           2
E           3
H           4
L           5

We’re using register a, which is %00000111, to set the star pixel. $C0 also happens to be %11000000. Adding these two together gives us a mask of %11000111. We take the three bits we chopped out of the X coordinate, which are in the exact position to fill the three zero bits in our mask, then or them together in line 21. The result is an opcode that’s always one of the following:

set 0, a
set 1, a
set 2, a
set 3, a
set 4, a
set 5, a
set 6, a
set 7, a

Having done that, line 22 writes it into the correct place in the program (line 27), ready to be run shortly. This is similar to how an assembler works, but we used the technique in our own program.
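Here is the same formula as a Python sketch, in case you want to play with it away from the assembler. The REGISTER table mirrors the one above. The function names are invented for the example, and it only models the byte that gets patched in, not the self-modifying write itself.

```python
REGISTER = {"b": 0, "c": 1, "d": 2, "e": 3, "h": 4, "l": 5, "a": 7}

def encode_set(bit, reg):
    """Return the two bytes of `set bit, reg` using $CB, then $C0+(8*bit)+r."""
    return bytes([0xCB, 0xC0 + 8 * bit + REGISTER[reg]])

def patched_opcode(e):
    """What the routine stores at SetBit: the X byte or'd with %11000111."""
    return e | 0b11000111

# Whatever byte comes in, the result is one of the eight `set n, a` opcodes.
valid = {encode_set(bit, "a")[1] for bit in range(8)}
assert all(patched_opcode(e) in valid for e in range(256))
```

Because the or forces every bit outside the three-bit field to 1, there is no input byte that can produce anything other than a set n, a.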

The code between lines 23 and 35 calculates a pixel byte based on our X and Y coordinates, executes our hand-assembled set instruction, and writes the star byte to the display file. It then does it two more times to the other two screen thirds, taking advantage of the fact that the screen thirds are all separated by exactly $0800 bytes (2048 in decimal).

The rest of it is just boring glue code that waits for a key to be pressed, moves to the next ROM seed value, clears the screen again, and reruns the whole process.

In the next tutorial I’ll explore ways to vertically scroll the stars. Cheers!

Zeus Data Breakpoints — Part 1

One of the nice things about the Zeus Z80 cross-assembler is the debugger built into its integrated emulator.

OMOIDE, one of the ZX Spectrum games I’m working on, has a huge conceit: the entire game is presented as if it’s an obscure Japanese game that never got translated into English. The text is written in English, in katakana script, as the Japanese do for foreign loanwords, and often also for videogames.

I softened slightly on the menu items, as it’s all too easy for players to accidentally select a joystick option that renders them a) unable to play the game, if not just b) confused. I added a little marquee at the top that discreetly cycles through the menu options in English.

The text for option 0 is supposed to say 0: PLAY in a tiny 3×5 pixel font (actually one of the Robotron 2084 fonts), but sadly it doesn’t. It’s something more akin to N. Qi Nw, which is no good to anybody, apart from possibly a Klingon.

I checked all the obvious things, and couldn’t for the life of me figure out why this item (and only this item) gets corrupted. It happens consistently, even if you assign that text to a different option—the problem moves with the text, not with the menu slot. It’s semi-legible, like the lines got shuffled around, which makes it worse than random garbage, as there’s obviously some logic to it, albeit wrong logic.

Let’s look at the data that’s copied onto the screen. This is a multicolour NIRVANA+ menu, which means there’s only about 15,000 T-states available per frame to do everything, instead of the usual 70,000-odd. For speed of reading, writing and address calculation, I’m storing the data in the same format the screen does. Which, if you’ve ever watched a loading screen appear line by line, you’ll know is not laid out in a linear coordinate-based fashion.

I write the text in a standard .SCR file, and load it into memory at a convenient place. It turns out I don’t need the whole file, only about 3/5ths of the first third of the pixels, up until the green dots. And none of the attributes: they’re just there to make it easier to work with in my graphics app. By checking in a hex editor, I can see I only need the first 1139 bytes. That’s still a bit wasteful, but I have the space and I need every T-state.

This data is referenced in a table, where zxpixeladdr() is a Zeus helper function that converts pixels into addresses—zxpixeladdr(0, 0) would emit $4000, etc. Only the low byte of each address needs to be stored, because the high byte is the same for all the entries—halving the size of the table.

align 256
MenuExplanation proc Table:
 
  ;                                   Low   Index   Function
  db MenuText.Offset+zxpixeladdr(  0,  0)   ;   0   0: Play
  db MenuText.Offset+zxpixeladdr( 72,  0)   ;   1   1: Keyboard
  db MenuText.Offset+zxpixeladdr(144,  0)   ;   2   2: Kempston
  db MenuText.Offset+zxpixeladdr(  0,  8)   ;   3   2: Sinclair
  db MenuText.Offset+zxpixeladdr( 72,  8)   ;   4   2: Cursor
  db MenuText.Offset+zxpixeladdr(144,  8)   ;   5   2: Fuller
  db MenuText.Offset+zxpixeladdr(  0, 16)   ;   6   2: Kempston Mouse
  db MenuText.Offset+zxpixeladdr( 72, 16)   ;   7   2: AMX Mouse
  db MenuText.Offset+zxpixeladdr(144, 16)   ;   8   3: Help
  db MenuText.Offset+zxpixeladdr(  0, 24)   ;   9   4: High Scores
  db MenuText.Offset+zxpixeladdr( 72, 24)   ;  10   5: Credits
 
  struct
    Low         ds 1
  Size send
 
  Len           equ $-Table
  Count         equ Len/Size
  High          equ high( MenuText.Offset+zxpixeladdr(0, 0))
  Joystick      equ 2
  Items         equ 6
 
pend

The code that reads and prints the data looks like this. PageBank() is a macro I use to switch the 128K upper RAM bank. As you can see, it uses ldir and lddr to do a zigzag block copy of the data: the first, third and fifth rows from left to right, and the second and fourth rows from right to left.

PrintMenuExplanation    proc                            ; MenuExplanation.Index is passed in L
                        PageBank(0, true)
                        ld h, high(MenuExplanation)
                        ld l, (hl)
                        ld h, MenuExplanation.High
                        ld de, zxpixeladdr(80, 8)
                        ld bc, 9
                        ldir
                        inc h
                        inc d
                        dec l
                        dec e
                        ld bc, 9
                        lddr
                        inc h
                        inc d
                        inc l
                        inc e
                        ld bc, 9
                        ldir
                        inc h
                        inc d
                        dec l
                        dec e
                        ld bc, 9
                        lddr
                        inc h
                        inc d
                        inc l
                        inc e
                        ld bc, 9
                        ldir
                        ret
pend

As I always do when I hit a brick wall, I reached for the Zeus debugger and its zeusdatabreakpoint feature, inserting them into the code like this:

PrintMenuExplanation    proc                            ; MenuExplanation.Index is passed in L
                        PageBank(0, true)
 
                        ld a, l                         ; Save L to print in the breakpoints
                                                        ; (not needed in the final code)
                        ld h, high(MenuExplanation)
                        ld l, (hl)
                        ld h, MenuExplanation.High
                        ld de, zxpixeladdr(80, 8)
                        ld bc, 9
                        zeusdatabreakpoint 2, "zeusprinthex(1, a, hl, de, bc)", $
                        ldir
                        inc h
                        inc d
                        dec l
                        dec e
                        ld bc, 9
                        zeusdatabreakpoint 2, $
                        lddr
                        inc h
                        inc d
                        inc l
                        inc e
                        ld bc, 9
                        zeusdatabreakpoint 2, $
                        ldir
                        inc h
                        inc d
                        dec l
                        dec e
                        ld bc, 9
                        zeusdatabreakpoint 2, $
                        lddr
                        inc h
                        inc d
                        inc l
                        inc e
                        ld bc, 9
                        zeusdatabreakpoint 2, $
                        ldir
                        ret
pend

There are nine general purpose slots that you can write data-driven breakpoint expressions in—plus slots to break on expressions involving data reads, writes, IO port reads and writes, and RAM page changes. Expressions can be set in the UI or in code—the latter allowing them to persist across multiple debugging sessions.

The expressions can be extremely complicated, and can read and change memory if you need to. Here I’m keeping it simple. zeusprinthex(1, a, hl, de, bc) means always (1) print these expressions (the values of a, hl, de and bc) to the debug output window, whenever the emulator’s PC is pointing at these addresses (the five values of $, which equates to the five lines following each expression). I’m using the same slot (2) for all of them, as the expression is the same.

Running through the first three menu items, it looks like this:

I put a general non-data breakpoint in at the start of the routine too, purely because it separates out the debug output nicely.

Immediately you can see the pattern is wrong. For the second and third menu items, hl (the source address) increases by $100 for each of the five lines. But for the first item it doesn’t! It’s an edge case: the lines that go wrong start on a 256-byte boundary (i.e. have an $NN00 address). After lddr runs back past the boundary, the 8-bit inc l wraps l from $FF to $00 without carrying into h, so hl ends up $100 short.

0000 D100 402A 0009
0000 D208 4132 0009
0000 D200 422A 0009
0000 D308 4332 0009
0000 D300 442A 0009
 
0001 D109 402A 0009
0001 D211 4132 0009
0001 D309 422A 0009
0001 D411 4332 0009
0001 D509 442A 0009
 
0002 D112 402A 0009
0002 D21A 4132 0009
0002 D312 422A 0009
0002 D41A 4332 0009
0002 D512 442A 0009
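The hl column of that output can be reproduced with a small Python model of the register arithmetic. This is a sketch, not an emulator: it only tracks hl, it treats ldir and lddr as moving hl by the byte count, and zigzag_sources is a name invented for the example.

```python
def zigzag_sources(hl, rows=5, width=9):
    """Return the value of hl before each ldir/lddr in the zigzag copy."""
    seen = []
    for row in range(rows):
        seen.append(hl)
        forwards = row % 2 == 0
        hl = (hl + width if forwards else hl - width) & 0xFFFF  # ldir / lddr
        if row == rows - 1:
            break
        h, l = hl >> 8, hl & 0xFF
        h = (h + 1) & 0xFF                                      # inc h
        l = (l - 1 if forwards else l + 1) & 0xFF               # dec l / inc l
        hl = (h << 8) | l
    return seen

# Starting on a 256-byte boundary reproduces the broken sequence...
assert zigzag_sources(0xD100) == [0xD100, 0xD208, 0xD200, 0xD308, 0xD300]
# ...while a start address further along steps up by $100 per row pair.
assert zigzag_sources(0xD109) == [0xD109, 0xD211, 0xD309, 0xD411, 0xD509]
# One byte along from the boundary, the backwards copy stops at $NN00 and never crosses it.
assert zigzag_sources(0xD101) == [0xD101, 0xD209, 0xD301, 0xD409, 0xD501]
```

The $D101 case is an illustrative address for data shifted one byte along from the boundary.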

I could fix this in the code with special handling, but it’s much easier to shift everything along one byte—which equates to 8 pixels to the right—in the .SCR file! After all, I have the space 🙂

  ;                                   Low   Index   Function
  db MenuText.Offset+zxpixeladdr(  8,  0)   ;   0   0: Play
  db MenuText.Offset+zxpixeladdr( 80,  0)   ;   1   1: Keyboard
  db MenuText.Offset+zxpixeladdr(152,  0)   ;   2   2: Kempston
  db MenuText.Offset+zxpixeladdr(  8,  8)   ;   3   2: Sinclair
  db MenuText.Offset+zxpixeladdr( 80,  8)   ;   4   2: Cursor
  db MenuText.Offset+zxpixeladdr(152,  8)   ;   5   2: Fuller
  db MenuText.Offset+zxpixeladdr(  8, 16)   ;   6   2: Kempston Mouse
  db MenuText.Offset+zxpixeladdr( 80, 16)   ;   7   2: AMX Mouse
  db MenuText.Offset+zxpixeladdr(152, 16)   ;   8   3: Help
  db MenuText.Offset+zxpixeladdr(  8, 24)   ;   9   4: High Scores
  db MenuText.Offset+zxpixeladdr( 80, 24)   ;  10   5: Credits

Bingo, bug fixed! This took about a minute and a half from starting to write the databreakpoint expressions to testing the fix, which is waaay shorter than it took to write up for the blog! Sure, I could have stepped through the code in any emulator, and figured out the same thing, but this approach scales up very well when the problem is more complicated, particularly when it is spread out over multiple routines across a longer timeframe.