Archive.org has been working to collect old software and emulators to digitally preserve our early roots in computing. In particular, they’ve been working to preserve Apple II software, some of which had very creative DRM. Some of that DRM depended on unformatted data on the disk, so preserving it required a way to model that data. This is the WOZ file format. They’ve recently been posting preserved disks under the title “woz-a-day” and yesterday was apparently my day!
I donated a disk of mine to the archive which has the original version that I tried to sell as a game. The story of how it came to be is in the description. Today, what I’m going to do is take apart one bit blitter, a routine to move pixels around the screen, and show you how it works. I wrote this blitter when I was 14, so this is going to be a good solid dose of personal WTF.
Here’s the code as it exists on that original disk (I’ll take it apart piece by piece after).
        CLC
        LDA #$11
        ADC $300
        STA $302
        LDY #$00
        STY $304
        LDX $300
L1      LDA $6000,X
        STA $01
        LDA $6100,X
        STA $00
        LDY $304
        LDA SHAPE,Y
        STY $304
        LDY $301
        STA ($00),Y
        LDY $304
        INY
        STY $304
        INX
        CPX $302
        BNE L1
        RTS
SHAPE   .HS 0063141C3E7F6B6B7F5D633E1414367722
What does this do? From a high level, it takes coordinates from locations $300 and $301 and uses those to draw a shape on the screen. $300 is the Y coordinate and runs from 0-181. $301 is the X coordinate and runs from 0-39. The actual X value on the screen is X * 7 because this routine draws on byte boundaries and each byte holds 7 pixels.
Here’s my initial reaction to this: holy crap, what was I thinking? Yes, it will work but there is no clipping at the bottom and this routine is wasteful as hell. It’s a crash waiting to happen.
OK – let’s take it apart piece by piece:
        CLC
        LDA #$11
        ADC $300
        STA $302
        LDY #$00
        STY $304
The shape is 17 pixels high (that’s the $11). So what I’m doing is adding $11 to the Y start coordinate and storing it in $302. That CLC clears the carry flag because addition on the 6502 always includes the carry. If you don’t clear it explicitly and it happens to be set, you get an off-by-one error. What I’m doing here is setting an end condition for when to stop drawing. This routine will always draw $11 lines. $304 is the index into the table of data for the shape.
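To make the carry behavior concrete, here is a small Python model of ADC. It is not from the original disk: the `adc` helper and the sample start coordinate are made up for illustration, and it only covers binary (non-decimal) mode.

```python
def adc(a, operand, carry):
    """Model 6502 ADC in binary mode: A + operand + carry, kept to 8 bits.

    Returns (result, carry_out).
    """
    total = a + operand + carry
    return total & 0xFF, 1 if total > 0xFF else 0


SHAPE_HEIGHT = 0x11   # 17 rows, the LDA #$11 in the listing
y_start = 100         # hypothetical value sitting in $300

end_with_clc, _ = adc(SHAPE_HEIGHT, y_start, carry=0)   # CLC ran first
end_stale, _ = adc(SHAPE_HEIGHT, y_start, carry=1)      # carry happened to be set

# Number of scanlines the loop would cover in each case.
print(end_with_clc - y_start)
print(end_stale - y_start)
```

With the carry left set, the end coordinate lands one row too far, which is the off-by-one the CLC guards against.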
        LDX $300
L1      LDA $6000,X
        STA $01
        LDA $6100,X
        STA $00
This is setting up a pointer for the destination on the screen. I initialize the X register to the initial Y coordinate, then look up the address of the start of that scanline in two tables at $6000 and $6100. These get stored in locations $01 and $00 (the high and low bytes of the pointer, respectively), which are in the “zero page” (the bottom 256 bytes), which is special on the 6502 for a couple of reasons. The one I’m taking advantage of here is that if you want to dereference a pointer without self-modifying code, the pointer has to live in zero page memory.
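The post doesn’t show what is in the tables at $6000 and $6100, so the following Python sketch is a guess at how such tables could be built. It assumes the routine draws to hi-res page 1 at $2000 and uses the standard interleaved hi-res scanline layout; `scanline_address` and `pointer_for` are illustrative names, not anything from the disk.

```python
HIRES_BASE = 0x2000  # assumption: hi-res page 1


def scanline_address(y):
    """Address of the first byte of hi-res scanline y (0-191)."""
    return (HIRES_BASE
            + (y & 7) * 0x400          # row within a group of 8
            + ((y >> 3) & 7) * 0x80    # group within a third of the screen
            + (y >> 6) * 0x28)         # which third of the screen


# Two 192-entry tables, split the way the routine reads them.
HIGH_BYTES = [scanline_address(y) >> 8 for y in range(192)]    # like $6000,X
LOW_BYTES = [scanline_address(y) & 0xFF for y in range(192)]   # like $6100,X


def pointer_for(y):
    """Rebuild the 16-bit pointer the way the zero page holds it."""
    zero_page = {0x00: LOW_BYTES[y], 0x01: HIGH_BYTES[y]}
    return zero_page[0x00] | (zero_page[0x01] << 8)


for y in (0, 1, 8, 64, 191):
    print(y, hex(pointer_for(y)))
```

Because consecutive scanlines are not consecutive in memory, a table lookup per row is much cheaper than computing the address on a 6502.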
        LDY $304
        LDA SHAPE,Y
        STY $304
        LDY $301
        STA ($00),Y
I’m doing a lot of register juggling here. The current shape index is in $304. We read a byte of shape, reload the X coordinate from $301 (a byte column, so the pixel position is that value times 7), then write the byte to the screen. Remember that $00 and $01 point to the first byte of the scanline. The STA ($00),Y instruction takes that address, adds Y to it, then stores A at the resulting location.
        LDY $304
        INY
        STY $304
This reloads the shape index into Y, increments it, and stores it back into $304. This is not great code. I should have done the increment right after I read the shape byte, which would have saved the redundant load, or I could have done something more creative (dun-dun-DUNH! foreshadowing!).
        INX
        CPX $302
        BNE L1
        RTS
This adds 1 to X and compares it to the end coordinate we stored in $302 (the starting Y plus $11). If it’s not equal, we go back and repeat.
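Putting the pieces together, here is a rough Python model of the whole routine. It is a sketch, not a cycle-accurate emulation: it assumes hi-res page 1 with the standard interleaved layout in place of the real tables at $6000/$6100, and the `blit` function and its arguments are names invented for this example.

```python
SHAPE = bytes.fromhex("0063141C3E7F6B6B7F5D633E1414367722")


def scanline_address(y):
    # Assumption: hi-res page 1, standard interleaved scanline layout.
    return 0x2000 + (y & 7) * 0x400 + ((y >> 3) & 7) * 0x80 + (y >> 6) * 0x28


def blit(memory, y_start, column):
    """Mirror the listing: one shape byte per scanline at a fixed byte column."""
    end = (y_start + len(SHAPE)) & 0xFF    # value stored in $302
    index = 0                               # shape index kept in $304
    x = y_start                             # X register holds the scanline
    while True:
        pointer = scanline_address(x)       # pointer built in $00/$01
        memory[pointer + column] = SHAPE[index]   # STA ($00),Y
        index += 1                          # LDY $304 / INY / STY $304
        x = (x + 1) & 0xFF                  # INX
        if x == end:                        # CPX $302 / BNE L1
            break


memory = bytearray(0x10000)
blit(memory, y_start=10, column=5)
print(sum(1 for b in memory if b), "non-zero bytes written")
```

Like the original, this model does no clipping, so it is only meaningful for start coordinates that keep all 17 rows on screen.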
So how bad is this code? Well, not too bad, but it could be better. The main loop takes 52 cycles per pass, so running all 17 passes to completion takes roughly 884 cycles, or about 0.86 milliseconds on the Apple II’s roughly 1 MHz clock. Let’s see if we can do better.
        LDA #$11
        STA $02
        LDA #SHAPE
        STA L2+1
        LDX $300
        LDY $301
L1      LDA $6000,X
        STA $01
        LDA $6100,X
        STA $00
L2      LDA SHAPE
        STA ($00),Y
        INC L2+1
        INX
        DEC $02
        BNE L1
        RTS
A couple changes. Notably, I’m now using self modifying code:
        LDA #SHAPE
        STA L2+1
This takes the low byte of the address of the shape and stores it into the operand of the LDA SHAPE instruction. This has the effect of restoring the code to what it was when it was assembled. Rather than doing register juggling, this code does an INC L2+1, which changes the instruction to look at the next byte. As long as the shape is less than 256 bytes high (it is) and as long as the shape doesn’t cross a page boundary (it doesn’t), this is totally cool. Disgusting, but cool.
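Here is a small Python model of that self-modifying trick, including why the page-boundary condition matters. The `LdaAbsolute` class and the load address $0340 are made up for this sketch; the real address of SHAPE on the disk isn’t given in the listing.

```python
SHAPE = bytes.fromhex("0063141C3E7F6B6B7F5D633E1414367722")
SHAPE_ADDR = 0x0340  # hypothetical load address, chosen for illustration

memory = bytearray(0x10000)
memory[SHAPE_ADDR:SHAPE_ADDR + len(SHAPE)] = SHAPE


class LdaAbsolute:
    """Stand-in for the three-byte 'L2 LDA SHAPE' instruction."""

    def __init__(self, address):
        self.low = address & 0xFF    # the operand byte at L2+1
        self.high = address >> 8     # the operand byte at L2+2

    def execute(self):
        return memory[(self.high << 8) | self.low]

    def inc_low(self):
        # INC L2+1 touches only the low byte; nothing carries into the high byte.
        self.low = (self.low + 1) & 0xFF


l2 = LdaAbsolute(SHAPE_ADDR)        # LDA #SHAPE / STA L2+1 resets the operand
rows = []
for _ in range(len(SHAPE)):
    rows.append(l2.execute())       # L2 LDA SHAPE
    l2.inc_low()                    # INC L2+1

print(bytes(rows) == SHAPE)

# Why the data must stay inside one page: $03FF steps to $0300, not $0400.
wrapped = LdaAbsolute(0x03FF)
wrapped.inc_low()
print(hex((wrapped.high << 8) | wrapped.low))
```

If the shape data straddled a page, the incremented operand would wrap back to the start of the same page and the routine would draw garbage.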
The main loop now takes 39 cycles per pass, or 663 cycles total: about 0.65 ms, or 75% of the time of the previous code. There’s still no clipping, but I would have put that in the setup and changed the value stored at $02 depending on how close the shape is to the bottom of the screen.
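The timing arithmetic and the clipping idea can both be sketched in a few lines of Python. The per-pass cycle counts are the ones quoted in this post; the clock rate is an approximation, and `clipped_rows` is only a guess at what the setup code might compute before storing the row count in $02.

```python
SHAPE_HEIGHT = 17
SCREEN_ROWS = 192        # hi-res screen height in scanlines
CLOCK_HZ = 1_023_000     # approximate Apple II CPU clock (assumption)


def total_cycles(cycles_per_pass, passes=SHAPE_HEIGHT):
    return cycles_per_pass * passes


def milliseconds(cycles):
    return cycles / CLOCK_HZ * 1000


def clipped_rows(y_start):
    """Row count to store in $02 so drawing stops at the last scanline."""
    return max(0, min(SHAPE_HEIGHT, SCREEN_ROWS - y_start))


for label, per_pass in (("original", 52), ("rewrite", 39)):
    cycles = total_cycles(per_pass)
    print(label, cycles, round(milliseconds(cycles), 2))

# A shape in the middle of the screen draws all rows; one near the bottom is cut short.
print(clipped_rows(100), clipped_rows(181))
```

Clipping this way also makes the routine faster near the bottom edge, since fewer passes run.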
Now let’s look at the shape data:
There are 17 bytes of data for the shape. The Apple II displayed 7 bits out of every byte as pixels. The 0th bit was the far left in every group of 7. I lovingly translated that in PhotoShop and made this:
The process of redrawing this from the data took me 5 minutes, I think. When I was 14, I would have drawn this on graph paper and converted it by hand into base 16 using a look up table I had. Later on I wrote a rudimentary paint program that automated this process somewhat.
It’s missing the 0 on the top, which is a one pixel high bar of nothing. So, this shape was being used for drawing a creature that only went top to bottom in one pixel increments, erasing what it left behind as it went. My guess is that this is the creature that shows up every now and again in the attract mode of the game.
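The hand decoding described above (7 pixels per byte, bit 0 on the far left) is easy to redo in Python. This is just a sketch for checking the data; `row_pixels` is a name made up here, and it renders lit pixels as `#` rather than in any Apple II color.

```python
SHAPE = bytes.fromhex("0063141C3E7F6B6B7F5D633E1414367722")


def row_pixels(byte):
    """Render the low 7 bits of a byte, bit 0 leftmost; bit 7 is not drawn."""
    return "".join("#" if (byte >> bit) & 1 else "." for bit in range(7))


for byte in SHAPE:
    print(f"{byte:02X}  {row_pixels(byte)}")
```

The first row comes out blank, which is the one-pixel-high bar of nothing that erases the trail as the shape moves down.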