Learning Assembly Language: Moving Data

One of the main things you can do in assembly language is move data from one place to another. Different processors take different approaches to this. Some use a load/store model, where reading from memory and writing to memory are completely separate operations. Others use a move model, where a single instruction copies data from a source to a destination.

Before we delve in, let’s get some vocabulary:

  • register – a chunk of super-fast memory built into the CPU. Most of the time, registers don’t have addresses and don’t correspond to actual memory; they are referred to by name.
  • memory – an area where data can be read or written. Memory is typically arranged in a sequence with a numerical address that refers to a location in memory.

I’m going to start with the load/store model. In this model, when you want to read from memory, you load a value into a register.

LOAD A, $300

In a hypothetical computer, this instruction reads from memory location $300 and copies the value into register A. When you are reading from memory this way, it’s usually called “absolute” addressing because $300 is the actual address with no other changes. In the 6502, one of my favorite processors, this gets written a little more compactly:

LDA $0300

When this code gets assembled, it turns into this data:

AD 00 03

Here’s what happens inside the CPU:

The CPU reads AD and knows that the next two bytes are an absolute address. It reads 00 and 03 and combines them into $0300. It then sends that value out to memory along with a signal that says “READ”. A little while later, it picks up the value that comes back and copies it into A.

When you want to copy a value from a register to memory, you use a store instruction like this:

STORE A, $301

This copies the value that’s in the A register and writes it into memory at address $301. In the 6502, this gets written more compactly like this:

STA $0301

When this code gets assembled, it turns into this data:

8D 01 03

And it operates identically to LDA except that in addition to the address, it also sends out the value of A and a signal to write it into memory.

This is a very simple example. Absolute addressing is the second simplest way of reading from or writing to memory. The simplest way is called immediate. In immediate addressing, the value to be read is part of the instruction itself. For example, in 6502, if you had this:

LDA #$3F

It gets assembled as:

A9 3F

The CPU sees the A9 and knows that the next byte is the value itself, so it reads that byte and copies it into register A.
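To make the three encodings above concrete, here is a minimal Python sketch of how a CPU might decode them. It is not a real 6502 emulator: it handles only the opcodes A9, AD and 8D shown so far, ignores status flags and timing, and the names step, mem, pc and a are made up for this illustration.

```python
def step(mem, pc, a):
    """Execute the single instruction at pc and return (new_pc, new_a).

    mem is a bytearray standing in for memory, pc is the address of the
    instruction, and a is the current value of register A (all hypothetical).
    """
    opcode = mem[pc]
    if opcode == 0xA9:                      # LDA #imm: the value is the next byte
        return pc + 2, mem[pc + 1]
    if opcode in (0xAD, 0x8D):              # absolute: next two bytes are an address
        addr = mem[pc + 1] | (mem[pc + 2] << 8)   # low byte first, then high byte
        if opcode == 0xAD:                  # LDA abs: read memory into A
            return pc + 3, mem[addr]
        mem[addr] = a                       # STA abs: write A into memory
        return pc + 3, a
    raise ValueError(f"unhandled opcode {opcode:#04x}")


# Assemble "LDA #$3F" followed by "STA $0301" at an arbitrary address ($0600).
mem = bytearray(0x10000)
mem[0x0600:0x0605] = bytes([0xA9, 0x3F, 0x8D, 0x01, 0x03])

pc, a = 0x0600, 0
pc, a = step(mem, pc, a)   # LDA #$3F
pc, a = step(mem, pc, a)   # STA $0301
```

After the two steps, register A and memory location $0301 both hold $3F, and pc points just past the five bytes of code.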

The other ways of reading from memory vary from CPU to CPU, but most of them are similar enough. Here is a list of some of them:

Indexed: LDA $300,X – in this case the address $300 is taken and then the value of register X gets added to it, then memory gets read. This is one way that you can do typical array indexing (hence the name).

Indirect: LDA ($300) – memory is read at location $300 and the value that’s there gets used as the final address to read from. If you’re thinking that this feels an awful lot like C pointer dereferencing, you’re right.

Indirect indexed: LDA ($300),Y – memory is read at location $300 and the value that’s there then gets added to Y and gets used as the final address to read from. This can be used as another form of array indexing.

Indexed indirect: LDA ($300, X) – add X to $300 and the value that’s there gets used as the final address to read from. This could be used to read from an array of pointers.
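The four descriptions above boil down to different ways of computing the final address. The Python sketch below follows the wording of the text rather than the exact rules of any particular CPU. It assumes pointers are stored as two bytes, low byte first, and ignores any wrap-around. The function and mode names are hypothetical.

```python
def read_pointer(mem, addr):
    """Read a two-byte address stored low byte first (an assumption)."""
    return mem[addr] | (mem[addr + 1] << 8)


def effective_address(mode, mem, base, x=0, y=0):
    """Return the address that a load with the given mode would finally read."""
    if mode == "indexed":            # LDA $300,X
        return base + x
    if mode == "indirect":           # LDA ($300)
        return read_pointer(mem, base)
    if mode == "indirect_indexed":   # LDA ($300),Y: follow the pointer, then add Y
        return read_pointer(mem, base) + y
    if mode == "indexed_indirect":   # LDA ($300,X): add X, then follow the pointer
        return read_pointer(mem, base + x)
    raise ValueError(f"unknown mode {mode!r}")


# Two pointers in memory: $0300 holds $0210 and $0304 holds $0220.
mem = bytearray(0x400)
mem[0x300:0x302] = bytes([0x10, 0x02])
mem[0x304:0x306] = bytes([0x20, 0x02])

array_element = effective_address("indirect_indexed", mem, 0x300, y=3)   # $0213
pointer_in_table = effective_address("indexed_indirect", mem, 0x300, x=4)  # $0220
```

The only difference between the last two modes is whether the index is added before or after the pointer is followed, which is why one suits an array reached through a pointer and the other suits an array of pointers.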

The remaining variations that you’ll see on various CPUs mostly make it easy to step through memory in various increments. I won’t go into them just yet.

Next let’s look at the move model. There are typically two ways that the move model gets represented: assignment statement or from/to. Which it is depends entirely on the assembler (I’ve seen both of these for the same CPU). The assignment model looks like this:

MOVE destination, source

Whereas the from/to model looks like this:

MOVE source, destination

I’m going to use from/to. So what are source and destination? That depends on the CPU, but typically, it’s either a register or some way of addressing.

In the 680x0 processors, you can copy one register to another using a move instruction:

MOVE.B D0, D1

This moves a byte of data from register D0 to register D1. The move model is very convenient when the source and destination can each use any addressing mode. Not all CPUs allow this, though. For any number of reasons, CPU designers choose to limit the addressing. For example, some might decide that one or both operands must be a register. And it’s these limitations that make writing decent assembly tricky, because what you think of in C:

*p = *(q + x);

Might require a little bouncing around in registers to make happen:

MOVE.L q, A0 ; q goes into register A0
ADDA x, A0 ; add x to q
MOVE.B (A0), D0 ; move *(q + x) into D0
MOVE.L p, A0 ; move p into A0
MOVE.B D0, (A0) ; store the value (finally) into *p

And therein lies one of the common problems with writing assembly: it’s easy to get caught up in the minutiae and lose track of your big picture goal.
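One way to keep the big picture in view is to trace the five-instruction sequence above one step at a time. The Python sketch below does that with a dictionary standing in for the registers A0 and D0 and a bytearray standing in for memory. It assumes p, q and x are already known as plain numbers, and the function name is invented for this illustration.

```python
def assign_via_registers(mem, p, q, x):
    """Perform *p = *(q + x) the long way, one register move at a time."""
    regs = {"A0": 0, "D0": 0}
    regs["A0"] = q                   # MOVE.L q, A0     ; q goes into A0
    regs["A0"] += x                  # ADDA x, A0       ; add x to q
    regs["D0"] = mem[regs["A0"]]     # MOVE.B (A0), D0  ; fetch *(q + x)
    regs["A0"] = p                   # MOVE.L p, A0     ; p goes into A0
    mem[regs["A0"]] = regs["D0"]     # MOVE.B D0, (A0)  ; store into *p
    return regs


mem = bytearray(16)
mem[7] = 0x2A                        # the byte at q + x, with q = 5 and x = 2
assign_via_registers(mem, p=1, q=5, x=2)
```

Afterwards mem[1] holds the same byte as mem[7], which is all the C statement asked for.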

Still, we can see that by using either load/store or move, we can model an assignment statement or a binding in a functional programming language.

Learning Assembly, What Can You Do Really?

When I was in college, one of the CS professors presented a model of the Manchester Baby (it was presented as the Manchester Mark I, but that was a significantly more complicated computer), one of the earliest computers. It had 7 instructions (and honestly, it didn’t need them all).

For the purposes of this, here is some vocabulary for you:

  • register – a small chunk of memory built into the computer.
  • accumulator – a register that is typically used for arithmetic or logic. Simple computers will only have one.
  • memory – short-term storage for data, readily accessible to the processor.
  • address – a number that refers to a location in memory.

LDNEG address – read the value from memory at address and negate it and store it in the accumulator.

STORE address – store the accumulator into memory at address.

SUB address – subtract the value in memory at address from the accumulator and leave the result in the accumulator.

SKNEG – if the result of the last operation was negative, skip the next instruction; otherwise perform the next instruction.

BR address – continue execution at address

BRREL offset – add offset to current address and continue execution there

HALT – stop execution

That’s it. And honestly, everything that a modern computer can do is really a variation on this set in some way. The cool thing is that this computer was Turing complete. You can compute any computable function on it provided that there is enough memory (spoiler: there really wasn’t). The problem is that programs end up being very wordy to do even simple things. For example, let’s say that you wanted to write a program to add three and four. It would look like this:

LDNEG Three
STORE Scratch
LDNEG Four
STORE Scratch1
LDNEG Scratch1
SUB Scratch
STORE Result
HALT
Three: 3
Four: 4
Scratch: 0
Scratch1: 0
Result: 0

This reads 3 from a memory location, which becomes -3 because LDNEG negates it, and puts it into Scratch. Then it reads 4 from a memory location, which becomes -4, and puts it into Scratch1. It then re-loads Scratch1, which negates it back to 4. Finally it subtracts Scratch: 4 – -3 = 7, which we store in Result before halting. Ta-da!
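The steps just described can be checked with a small interpreter for the seven instructions. This is a rough Python sketch, not a model of the real machine: memory cells are looked up by name instead of by number, SKNEG is assumed to test the accumulator, BRREL is assumed to be relative to the instruction after it, and the run function and its step limit are inventions for this illustration.

```python
def run(program, data, max_steps=1000):
    """Run a list of (opcode, operand) pairs; return the final memory cells."""
    mem = dict(data)      # named memory cells (a simplification of addresses)
    acc = 0               # the single accumulator
    pc = 0                # index of the next instruction
    for _ in range(max_steps):
        op, arg = program[pc]
        pc += 1
        if op == "LDNEG":
            acc = -mem[arg]
        elif op == "STORE":
            mem[arg] = acc
        elif op == "SUB":
            acc -= mem[arg]
        elif op == "SKNEG":
            if acc < 0:   # assumption: "last result" means the accumulator
                pc += 1
        elif op == "BR":
            pc = arg
        elif op == "BRREL":
            pc += arg     # assumption: offset counts from the next instruction
        elif op == "HALT":
            return mem
        else:
            raise ValueError(f"unknown instruction {op!r}")
    raise RuntimeError("step limit reached without HALT")


add_three_and_four = [
    ("LDNEG", "Three"),     # acc = -3
    ("STORE", "Scratch"),   # Scratch = -3
    ("LDNEG", "Four"),      # acc = -4
    ("STORE", "Scratch1"),  # Scratch1 = -4
    ("LDNEG", "Scratch1"),  # acc = 4
    ("SUB", "Scratch"),     # acc = 4 - (-3) = 7
    ("STORE", "Result"),    # Result = 7
    ("HALT", None),
]
cells = {"Three": 3, "Four": 4, "Scratch": 0, "Scratch1": 0, "Result": 0}
final = run(add_three_and_four, cells)
```

Eight instructions and five data cells to add two numbers: the chattiness is easy to see once it is written out.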

And you immediately see the problem with this assembly language: it’s very chatty. It takes 2-3x the work of most other assembly languages. It does, however, break down the capabilities of a typical computer into these categories:

  • Move values from/to memory and/or registers
  • Perform (limited) arithmetic/logic
  • Change flow conditionally or unconditionally
  • Other

The last one is where HALT falls. In addition, many processors include an instruction called NOP or NOOP which stands for No Operation. It’s an instruction that does nothing except consume time. Some processors include dedicated instructions for input and output which could go into either “move values” or “other”.

And while that seems like a depressingly small number of things, that’s really it.

In the class where this was presented, the professor offered a prize of $1 for the student who could come up with the shortest (correct) program (including data) to calculate factorial. For a computer that didn’t have addition, let alone multiplication, this was a pain, but straightforward. And quite honestly, this is where most of assembly language programming falls: it can be a pain, but it’s typically straightforward.

Learning Assembly Language, Introduction

One of my co-workers has a goal to learn assembly language. I thought I would write up my experiences with it.

My co-worker asked me how I learned. I started with 6502 on the Apple ][. I knew BASIC and I started learning 6502 when I was home sick from school. I sat down and wrote a program to print “HI ” in an infinite loop. It was a 3 instruction program. When I ran it, it was so incredibly fast compared to BASIC.

After that, I started writing programs with the Apple manual on my lap. Most of what I did was treat the Apple ROM routines as my API and I coded to that. Later, when I got more advanced, I wanted to do things that I saw in games, but I didn’t know how, so I disassembled the games and read their code. In addition to writing my own code, I got pretty good at reading other people’s assembly, which is something I still have to do.

As I wrote code, I built a model in my head of how it worked. It turned out to be wrong in some details, but for the most part it was close enough.

Do I still write assembly? From time to time. I have to read it frequently in my current job. For me, it’s been an indispensable tool to use when I can’t otherwise explain a particular behavior. There have also been cases where I have worked with a licensed library and found bugs in their code that I couldn’t work around. In addition to reporting the bug I could also tell them where the bug was. Finally, understanding how typical functions get implemented has been useful when I needed to write code that was especially performant without having to write assembly.

I’m Old, Part LXXXIII: Liberating Restriction

While working on Acrobat 2.0, I remember having a conversation with Gordon Dow about Lego kits. He said that he preferred the simpler bricks as he enjoyed the liberating restriction of having to work with all rectangular pieces. I love that phrase because of its inherent contradiction and because it is apt to many things.

For example, I took a CS class at Oberlin in Automata Theory taught by Rhys Price Jones. Rhys covered a generalized state machine and gave us an assignment to implement code to generate the state machine code from a text description of the state machine. He wanted us to work in Scheme and narrowly defined the scope of the assignment: each state must itself be a process, where a process was really just a lambda expression that handled the input and picked another lambda expression to continue on.

At that point, Scheme was seriously on my shit list for languages and I really didn’t want to implement this assignment in Scheme. I pressed to do it in C, arguing that each state could be a function since, really, each of those lambda expressions was a function too. Rhys said no, they had to be processes.


I still did the program in C. The top level C program read the machine description and for each state transition, wrote another C program. The main() of that program read a character from stdin, did a switch on it and forked a new process running the next program, piping in stdin and stdout. After all the state transition programs were written, the top level program compiled each in turn.

It took some juggling and debugging, but I made it work. And the whole process was fun for me. I speculate probably more fun than the rest of the class who were dutifully working in Scheme. I’m not saying that they weren’t enjoying the assignment, just that they weren’t giggling at the absurdity of the process.

And at this point in my career, I’m writing code that does static analysis on other code and writes more code in two different languages that can call across to each other, so I guess that was a nice little bit of prep for that.