Learning Assembly Language: Moving Data

One of the primary classes of things you can do in assembly language is move data from one place to another. Different processors have different takes on how this works. For example, some processors use a load/store model where the two operations are completely separate. Other processors use a move model where data gets moved from one place to another.

Before we delve in, let’s get some vocabulary:

  • register – a chunk of super-fast memory built into the CPU. Most of the time, registers don’t have addresses and don’t correspond to actual memory, but are referred to by name.
  • memory – an area where data can be read or written. Memory is typically arranged in a sequence with a numerical address that refers to a location in memory.

I’m going to start with the load/store model. In the load/store model, when you want to read from memory, you load a value into a register.

LOAD A, $300

In a hypothetical computer, this instruction reads from memory location $300 and copies the value into register A. When you are reading from memory this way, it’s usually called “absolute” addressing because $300 is the actual address with no other changes. In the 6502, one of my favorite processors, this gets written a little more compactly:

LDA $0300

When this code gets assembled, it turns into this data:

AD 00 03

Here’s what happens inside the CPU:

The CPU reads AD and knows that the next two bytes are an absolute address. It reads 00 and 03 and combines them into $0300. It then sends that value out to memory along with a signal that says “READ”. A little while later, it picks up the value that comes back and copies it into A.
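The fetch-and-decode steps above can be sketched as a tiny simulation. This is a minimal Python sketch, not a real 6502 emulator: the function name step_lda_absolute, the memory bytearray, the program location $0600, and the sample value $42 are all hypothetical choices made for illustration.

```python
LDA_ABSOLUTE = 0xAD  # 6502 opcode for LDA with an absolute address


def step_lda_absolute(memory, pc):
    """Execute one LDA absolute instruction at pc; return (A, new pc)."""
    opcode = memory[pc]
    if opcode != LDA_ABSOLUTE:
        raise ValueError(f"unexpected opcode {opcode:#04x}")
    low = memory[pc + 1]           # the low byte of the address comes first
    high = memory[pc + 2]          # then the high byte
    address = (high << 8) | low    # 00 and 03 combine into $0300
    a = memory[address]            # stands in for the "READ" request
    return a, pc + 3               # the instruction was three bytes long


memory = bytearray(0x10000)                         # 64 KB, all zeros
memory[0x0600:0x0603] = bytes([0xAD, 0x00, 0x03])   # LDA $0300 (hypothetical location)
memory[0x0300] = 0x42                               # sample value to load

a, pc = step_lda_absolute(memory, 0x0600)
print(f"A = ${a:02X}, PC = ${pc:04X}")
```

Note how the address bytes are stored low byte first, which is why $0300 shows up as 00 03 in the assembled output.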

When you want to copy a value from a register to memory, you use a store instruction like this:

STORE A, $301

This copies the value that’s in the A register and writes it into memory at address $301. In the 6502, this gets written more compactly like this:

STA $0301

When this code gets assembled, it turns into this data:

8D 01 03

And it operates identically to LDA except that in addition to the address, it also sends out the value of A and a signal to write it into memory.

This is a very simple example. Absolute addressing is the second simplest way of reading from or writing to memory. The simplest way is called immediate. In immediate addressing, the value to be read is part of the instruction itself. For example, in 6502, if you had this:

LDA #$3F

It gets assembled as:

A9 3F

The CPU sees the A9 and knows that it needs to read the next byte and copy it into register A.
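To see how immediate addressing differs from absolute addressing, here is a minimal sketch of a decode loop that handles the three opcodes used so far (A9, AD, and 8D). The run function, the program location $0600, and the choice to model only the A register are assumptions for illustration; a real 6502 also updates status flags, which this sketch ignores.

```python
def run(memory, pc, count):
    """Execute `count` instructions starting at pc; return the final A."""
    a = 0
    for _ in range(count):
        opcode = memory[pc]
        if opcode == 0xA9:        # LDA immediate: the operand IS the value
            a = memory[pc + 1]
            pc += 2
        elif opcode == 0xAD:      # LDA absolute: the operand is an address
            address = memory[pc + 1] | (memory[pc + 2] << 8)
            a = memory[address]
            pc += 3
        elif opcode == 0x8D:      # STA absolute: write A to the address
            address = memory[pc + 1] | (memory[pc + 2] << 8)
            memory[address] = a
            pc += 3
        else:
            raise ValueError(f"opcode {opcode:#04x} not modeled in this sketch")
    return a


memory = bytearray(0x10000)
# LDA #$3F followed by STA $0301, placed at a hypothetical address
memory[0x0600:0x0605] = bytes([0xA9, 0x3F, 0x8D, 0x01, 0x03])

final_a = run(memory, 0x0600, 2)
print(f"A = ${final_a:02X}, memory[$0301] = ${memory[0x0301]:02X}")
```

The immediate case never forms an address at all, which is why the instruction is one byte shorter.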

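The other ways of reading from memory vary from CPU to CPU, but most of them are similar enough. Here is a list of some of them, written in 6502-style syntax (the real 6502 is more restrictive about the indirect forms, which take zero-page addresses, so treat $300 here as illustrative):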

Indexed: LDA $300,X – in this case the address $300 is taken and then the value of register X gets added to it, then memory gets read. This is one way that you can do typical array indexing (hence the name).

Indirect: LDA ($300) – memory is read at location $300 and the value that’s there gets used as the final address to read from. If you’re thinking that this feels an awful lot like C pointer dereferencing, you’re right.

Indirect indexed: LDA ($300),Y – memory is read at location $300 and the value that’s there then gets added to Y and gets used as the final address to read from. This can be used as another form of array indexing.

Indexed indirect: LDA ($300,X) – X is added to $300, memory is read at that location, and the value that’s there gets used as the final address to read from. This could be used to read from an array of pointers.
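The four modes above differ only in how the final address is computed, so they can be compared side by side. The following is a rough sketch for a generic CPU with 16-bit little-endian pointers; the effective_address function, the mode names, and the sample pointer values are hypothetical, and it does not model the zero-page restrictions or wraparound quirks of a real 6502.

```python
def read_word(memory, address):
    """Read a 16-bit little-endian pointer stored at address."""
    return memory[address] | (memory[address + 1] << 8)


def effective_address(mode, memory, base, x=0, y=0):
    """Return the final address that a load in the given mode reads from."""
    if mode == "indexed":             # LDA $300,X
        return (base + x) & 0xFFFF
    if mode == "indirect":            # LDA ($300)
        return read_word(memory, base)
    if mode == "indirect_indexed":    # LDA ($300),Y: fetch pointer, then add Y
        return (read_word(memory, base) + y) & 0xFFFF
    if mode == "indexed_indirect":    # LDA ($300,X): add X, then fetch pointer
        return read_word(memory, (base + x) & 0xFFFF)
    raise ValueError(f"unknown mode: {mode}")


memory = bytearray(0x10000)
memory[0x0300:0x0302] = bytes([0x00, 0x40])   # pointer $4000 stored at $0300
memory[0x0304:0x0306] = bytes([0x00, 0x50])   # pointer $5000 stored at $0304

for mode in ("indexed", "indirect", "indirect_indexed", "indexed_indirect"):
    address = effective_address(mode, memory, 0x0300, x=4, y=2)
    print(f"{mode:17s} -> ${address:04X}")
```

The only difference between the last two modes is whether the index is added before or after the pointer is fetched.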

The rest of the variations that you’ll see are things that make it easy to skip across memory in various increments. I won’t go into them just yet.

Next let’s look at the move model. There are typically two ways that the move model gets represented: assignment statement or from/to. Which it is depends entirely on the assembler (I’ve seen both of these for the same CPU). The assignment model looks like this:

MOVE destination, source

Whereas the from/to model looks like this:

MOVE source, destination

I’m going to use from/to. So what are source and destination? That depends on the CPU, but typically, it’s either a register or some way of addressing.

In the 680x0 processors, you can copy one register to another using a move instruction:

MOVE.B D0, D1

This moves a byte of data from register D0 to register D1. The move model is very convenient when the source and destination can each be any addressing mode. Not all CPUs allow this, though. For any of a number of reasons, the designers of the CPU make decisions to limit the addressing. For example, some might decide that either one or both must be a register. And it’s these limitations that make writing decent assembly tricky because what you think of in C:

*p = *(q + x);

Might require a little bouncing around in registers to make happen:

MOVE.L q, A0 ; q goes into register A0
ADDA.L x, A0 ; add x to q
MOVE.B (A0), D0 ; move *(q + x) into D0
MOVE.L p, A0 ; move p into A0
MOVE.B D0, (A0) ; store the value (finally) into *p
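The register bouncing above can be traced step by step. This is a simplified sketch, not 68000 emulation: registers are modeled as a Python dict, memory as a small bytearray, and q, x, and p are treated as values already in hand rather than variables fetched from memory. The addresses and the byte $7A are made up for illustration.

```python
memory = bytearray(0x100)
q, x, p = 0x20, 3, 0x40      # hypothetical pointer values and offset
memory[q + x] = 0x7A         # the byte that *(q + x) refers to

regs = {"A0": 0, "D0": 0}
regs["A0"] = q                      # step 1: q goes into A0
regs["A0"] += x                     # step 2: add x to q
regs["D0"] = memory[regs["A0"]]     # step 3: read *(q + x) into D0
regs["A0"] = p                      # step 4: p goes into A0
memory[regs["A0"]] = regs["D0"]     # step 5: store D0 into *p

print(f"*p = ${memory[p]:02X}")
```

A0 gets reused for both pointers, so the loaded byte has to be parked in D0 before A0 is overwritten with p.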

And therein lies one of the common problems with writing assembly: it’s easy to get caught up in the minutiae and lose track of your big picture goal.

Still, we can see that by using either load/store or move, we can model an assignment statement or a binding in a functional programming language.
