Perils of Open Source Compilers

For my job, I work on Binding Tools for Swift. This is a suite of tools that does static analysis of compiled Swift modules and generates adapters in Swift and bindings in C#. In short, I make a Swift library look like it's written in C#.

When I started on the project, I needed the ability to reflect on the front-facing interface of a Swift module. I couldn't do this in Swift itself, for two reasons: first, there was no useful reflection in Swift; second, running reflection only gives you a view of the API from the point of view of the platform on which the reflection is being done.

Instead, I took advantage of the fact that Apple had kindly released their compiler as Open Source. I hooked into the command line arguments and wrote code that used their abstract syntax tree visitor to generate an XML document that represents the API.

There's a problem with this, though. The compiler with reflection has to match what Apple ships. Why? Because if it doesn't, then we can't consume the modules, and if we use the custom compiler to compile Swift glue code, it won't link with the original module. Fortunately, it was easy enough to do, since there was a branch with the release compiler.

In bringing my tools up to Swift 5.3, I discovered that this had changed. In looking at the shipping compiler, I got this information for the version:

Apple Swift version 5.3 (swiftlang-1200.0.29.2 clang-1200.0.30.1)
Target: x86_64-apple-darwin19.6.0

When I look at the custom compiler built from the release/5.3 branch, I get this for the version:

Swift version 5.3-dev (LLVM 6811d14c28, Swift 9cfbe5490b)
Target: x86_64-apple-darwin19.6.0

They’re no longer the same and as a result, there is a problem: I can’t consume libraries compiled by the system compiler and it can’t consume libraries compiled by my version of the compiler.

I have a workaround, I think, but let's talk about why this change happened, because it reflects a really interesting conflict of interest in Open Source, especially with compilers. To be clear, I don't know the actual reason, but I have a pretty good guess.

I'll direct you to Ken Thompson's Turing Award lecture, "Reflections on Trusting Trust." In the lecture he describes how you can create a Trojan Horse that changes the system compiler such that it injects a security flaw into the operating system on its next build.

From this knowledge, it is actually pretty dangerous to be able to make a compiler that has the same signature as the system compiler. Since I have the source to the compiler, I could make one that injects security holes into compiled code without being detected. The next step is to make a Trojan Horse that delivers the modified compiler into the system. If that spreads to the systems of developers or to build servers, then there's a big problem.

On the other side of the coin, if I release code into Open Source, shouldn't it be possible to build a bit-identical version of the released product? And if I can't, is it truly Open Source? Without that, the compiler I built is really just an artist's interpretation.

Apple clearly chose to come down on the side of trust, and I get that. At the same time, I've been at best inconvenienced and at worst I'm dead in the water. In a practical sense, it doesn't matter since (1) I'm not really a customer of Apple, so it's not their job to make me happy, and (2) their actual priority is creating a solid compiler that efficiently generates correct code.

The fix on my end is to take advantage of the ABI stability promise and turn on the flags "-enable-library-evolution" and "-emit-module-interface". These at least allow the two compilers to talk to each other. It's not exactly ideal from the point of view of people who are going to use Binding Tools for Swift, but it's not nothing.

Remembering Mike Hawley

My brother Mike passed away early yesterday morning. It's heartwarming to read all the kind words posted by people on social media. One common thing is that Mike often challenged you to go the extra step in whatever you were doing. It could feel aggravating if you took it in the context of, "hey – I just made this cool thing and it's not good enough?" but what he was really trying to say was, "wow! That's really cool! I'm so excited by it that I can see the possibility that it holds."

Mike always had a great effect on me, from the very beginning. One of my earliest memories is from kindergarten. We'd had an "assignment" to learn how to tie our shoes and I had forgotten to practice. I was in tears before school when I remembered, and Mike patiently took the time to teach me.

Mike was an explorer for sure. As a kid, he routinely took bike trips to the limit of a kid's range – some around 15 miles to Jockey Hollow. He was active in scouting, created the Kowabundee Chicken Patrol, and went on camping adventures, including one down the Allagash River. In high school, he was part of a group of intellectual hell-raisers who published an underground newspaper, routinely absconded with "useful" chemicals, and wrote a tome called "The Honors Physics Consortium Poetry and Songbook". They printed several copies and went as far as making phony card catalog cards and embedding a copy in our local library, where it was shelved until a fire destroyed the collection. In and after college, he and some friends planned trips to canoe down the Mistassini River in Quebec and then, in a later year, down the Yukon River in Alaska.

As has been mentioned, he was an accomplished pianist. He shared that trait with my mom. My mom thought everyone should have piano lessons, or failing that, at least everyone within her reach should. I lacked that particular talent. Interestingly enough, right around age 14 Mike was getting bored with piano. It was too easy, apparently. My mom caught on and introduced him to the music of Liszt. That woke Mike up and he brought Liszt's piano works to his teacher, Heidi Grob, who had to tell him that she couldn't teach him Liszt since she couldn't play it. Keep in mind that not many people can play Liszt, but Mike was determined. Mom found him another teacher, John Quinn, who helped Mike reach further.

The downside to having a talented brother practicing piano at all hours is that it was hard to fall asleep while he practiced. Fair’s fair though – I play trumpet and I’m sure I drove him nuts with my practicing too. I didn’t realize how much of a heartbeat it was in our house until he went off to college and it was missing.

The three of us all worked in high school at Murray Hill Bell Labs (home of the transistor!). All of us were into coding and writing software. Mike made friends with people in the UNIX group, learned C, and did a bunch of work on data searching and retrieval. I recall him writing a tool called "hmm" for entering information. What I remember most was its I/O coverup – as it churned through information, it would print "hmm" and a series of dots, followed by "ok" when it finished. He worked on a scrolling calendar program for the Blit terminal – a smart terminal with a 68K processor and a legal-sized screen – which led to him spending a year working at IRCAM in Paris on music composition software on an imported Blit. There was some discussion as to what the proper pronoun should be for the Blit. I think they settled on feminine. Mike also played one part of a two-piano piece for Pierre Boulez before returning. He described it as "your basic banging on the keyboard with cheeseburgers" and said that he and the other pianist had a signal to jump to the end and play the last chord.

Mike continued his exploration and innovation for years. He built schools in Cambodia and named one after our mom. One of the things that was interesting about having an overlap in careers is that I kept running into people who knew Mike. Every place I worked, I bumped into someone who knew Mike in some capacity. And that was one of Mike's aspects: it was almost as if he collected people. I recall a conversation with him that started off "I was having lunch last week with the Librarian of Congress…" for example. It wasn't without its benefits, however.

When Adam Savage was doing his stage show "Brain Candy" with Michael Stevens, I knew that Mike and Adam were friends and I pinged Mike about the show in Worcester. Mike also knew the producer of the show, got a ticket, and joined my son and me at the show, and afterwards we met Adam (who is a lovely person).

The past couple of years, I’ve been trying to host Thanksgiving dinner because our families have drifted apart a bit. It’s not surprising: we all live very busy lives. This past year, we managed to all get together and it was a joy.

We had nearly all the Hawley men of my dad’s line in one place.

Mike was so happy to be there with Nika and his son Tycho and my dad was thrilled to see all of us together. I spent most of the day cooking (which I love), but I did find some time to snap some selfies.

No, there’s no family resemblance at all.

Where I’ll end this is that Mike hosted a dinner for close friends earlier that November for his anniversary with Nika. Again, I snapped a selfie.

Before the dinner, Mike introduced me to all these people that he knew in medicine, publishing, politics, industry, music, and software. Invariably, he would introduce me as his “smarter brother”, which I really appreciated. And if you’ve made it this far and if you’ve known Mike, you probably know that in addition to collecting people and introducing them to others, he could make you feel like the most important person in the room when he talked to you.

We lost a good one, yesterday.

The Mathematics of Contagion

This is something I wrote up for people I know on facebook to understand the mathematics of contagion and why you should care.

Let’s talk about covid-19 in the US from a strictly math point of view. I’m going to keep the math as simple as I can. This isn’t going to get any more complicated than high school math. This is going to be long, but it’s important to understand.

Stick with me.

Many things in life are proportional relationships. For example, if I’m putting together a set of hand outs for a conference and each hand out is 5 pages, I can find out how many sheets of paper I need by multiplying the number of pages (5) by the number of attendees. If someone decides to show up, add 5 pages. 2 people cancel, subtract 10 pages. This is also called a linear relationship because if you draw a graph with “number of people” on the x axis and “number of pages” on the y axis and fill in the points, they make a lovely straight line.

Disease propagation does not work that way. It’d be great if it did, but it doesn’t. Instead it follows a pattern called exponential. Here’s an illustration of one exponential relationship.

Let's say I have a checkerboard and we use just 32 of its squares. I ask you to put 1 penny on the first square, 2 on the second, 4 on the third, 8 on the fourth, and keep doubling for each square. By square 28, a single square needs more than $1,000,000 in pennies. On the last square, you'd need a stack of $21,474,836.48 in pennies.

Let’s compare that to a proportional relationship. On the first square you put 5 pennies, 10 on the second, 15 on the third, etc. Doing this, on the last square you would have $1.60.

That’s some difference, right?

How do we represent that in a way we can calculate? Let’s figure it out.
Here are the squares and their relationship to pennies:
1 : 1
2 : 2 * 1 = 2
3 : 2 * 2 * 1 = 4
4 : 2 * 2 * 2 * 1 = 8
5 : 2 * 2 * 2 * 2 * 1 = 16
Since we know that 2 * 2 is 2 squared or 2^2 and that 2 * 2 * 2 is 2 cubed or 2^3, we can simplify the table:
1 : 1
2 : 2^1 * 1 = 2
3 : 2^2 * 1 = 4
4 : 2^3 * 1 = 8
5 : 2^4 * 1 = 16
In general, the number of pennies on a given square is 2^(n-1) * 1, where n is the number of the square on the checkerboard.
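
If you want to check that formula yourself, here's a throwaway C sketch (a toy for this post, nothing more) that doubles its way across the 32 squares:

#include <stdio.h>

int main(void)
{
    double pennies = 1.0;              /* square 1 holds one penny    */
    for (int n = 2; n <= 32; n++)
        pennies *= 2.0;                /* 2^(n-1) pennies on square n */
    printf("last square: $%.2f in pennies\n", pennies / 100.0);   /* prints 21474836.48 */
    return 0;
}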

You can see that the number of pennies needed grows really fast. The good news is that COVID-19 doesn't grow this fast. The bad news is that it is still an exponential relationship. The problem is almost the same as the pennies, but instead of pennies it's confirmed cases of COVID-19, and instead of squares, it's days. The number we don't have is called the base. In the penny problem, the base was 2 and I gave it to you. Can we figure out the base for COVID-19? Yes. Take the number of cases on any given day and divide it by the number of cases on the previous day. Why does this work? Remember with the pennies, I said multiply the previous number by 2. So if I have the equation:

x * prevday = nextday

I can solve for x:

x = nextday / prevday


I did just that for COVID-19. CNN (and several other sources) have been posting the number of positive tests each day. I put that into a spreadsheet and did the calculation you see here for each day since March 1st. The result is not always consistent, so I calculated the trend and it comes out to about 1.3. That means to predict the number of cases tomorrow, multiply today's cases by 1.3, which in turn means the number of cases doubles about every two and a half days. We are on pace to have 181,000 cases by the end of the month.
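
If you'd like to play with the projection yourself, here's a small C sketch. The starting case count and the number of days below are placeholders for you to fill in from real data; the only number taken from the discussion above is the 1.3 growth factor.

#include <stdio.h>
#include <math.h>

int main(void)
{
    double base = 1.3;        /* estimated daily growth factor from the trend */
    double cases = 4000.0;    /* placeholder: today's confirmed case count    */
    int days = 14;            /* placeholder: how far out to project          */

    double projected = cases * pow(base, days);
    double doubling = log(2.0) / log(base);    /* days for cases to double */

    printf("in %d days: about %.0f cases (doubling every %.1f days)\n",
           days, projected, doubling);
    return 0;
}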

There are estimated to be 160,000 ventilators in the US.

See the problem? And even though not all of those 181,000 cases will need ventilators, just wait 10 days and you’ll have 2.8 million cases.

Now let's talk about the death rate. The estimates vary a lot. The lowest is around 1.4% and the highest is 4%. The reason why this is such a wide range is that it depends on treatment and it depends on patient age. This means that by the end of the month, you can expect to see somewhere between roughly 2,500 and 7,200 deaths in the US.

There's good news and bad news. The good news is that COVID-19 will not spread without limit. The limit is the number of people available to be infected. If we had unlimited people, we would infect 330,000,000 people by May 1st. But we don't have unlimited people, so the curve can't grow without limit. Also, not every case requires hospitalization.

Further good news – this model is incomplete. The world is a sticky place with many more complications in it. For example, not every person who is infected will need to go to the hospital or even need ventilation. Not only that, once someone is infected, there is a good chance they will recover, and the number of active cases goes down.

The bad news is that we have a very real limit on the number of doctors, the number of hospital beds, the number of masks, the number of test kits and so on. If we don’t reduce the base, you will see growth to the point where our medical system can’t handle the number of cases it gets. This will get worse as people who work in hospitals get sick. At some point (and it’s going to happen soon), doctors are going to have to make decisions like “which patients will last without ventilation?” or “which patient will recover faster?” or “which patients do we have to let die because we don’t have the staffing or equipment to properly care for them?”

You should ask yourself, do you want you or someone close to you to be one of the patients on the short end of one of those decisions?

Let's finish this with a call to action. What can you do? Wash your goddamn hands. Seriously. Soap and water wreak havoc on COVID-19 and are dirt cheap. Keep away from groups and group activities and limit contact (reduce shopping). Feel sick? See if you can use telemedicine. Call your government and demand more testing.

For the curious – here’s my math, which I will try to keep updated.

Introducing Binding Tools for Swift

If you follow my blog, you’ve probably noticed two things: it’s been quiet here as of late and I haven’t spoken directly about what I’ve been working on since I joined Xamarin. Let’s change both of those things.

Today a vast majority of the APIs and libraries powering Apple's ecosystem (iOS, macOS, tvOS, watchOS) come in the form of Objective-C libraries. Xamarin's binding project support can expose these APIs in C# for use like any other managed library. The same infrastructure powers the platform assemblies, like Xamarin.iOS.dll.

A few years ago Apple introduced a new programming language called Swift, which has been under active development ever since, with a number of breaking changes as the language evolved. Swift by default has a very different runtime design than other languages, which makes calling Swift code from C# exceptionally difficult in most cases.

Solving this problem required writing a tool significantly more complex than our existing binding infrastructure. One advantage of Swift libraries, however, is that they contain enough information to automatically create a reasonable API mapping to C#. This means we can improve upon the experience of binding Objective-C by skipping the need for a user to define the interface in C# by hand.

After years of development, 1,200 unit tests, and testing against third-party libraries, we're happy to announce that we're open sourcing our work:
https://github.com/xamarin/binding-tools-for-swift/

Current Status

Binding Tools for Swift currently works on most of the common cases in Swift 5.0 code, but there are limitations that we are actively working on:

  • Project and MSBuild support to make it easier to use
  • Support for Protocols with Associated Types
  • Support for Swift 5.1
  • Support for non-escaping closures
  • Support for closures in bound generic types
  • Improving documentation and samples

There is also work that needs to be done to fully integrate BTfS into Visual Studio for Mac and bring it into a public preview.

Get Involved!

It’s been a lot of work and a lot of typing. I’m very happy with where the code is today and am looking forward to getting my first community pull request. Get involved! I’ve written a lot of documentation on how the tool works and you can read the quickstart guide here.

I’m proud of the work that I’ve done and I’m very happy to bring Swift interoperability into the .NET ecosystem. Come help me out!

I’m Old, Part LXXXIV: The Spice Girls

I was talking to my spouse last night and the topic of the Spice Girls came up; she was trying to remember each of their personae. I knew right out of the gate that this was not going to happen for me.

When I worked for Newfire, one of my co-workers, Andy Hess I think, went to a conference in Great Britain at a time when the Spice Girls were a thing. I think at that time I was listening to either grunge or NPR, so I didn't hear any Spice Girls music, but I knew of them. When Andy came back, he brought at least one packet of crisps from his trip.


For sure he brought the one on the left. To this day I prefer to believe that one of the Spice Girls was Smoky Bacon Spice.

Learning Assembly: Arithmetic

If you code at any level, you have some familiarity with arithmetic. Most of the arithmetic you're likely to do in assembly will be familiar, and that should be no surprise at all. Some operations may be unfamiliar and some familiar operations may have nuances that feel funny at first, but trust me that the nuance is there for a reason.

Fundamentally, there are three models of arithmetic in assembly language and the differences all have to do with a pair of simple questions:

  1. Where do the parameter(s) come from?
  2. Where does the result go?

There are three variations and each is a logical extension of the other.

Accumulator Model

In the accumulator model, there is a dedicated register (or even two) which is used as the result for (nearly) all arithmetic and often as one of the parameters. A simple example is basic addition. You might see some code like this:

LDA Position
ADD #$5
STA Position

In this example, we’re reading the value of Position into register A (LoaD A). If this is unfamiliar to you, see the previous post on moving memory. Then we add 5 to it and store it back into Position. So, in essence, this is equivalent to:

Position += 5;

You’ll notice that we had to store the result explicitly. This is because the result ends up in register A. The accumulator model is very simple, but your code feels particularly wordy. That’s perfectly fine. It’s just the lay of the land.

Second Is Result

In this model, the second parameter is also the result. If I rewrite the previous example in this way it will look like this:

ADD #$5, Position

Second Is Result differs from the accumulator model in that no register gets directly modified, which is a nice convenience. It’s similar in that the CPU is going to do nearly an identical amount of work in both cases. Second Is Result is very common in current CPUs.

Third Is Result

In this model, the first and second elements are the parameters and the third is the result:

ADD #$5, Position, Position

Again, this is equivalent to both models and on the face of it, it doesn't look particularly useful, since in this example the result is the same as one of the parameters. But what if instead we started from C code like this:

newPosition = oldPosition + 5;

In this case, the assembly would look like this:

ADD #$5, oldPosition, newPosition

From this point of view, it looks like a very handy format, but there are problems – the cases where you'd actually use it aren't so common, and the instruction decoding that the CPU needs to do starts to get more complicated. You can see that you could write this using Second Is Result with the following:

MOV oldPosition, newPosition
ADD #$5, newPosition

Or with Accumulator Model like this:

LDA oldPosition
ADD #$5
STA newPosition

Note that the complexity of accumulator model has not changed at all. The important takeaway is that there are usually very direct mechanisms of implementing basic arithmetic as expressed in programming languages and this shouldn’t be a surprise since assembly languages have influenced high level languages, which in turn informed assembly languages.

Besides addition, what's available? It depends on the CPU, but you can expect to see instructions that do the following on lower-end CPUs:

  • addition
  • subtraction
  • logical and
  • logical or
  • logical exclusive or
  • bit shifting left or right
  • bit rotation
  • complement
  • negation

On higher end CPUs, you would expect to see all of the previous as well as the following:

  • multiplication
  • division
  • floating point versions of arithmetic
  • sign extension
  • conversion to/from floating point

Where Things Get Funny (not funny ha-ha)

When you're working with a budget CPU, you might not always have multiplication or division. Sucks to be you. You can implement them – the algorithms are straightforward, but multiplication is slow and division is slower. The CPU designers knew that it was likely that the CPUs would have to interact with people, and people tend to be much happier with base 10 than base 16, so they included a means of doing base 10 called Binary Coded Decimal, or BCD. In BCD, each group of 4 bits (a nibble) represents a decimal digit, so you are limited to 0-9 in each nibble. Unfortunately, BCD tends to be pretty clunky. On the 6502, you first had to put the CPU in a special mode with the instruction SED, for "SEt Decimal". This sets a bit in the processor, and whenever you did addition or subtraction, if the result of the operation would leave a value greater than 9, it would adjust the result. Once done, you had to undo the SED instruction with the CLD instruction (CLear Decimal), otherwise you'd affect your later code. The 6800 processor had a special bit that was set when the result of BCD arithmetic was incorrect, and if so, you would use a special instruction afterwards to fix it. Also gross. Both of these methods, however, were far cheaper to implement than multiplication and division.
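
If you're curious what that decimal adjustment amounts to, here's a rough sketch of it in C. This is only meant to show the idea of the adjustment, not how any particular CPU implements it.

/* Add two packed BCD bytes (two decimal digits each), adjusting each
   nibble so it stays in the range 0-9. Illustrative only. */
unsigned bcd_add(unsigned a, unsigned b, unsigned *carry_out)
{
    unsigned lo = (a & 0x0F) + (b & 0x0F);    /* add the low digits  */
    unsigned hi = (a >> 4) + (b >> 4);        /* add the high digits */

    if (lo > 9) { lo -= 10; hi += 1; }        /* adjust, carry to hi */
    *carry_out = 0;
    if (hi > 9) { hi -= 10; *carry_out = 1; } /* adjust, carry out   */

    return (hi << 4) | lo;
}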

Most CPUs have the ability to tell when things may have gone wrong or need attention. For example, after most operations there is a flag that gets set if the result was 0, or if the result was negative, or if the result might have caused an overflow or underflow. I'll talk about that more in a different post.

Most higher end CPUs perform math on several different sizes of input. For example, an x64 processor can perform math operations on 8 bit, 16 bit, 32 bit and 64 bit inputs.

Finally, one thing you see in assembly that you rarely see in high level languages is the ability to cheaply do arbitrary precision arithmetic. The way this is managed is that the CPU has a flag in it that represents binary carry. If you add two values and the result is bigger than what fits in the machine word, then the carry flag gets set. If you subtract two values and the result is less than what fits in the machine word, then the carry gets set.

Most CPUs have a separate instruction for addition with carry and without carry (or subtraction with borrow and without borrow). Why? Because you don't always want to add with carry; sometimes you just want to add numbers and ignore the carry. The 6502 was not one of those CPUs. It didn't have separate add and add-with-carry instructions. Instead, it only had add with carry. This was irritating because if you ever wanted to do plain addition, you had to make sure that you cleared the carry flag first. If you forgot to do this, then your result might be off by one depending on what happened beforehand. The 6502 was the first CPU I coded for and I didn't understand this at first, which led to some very perplexing bugs.
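
Here's the add-with-carry idea sketched in C. The four-byte width is an arbitrary choice for the example; a real 6502 routine would be a clear-carry followed by a chain of add-with-carry instructions, one per byte.

/* Add two 32-bit numbers stored as four 8-bit "limbs" each, least
   significant byte first, propagating the carry by hand. */
void add32(const unsigned char a[4], const unsigned char b[4],
           unsigned char result[4])
{
    unsigned carry = 0;                        /* start with the carry clear */
    for (int i = 0; i < 4; i++) {
        unsigned sum = a[i] + b[i] + carry;    /* this is "add with carry"   */
        result[i] = (unsigned char)(sum & 0xFF);
        carry = sum >> 8;                      /* carry into the next byte   */
    }
}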

Why would you leave out multiplication and division? For the most part, it comes down to cost and limitations in the registers. In a typical 8-bit processor, you're not likely to see either because your registers are typically 8 bits. When you add two 8-bit numbers, the result is at most 9 bits (8 bits plus the carry). When you multiply two 8-bit numbers, the result is at most 16 bits. Where does the result go? In the accumulator model, if you only have 1 accumulator (like the 6502), where do you put the other 8 bits? Furthermore, the 6502 implemented its math through dedicated silicon called an Arithmetic Logic Unit, or ALU. All of the math in the 6502 is 8 bits plus carry. Having two arithmetic operations that need 16 bits is suddenly expensive. We'll cover the ALU in more detail in a later post.
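
To make the size problem concrete, here's the worst case written out in C (nothing CPU-specific here, just the arithmetic):

#include <stdio.h>

int main(void)
{
    unsigned char a = 0xFF, b = 0xFF;   /* 255 * 255                        */
    unsigned product = a * b;           /* 65025, or 0xFE01 - needs 16 bits */
    unsigned char hi = product >> 8;    /* 0xFE - where does this byte go?  */
    unsigned char lo = product & 0xFF;  /* 0x01 - fits in the accumulator   */
    printf("%u = %02X %02X\n", product, hi, lo);
    return 0;
}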

Specialty Math

Some processors put in specialty math operations. For example, it’s extremely common to want to add/subtract 1 from a register or from memory. You’ll sometimes see special instructions for doing this named INCrement and DECrement. Other processors make special “fast” versions of add and subtract that will only add or subtract a small constant value.

We see that performing arithmetic in assembly language is conceptually straightforward, but in practice you need to be aware of the quirks in any given CPU to ensure that you get the result that you intended.

Learning Assembly Language: Moving Data

One of the primary classes of things you can do in assembly language is move data from one place to another. Different processors have different takes on how this works. For example, some processors use a load/store model where the two operations are completely separate. Other processors use a move model where data gets moved from one place to another.

Before we delve in, let’s get some vocabulary:

  • register – a chunk of super-fast memory built in to the CPU. Most of the time, registers don’t have addresses and don’t correspond to actual memory, but are referred to by name.
  • memory – an area where data can be read or written. Memory is typically arranged in a sequence with a numerical address that refers to a location in memory.

I’m going to start with the load/store model first. In the load/store model, when you want to read from memory, you load a value in to a register.

LOAD A, $300

In a hypothetical computer, this instruction reads from memory location $300 and copies the value into register A. When you are reading from memory this way, it’s usually called “absolute” addressing because $300 is the actual address with no other changes. In the 6502, one of my favorite processors, this gets written a little more compactly:

LDA $0300

When this code gets assembled, it turns into this data:

AD 00 03

Here’s what happens inside the CPU:

The CPU reads AD and knows that the next two bytes are an absolute address. It reads 00 and 03 and combines them into $0300. It then sends that value out to memory along with a signal that says “READ”. A little while later, it picks up the value that comes back and copies it into A.

When you want to copy a value from a register to memory, you use a store instruction like this:

STORE A, $301

This copies the value that’s in the A register and writes it into memory at address $301. In the 6502, this gets written more compactly like this:

STA $0301

When this code gets assembled, it turns into this data:

8D 00 03

And it operates identically to LDA except that in addition to the address, it also sends out the value of A and a signal to write it into memory.

This is a very simple example. Absolute addressing is the second simplest way of reading from or writing to memory. The simplest way is called immediate. In immediate addressing, the value to be read is part of the instruction itself. For example, in 6502, if you had this:

LDA #$3F

It gets assembled as:

A9 3F

The CPU sees the A9 and knows that it needs to read the next byte from memory and copies it into register A.

The other ways of reading from memory vary from CPU to CPU, but most of them are similar enough. Here is a list of some of them:

Indexed: LDA $300,X – in this case the address $300 is taken and then the value of register X gets added to it, then memory gets read. This is one way that you can do typical array indexing (hence the name).

Indirect: LDA ($300) – memory is read at location $300 and the value that’s there gets used as the final address to read from. If you’re thinking that this feels an awful lot like C pointer dereferencing, you’re right.

Indirect indexed: LDA ($300),Y – memory is read at location $300 and the value that’s there then gets added to Y and gets used as the final address to read from. This can be used as another form of array indexing.

Indexed indirect: LDA ($300, X) – add X to $300 and the value that’s there gets used as the final address to read from. This could be used to read from an array of pointers.

The rest of the variations that you see on there are things that make it easy to skip across memory in various increments. I won’t go into them just yet.
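
If it helps, here are rough C analogies for those addressing modes. This is illustrative only: mem stands in for the whole address space, x and y stand in for index registers, and I'm pretending a single read fetches a whole address rather than the two bytes it would really take.

#include <stdio.h>

unsigned char mem[65536];                /* the address space */

int main(void)
{
    unsigned short x = 2, y = 3;         /* index registers   */
    unsigned char a;                     /* the accumulator   */

    a = mem[0x0300];                     /* absolute:          LDA $0300    */
    a = 0x3F;                            /* immediate:         LDA #$3F     */
    a = mem[0x0300 + x];                 /* indexed:           LDA $300,X   */
    a = mem[mem[0x0300]];                /* indirect:          LDA ($300)   */
    a = mem[mem[0x0300] + y];            /* indirect indexed:  LDA ($300),Y */
    a = mem[mem[0x0300 + x]];            /* indexed indirect:  LDA ($300,X) */

    printf("%d\n", a);
    return 0;
}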

Next let’s look at the move model. There are typically two ways that the move model gets represented: assignment statement or from/to. Which it is depends entirely on the assembler (I’ve seen both of these for the same CPU). The assignment model looks like this:

MOVE destination, source

Whereas the from/to model looks like this:

MOVE source, destination

I’m going to use from/to. So what are source and destination? That depends on the CPU, but typically, it’s either a register or some way of addressing.

In the 680x0 processors, you can copy one register to another using a move instruction:

MOVE.B D0, D1

This moves a byte of data from register D0 to register D1. The move model is very convenient when the source and destination can use any addressing mode. Not all CPUs allow this, though. For any of a number of reasons, the designers of the CPU make decisions that limit the addressing. For example, some might decide that one or both operands must be a register. And it's these limitations that make writing decent assembly tricky, because what you think of in C:

*p = *(q + x);

Might require a little bouncing around in registers to make happen:

MOVE.L q, A0 ; q goes into register A0
ADDA x, A0 ; add x to q
MOVE.B (A0), D0 ; move *(q + x) into D0
MOVE p, A0 ; move p into A0
MOVE.B D0, (A0) ; store the value (finally) into *p

And therein lies one of the common problems with writing assembly: it's easy to get caught up in the minutiae and lose track of your big-picture goal.

Still, we can see that by using either load/store or move, we can model an assignment statement or a binding in a functional programming language.

Learning Assembly, What Can You Do Really?

When I was in college, one of the CS professors presented a model of the Manchester Baby (it was presented as the Manchester Mark I, but that was a significantly more complicated computer), one of the earliest computers. It had 7 instructions (and honestly, it didn’t need them all).

For the purposes of this, here is some vocabulary for you:

  • register – a small chunk of memory built into the computer.
  • accumulator – a register that is typically used for arithmetic or logic. Simple computers will only have one.
  • memory – short-term storage for data, readily accessible to the processor.
  • address – a number that refers to a location in memory.

LDNEG address – read the value from memory at address, negate it, and store it in the accumulator.

STORE address – store the accumulator into memory at address.

SUB address – subtract the value in memory at address from the accumulator and leave the result in the accumulator.

SKNEG – if the result of the last operation was negative, skip the next instruction, otherwise perform the next instruction.

BR address – continue execution at address

BRREL offset – add offset to current address and continue execution there

HALT – stop execution

That’s it. And honestly, everything that a modern computer can do is really a variation on this set in some way. The cool thing is that this computer was Turing complete. You can compute any computable function on it provided that there is enough memory (spoiler: there really wasn’t). The problem is that programs end up being very wordy to do even simple things. For example, let’s say that you wanted to write a program to add three and four. It should look like this:

LDNEG Three
STORE Scratch
LDNEG Four
STORE Scratch1
LDNEG Scratch1
SUB Scratch
STORE Result
HALT
Three: 3
Four: 4
Scratch: 0
Scratch1: 0
Result: 0

What this does is read 3 from a memory location, which becomes -3 because LDNEG negates it. It puts this into Scratch. Then it reads 4 from a memory location, which becomes -4. It puts this into Scratch1, then re-loads it and it becomes 4. Then it subtracts -3, which is 4 - (-3) = 7, which we store in Result and halt. Ta-da!
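
Written out in C, the trick the program is using looks like this (just a restatement of the idea, not something this machine could run):

/* With only "load negated" and "subtract" available, addition becomes
   b - (-a): negate one operand by loading it, then subtract it.      */
int add_via_subtract(int a, int b)
{
    int neg_a = -a;             /* LDNEG Three, STORE Scratch                 */
    int b_again = -(-b);        /* LDNEG Four, STORE Scratch1, LDNEG Scratch1 */
    return b_again - neg_a;     /* SUB Scratch: b - (-a) = a + b              */
}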

And you immediately see the problem with this assembly language: it’s very chatty. It takes 2-3x the work of most other assembly languages. It does, however, break down the capabilities of a typical computer into these categories:

  • Move values from/to memory and/or registers
  • Perform (limited) arithmetic/logic
  • Change flow conditionally or unconditionally
  • Other

The last one is where HALT falls. In addition, many processors include an instruction called NOP or NOOP which stands for No Operation. It’s an instruction that does nothing except consume time. Some processors include dedicated instructions for input and output which could go into either “move values” or “other”.

And while that seems like a depressingly small number of things, that’s really it.
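
To make those capabilities concrete, here's a little C sketch of the seven-instruction machine from above. The encoding (a struct per instruction) and the separate 32-word data store are my own simplifications for illustration (the real machine kept code and data together in one tiny store), and I'm treating "the result of the last operation" as simply the accumulator.

enum op { LDNEG, STORE, SUB, SKNEG, BR, BRREL, HALT };
struct insn { enum op op; int arg; };

int mem[32];                                    /* data store, addresses 0..31 */

void run(const struct insn *code)
{
    int acc = 0, pc = 0;
    for (;;) {
        struct insn i = code[pc++];
        switch (i.op) {
        case LDNEG: acc = -mem[i.arg]; break;   /* load negated          */
        case STORE: mem[i.arg] = acc;  break;   /* store accumulator     */
        case SUB:   acc -= mem[i.arg]; break;   /* subtract from acc     */
        case SKNEG: if (acc < 0) pc++; break;   /* skip next if negative */
        case BR:    pc = i.arg;        break;   /* jump to address       */
        case BRREL: pc += i.arg;       break;   /* jump relative         */
        case HALT:  return;
        }
    }
}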

In the class where this was presented, the professor offered a prize of $1 for the student who could come up with the shortest (correct) program (including data) to calculate factorial. For a computer that didn't have addition, let alone multiplication, this was a pain, but straightforward. And quite honestly, this is where most of assembly language programming falls: it can be a pain, but it's typically straightforward.

Learning Assembly Language, Introduction

One of my co-workers has a goal to learn assembly language. I thought I would write up my experiences with it.

My coworker asked me how I learned. I started with 6502 on the Apple ][. I knew BASIC and I started learning 6502 when I was home sick from school. I sat down and wrote a program to print "HI " in an infinite loop. It was a 3 instruction program. When I ran it, it was so incredibly fast compared to BASIC.

After that, I started writing programs with the Apple manual on my lap. Most of what I did was treat the Apple ROM routines as my API and I coded to that. Later, when I got more advanced, I wanted to do things that I saw in games, but I didn't know how, so I disassembled the games and read their code. In addition to writing my own code, I got pretty good at reading other people's assembly, which is something I still have to do.

As I wrote code, I built a model in my head of how it worked. It turned out to be wrong in some details, but for the most part it was close enough.

Do I still write assembly? From time to time. I have to read it frequently in my current job. For me, it’s been an indispensable tool to use when I can’t otherwise explain a particular behavior. There have also been cases where I have worked with a licensed library and found bugs in their code that I couldn’t work around. In addition to reporting the bug I could also tell them where the bug was. Finally, understanding how typical functions get implemented has been useful when I needed to write code that was especially performant without having to write assembly.

I’m Old, Part LXXXIII: Liberating Restriction

While working on Acrobat 2.0, I remember having a conversation with Gordon Dow about Lego kits. He said that he preferred the simpler bricks as he enjoyed the liberating restriction of having to work with all rectangular pieces. I love that phrase because of its inherent contradiction and because it is apt to many things.

For example, I took a CS class at Oberlin in Automata Theory taught by Rhys Price Jones. Rhys covered a generalized state machine and gave us an assignment to implement code that generates the state machine code from a text description of the state machine. He wanted us to work in Scheme and narrowly defined the scope of the assignment: each state must itself be a process, where a process was really just a lambda expression that handled the input and picked another lambda expression to continue on.

At that point, Scheme was seriously on my shit list of languages and I really didn't want to implement this assignment in Scheme. I pushed to do it in C, arguing that each state could be a function since, really, each of those lambda expressions was a function too. Rhys said no, they had to be processes.

Fine.

I still did the program in C. The top level C program read the machine description and for each state transition, wrote another C program. The main() of that program read a character from stdin, did a switch on it and forked a new process running the next program, piping in stdin and stdout. After all the state transition programs were written, the top level program compiled each in turn.

It took some juggling and debugging, but I made it work. And the whole process was fun for me, probably more fun than the rest of the class had, dutifully working in Scheme. I'm not saying that they weren't enjoying the assignment, just that they weren't giggling at the absurdity of the process.

And at this point in my career, I’m writing code that does static analysis on other code and writes more code in two different languages that can call across to each other, so I guess that was a nice little bit of prep for that.