Introducing Binding Tools for Swift

If you follow my blog, you’ve probably noticed two things: it’s been quiet here as of late and I haven’t spoken directly about what I’ve been working on since I joined Xamarin. Let’s change both of those things.

Today, the vast majority of the APIs and libraries powering Apple’s ecosystem (iOS, macOS, tvOS, watchOS) come in the form of Objective-C libraries. Xamarin’s binding project support can expose these APIs in C# for use just like any other managed library. The same infrastructure powers the platform assemblies, like Xamarin.iOS.dll.

A few years ago Apple introduced a new programming language called Swift, which has been under active development with a number of breaking changes over the years due to the evolution of the language. Swift by default has a very different runtime design from other languages, which makes calling Swift code from C# exceptionally difficult in most cases.

Solving this problem required writing a tool significantly more complex than our existing binding infrastructure. One advantage of Swift libraries, however, is that they contain enough information to automatically create a reasonable API mapping to C#. This means we can improve upon the experience of binding Objective-C by skipping the need for a user to define the interface in C# by hand.

After years of development, 1200 unit tests, and testing against 3rd party libraries, we’re happy to announce that we’re open-sourcing our work:
https://github.com/xamarin/binding-tools-for-swift/

Current Status

Binding Tools for Swift currently works on most of the common cases in Swift 5.0 code, but there are limitations that we are actively working on:

  • Project and MSBuild support to make it easier to use
  • Support for Protocols with Associated Types
  • Support for Swift 5.1
  • Support for non-escaping closures
  • Support for closures in bound generic types
  • Improving documentation and samples

There is also work that needs to be done to fully integrate BTfS into Visual Studio for Mac and bring it into a public preview.

Get Involved!

It’s been a lot of work and a lot of typing. I’m very happy with where the code is today and am looking forward to getting my first community pull request. Get involved! I’ve written a lot of documentation on how the tool works and you can read the quickstart guide here.

I’m proud of the work that I’ve done and I’m very happy to bring Swift interoperability into the .NET ecosystem. Come help me out!

I’m Old, Part LXXXIV: The Spice Girls

I was talking to my spouse last night when the topic of the Spice Girls came up, and she was trying to remember each of their personae. I knew right out of the gate that this was not going to happen for me.

When I worked for Newfire, one of my co-workers, Andy Hess I think, went to a conference in Great Britain at a time when the Spice Girls were a thing. I think at that time I was listening to either grunge or NPR so I didn’t hear any Spice Girls music, but I knew of them. When Andy came back from his trip, he brought back at least one packet of crisps.


For sure he brought the one on the left. To this day I prefer to believe that one of the Spice Girls was Smoky Bacon Spice.

Learning Assembly: Arithmetic

If you’re coding at any level, you have some familiarity with arithmetic. Most of the arithmetic you’re likely to do in assembly will be familiar, and that should be no surprise at all. Some operations may be unfamiliar and some familiar operations may have nuances that feel funny at first, but trust me that the nuance is there for a reason.

Fundamentally, there are three models of arithmetic in assembly language and the differences all have to do with a pair of simple questions:

  1. Where do the parameter(s) come from?
  2. Where does the result go?

There are three variations and each is a logical extension of the other.

Accumulator Model

In the accumulator model, there is a dedicated register (or even two) which is used as the destination for (nearly) all arithmetic results and often as one of the parameters. A simple example is basic addition. You might see some code like this:

LDA Position
ADD #$5
STA Position

In this example, we’re reading the value of Position into register A (LoaD A). If this is unfamiliar to you, see the previous post on moving memory. Then we add 5 to it and store it back into Position. So, in essence, this is equivalent to:

Position += 5;

You’ll notice that we had to store the result explicitly. This is because the result ends up in register A. The accumulator model is very simple, but your code feels particularly wordy. That’s perfectly fine. It’s just the lay of the land.

Second Is Result

In this model, the second parameter is also the result. If I rewrite the previous example in this way it will look like this:

ADD #$5, Position

Second Is Result differs from the accumulator model in that no register gets directly modified, which is a nice convenience. It’s similar in that the CPU is going to do nearly an identical amount of work in both cases. Second Is Result is very common in current CPUs.

Third Is Result

In this model, the first and second elements are the parameters and the third is the result:

ADD #$5, Position, Position

Again, this is equivalent to both models and on the face of it, it doesn’t look particularly useful, since in this example the result is the same as one of the parameters. But what if instead we started from C code like this:

newPosition = oldPosition + 5;

In this case, the assembly would look like this:

ADD #$5, oldPosition, newPosition

From this point of view, it looks like a very handy format, but there are problems – the situations where you’d actually use it aren’t so common, and the instruction decoding that the CPU needs to do starts to get more complicated. You can see that you could write this using Second Is Result with the following:

MOV oldPosition, newPosition
ADD #$5, newPosition

Or with Accumulator Model like this:

LDA oldPosition
ADD #$5
STA newPosition

Note that the complexity of the accumulator model has not changed at all. The important takeaway is that there are usually very direct mechanisms for implementing basic arithmetic as expressed in programming languages, and this shouldn’t be a surprise since assembly languages have influenced high level languages, which in turn informed assembly languages.

Besides addition, what’s available? It depends on the CPU, but you can expect to see instructions that do the following on lower end CPUs:

  • addition
  • subtraction
  • logical and
  • logical or
  • logical exclusive or
  • bit shifting left or right
  • bit rotation
  • complement
  • negation

On higher end CPUs, you would expect to see all of the previous as well as the following:

  • multiplication
  • division
  • floating point versions of arithmetic
  • sign extension
  • conversion to/from floating point

Where Things Get Funny (not funny ha-ha)

When you’re working with a budget CPU, you might not always have multiplication or division. Sucks to be you. You can implement them – the algorithms are straightforward, but multiplication is slow and division is slower.

The CPU designers knew that it was likely that the CPUs would have to interact with people, and people tend to be much happier with base 10 than base 16, so they included a means of doing base 10 called Binary Coded Decimal, or BCD. In BCD, 4 bits out of every byte are used to represent a decimal digit, but you are limited to 0-9 in each nibble. Unfortunately, BCD tends to be pretty clunky. On the 6502, you first had to put the CPU in a special mode with the instruction SED, for “SEt Decimal”. This sets a bit in the processor, and whenever you did addition or subtraction, if the result of the operation would leave a value greater than 9 in a nibble, it would adjust the result. Once done, you had to undo the SED instruction with the CLD instruction (CLear Decimal), otherwise it would affect your later code. The 6800 processor had a special bit that was set when the result of BCD arithmetic was incorrect and, if so, you would use a special instruction afterward to fix it. Also gross. Both of these methods, however, were far cheaper to implement than multiplication and division.
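
If it helps to see the adjustment spelled out, here is a rough C sketch of what decimal-adjusted addition does to a pair of packed BCD bytes (two decimal digits per byte). The function and its name are mine, purely for illustration; it is not a model of any particular CPU’s circuitry:

#include <stdint.h>

/* Add two packed-BCD bytes; *carry is set when the decimal result
   exceeds 99. Whenever a nibble goes past 9, adding 6 skips it over
   the unused values A-F and pushes a carry into the next digit. */
uint8_t bcd_add(uint8_t a, uint8_t b, int *carry)
{
    unsigned lo = (a & 0x0F) + (b & 0x0F);
    unsigned hi = (a >> 4) + (b >> 4);

    if (lo > 9)
        lo += 6;              /* adjust the low digit */
    hi += lo >> 4;            /* carry out of the low nibble */
    lo &= 0x0F;

    *carry = 0;
    if (hi > 9) {
        hi += 6;              /* adjust the high digit */
        *carry = 1;
    }
    return (uint8_t)(((hi & 0x0F) << 4) | lo);
}

For example, bcd_add(0x45, 0x38, &c) gives 0x83 with c == 0, which reads as 45 + 38 = 83 in decimal digits.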

Most CPUs have the ability to tell when things may have gone wrong or need attention. For example, after most operations there is a flag that gets set if the result was 0, if the result was negative, or if the operation might have caused an overflow or underflow. I’ll talk about that more in a different post.

Most higher end CPUs perform math on several different sizes of input. For example, an x64 processor can perform math operations on 8 bit, 16 bit, 32 bit and 64 bit inputs.

Finally, one thing you see in assembly that you rarely see in high level languages is the ability to cheaply do arbitrary precision arithmetic. The way this is managed is that the CPU has a flag in it that represents binary carry. If you add two values and the result is bigger than what fits in the machine word, then the carry flag gets set. If you subtract two values and the result drops below zero, then the carry gets set to represent the borrow.

Most CPUs have a separate instruction for addition with carry and without carry (or subtraction with borrow and without borrow). Why? Because you don’t always want to add with carry; sometimes you just want to add numbers and ignore the carry. The 6502 was not one of those CPUs. It didn’t have separate add and add-with-carry instructions. Instead, it only had add with carry. This was irritating because if you ever wanted to do plain addition, you had to make sure that you cleared the carry flag first. If you forgot to do this, then your result might be off by one depending on what happened beforehand. The 6502 was the first CPU I coded for and I didn’t understand this at first, which led to some very perplexing bugs.
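
As a sketch of what that carry chaining buys you, here is 16-bit addition written the way an 8-bit CPU has to do it, rendered in C for readability; the explicit carry variable stands in for the carry flag, and the structure mirrors a CLC / ADC / ADC sequence:

#include <stdint.h>

/* 16-bit addition done 8 bits at a time. */
uint16_t add16(uint16_t a, uint16_t b)
{
    unsigned lo = (a & 0xFF) + (b & 0xFF);         /* clear carry, add the low bytes   */
    unsigned carry = lo >> 8;                      /* the carry flag after the low add */
    unsigned hi = (a >> 8) + (b >> 8) + carry;     /* add the high bytes with carry    */

    return (uint16_t)(((hi & 0xFF) << 8) | (lo & 0xFF));
}

Extending the same pattern byte by byte is how you get 24-bit, 32-bit, or arbitrarily wide arithmetic out of an 8-bit ALU.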

Why would you leave out multiplication and division? For the most part, it comes down to cost and limitations in the registers. In a typical 8-bit processor, you’re not likely to see either because your registers are typically 8 bits. When you add two 8-bit numbers, the result is at most 9 bits (8 bits plus the carry). When you multiply two 8-bit numbers, the result is at most 16 bits. Where does the result go? In the accumulator model, if you only have 1 accumulator (like the 6502), where do you put the other 8 bits? Furthermore, the 6502 implemented its math through dedicated silicon called an Arithmetic Logic Unit, or ALU. All of the math in the 6502 is 8 bits plus carry. Having two arithmetic operations that need 16 bits is suddenly expensive. We’ll cover the ALU in a later post.
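
To see why the result width is the sticking point, here is the textbook shift-and-add multiply in C. It is a sketch of the algorithm you would hand-roll on a CPU with no multiply instruction, and it shows that two 8-bit inputs genuinely need a 16-bit result:

#include <stdint.h>

/* Shift-and-add multiplication of two 8-bit values. The product can
   be up to 16 bits wide, which is exactly what a single 8-bit
   accumulator can't hold. */
uint16_t mul8(uint8_t a, uint8_t b)
{
    uint16_t product = 0;
    uint16_t addend = a;

    while (b != 0) {
        if (b & 1)
            product += addend;  /* this bit of b contributes a shifted copy of a */
        addend <<= 1;           /* the next bit is worth twice as much */
        b >>= 1;
    }
    return product;
}

It works, but it is a loop of shifts and adds per multiplication, which is part of why having multiply in silicon matters.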

Specialty Math

Some processors put in specialty math operations. For example, it’s extremely common to want to add/subtract 1 from a register or from memory. You’ll sometimes see special instructions for doing this named INCrement and DECrement. Other processors make special “fast” versions of add and subtract that will only add or subtract a small constant value.

We see that performing arithmetic in assembly language is conceptually straightforward, but in practice you need to be aware of the quirks in any given CPU to ensure that you get the result that you intended.

Learning Assembly Language: Moving Data

One of the primary classes of things you can do in assembly language is move data from one place to another. Different processors have different takes on how this works. For example, some processors use a load/store model where the two operations are completely separate. Other processors use a move model where data gets moved from one place to another.

Before we delve in, let’s get some vocabulary:

  • register – a chunk of super-fast memory built into the CPU. Most of the time, registers don’t have addresses and don’t correspond to actual memory, but are referred to by name.
  • memory – an area where data can be read or written. Memory is typically arranged in a sequence with a numerical address that refers to a location in memory.

I’m going to start with the load/store model. In the load/store model, when you want to read from memory, you load a value into a register.

LOAD A, $300

In a hypothetical computer, this instruction reads from memory location $300 and copies the value into register A. When you are reading from memory this way, it’s usually called “absolute” addressing because $300 is the actual address with no other changes. In the 6502, one of my favorite processors, this gets written a little more compactly:

LDA $0300

When this code gets assembled, it turns into this data:

AD 00 03

Here’s what happens inside the CPU:

The CPU reads AD and knows that the next two bytes are an absolute address. It reads 00 and 03 and combines them into $0300. It then sends that value out to memory along with a signal that says “READ”. A little while later, it picks up the value that comes back and copies it into A.

When you want to copy a value from a register to memory, you use a store instruction like this:

STORE A, $301

This copies the value that’s in the A register and writes it into memory at address $301. In the 6502, this gets written more compactly like this:

STA $0301

When this code gets assembled, it turns into this data:

8D 01 03

And it operates identically to LDA except that in addition to the address, it also sends out the value of A and a signal to write it into memory.

This is a very simple example. Absolute addressing is the second simplest way of reading from or writing to memory. The simplest way is called immediate. In immediate addressing, the value to be read is part of the instruction itself. For example, in 6502, if you had this:

LDA #$3F

It gets assembled as:

A9 3F

The CPU sees the A9 and knows that it needs to read the next byte from memory and copy it into register A.
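
If it helps to see the decode step as code, here is a toy C sketch of how a 6502-style CPU might handle just these two forms of LDA. The memory array and step function are my own framing, but A9 and AD are the real opcodes:

#include <stdint.h>

uint8_t  mem[65536];   /* the whole address space      */
uint16_t pc;           /* program counter              */
uint8_t  a;            /* the accumulator (register A) */

void step(void)
{
    uint8_t opcode = mem[pc++];

    switch (opcode) {
    case 0xA9:                                   /* LDA #imm */
        a = mem[pc++];                           /* the value is the next byte of the instruction */
        break;
    case 0xAD: {                                 /* LDA absolute */
        uint16_t lo = mem[pc++];                 /* low byte of the address  */
        uint16_t hi = mem[pc++];                 /* high byte of the address */
        a = mem[(uint16_t)(lo | (hi << 8))];     /* read from that address   */
        break;
    }
    }
}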

The other ways of reading from memory vary from CPU to CPU, but most of them are similar enough. Here is a list of some of them:

Indexed: LDA $300,X – in this case the address $300 is taken and then the value of register X gets added to it, then memory gets read. This is one way that you can do typical array indexing (hence the name).

Indirect: LDA ($300) – memory is read at location $300 and the value that’s there gets used as the final address to read from. If you’re thinking that this feels an awful lot like C pointer dereferencing, you’re right.

Indirect indexed: LDA ($300),Y – memory is read at location $300 and the value that’s there then gets added to Y and gets used as the final address to read from. This can be used as another form of array indexing.

Indexed indirect: LDA ($300, X) – add X to $300 and the value that’s there gets used as the final address to read from. This could be used to read from an array of pointers.

The rest of the variations that you see on there are things that make it easy to skip across memory in various increments. I won’t go into them just yet.
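
One rough way to keep these straight is to map them onto C, treating memory as one big byte array. This is an illustration of the general idea only; the 16-bit little-endian pointers are my assumption, not a description of how any specific CPU stores them:

#include <stdint.h>

uint8_t mem[65536];

/* Fetch a 16-bit little-endian pointer stored at addr. */
static uint16_t ptr_at(uint16_t addr)
{
    return (uint16_t)(mem[addr] | (mem[(uint16_t)(addr + 1)] << 8));
}

uint8_t lda_absolute(uint16_t base)                    { return mem[base]; }                            /* LDA $300     */
uint8_t lda_indexed(uint16_t base, uint8_t x)          { return mem[(uint16_t)(base + x)]; }            /* LDA $300,X   */
uint8_t lda_indirect(uint16_t base)                    { return mem[ptr_at(base)]; }                    /* LDA ($300)   */
uint8_t lda_indirect_indexed(uint16_t base, uint8_t y) { return mem[(uint16_t)(ptr_at(base) + y)]; }    /* LDA ($300),Y */
uint8_t lda_indexed_indirect(uint16_t base, uint8_t x) { return mem[ptr_at((uint16_t)(base + x))]; }    /* LDA ($300,X) */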

Next let’s look at the move model. There are typically two ways that the move model gets represented: assignment statement or from/to. Which it is depends entirely on the assembler (I’ve seen both of these for the same CPU). The assignment model looks like this:

MOVE destination, source

Whereas the from/to model looks like this:

MOVE source, destination

I’m going to use from/to. So what are source and destination? That depends on the CPU, but typically, it’s either a register or some way of addressing.

In the 680x0 processors, you can copy one register to another using a move instruction:

MOVE.B D0, D1

This moves a byte from register D0 to register D1. The move model is very convenient when the source and destination can be any addressing mode. Not all CPUs allow this, though. For any of a number of reasons, the designers of the CPU make decisions to limit the addressing. For example, some might decide that either one or both must be a register. And it’s these limitations that make writing decent assembly tricky because what you think of in C:

*p = *(q + x);

Might require a little bouncing around in registers to make happen:

MOVE.L q, A0 ; q goes into register A0
ADDA.L x, A0 ; add x to A0, giving q + x
MOVE.B (A0), D0 ; move *(q + x) into D0
MOVE.L p, A0 ; move p into A0
MOVE.B D0, (A0) ; store the value (finally) into *p

And therein lies one of the common problems with writing assembly: it’s easy to get caught up in the minutiae and lose track of your big-picture goal.

Still, we can see that by using either load/store or move, we can model an assignment statement or a binding in a functional programming language.

Learning Assembly, What Can You Do Really?

When I was in college, one of the CS professors presented a model of the Manchester Baby (it was presented as the Manchester Mark I, but that was a significantly more complicated computer), one of the earliest computers. It had 7 instructions (and honestly, it didn’t need them all).

For the purposes of this, here is some vocabulary for you:

  • register – a small chunk of memory built into the computer.
  • accumulator – a register that is typically used for arithmetic or logic. Simple computers will only have one.
  • memory – short-term storage for data, readily accessible to the processor.
  • address – a number that refers to a location in memory.

LDNEG address – read the value from memory at address, negate it, and store it in the accumulator.

STORE address – store the accumulator into memory at address.

SUB address – subtract the value in memory at address from the accumulator and leave the result in the accumulator.

SKNEG – if the result of the last operation was negative, skip the next instruction, otherwise perform the next instruction.

BR address – continue execution at address

BRREL offset – add offset to current address and continue execution there

HALT – stop execution

That’s it. And honestly, everything that a modern computer can do is really a variation on this set in some way. The cool thing is that this computer was Turing complete. You can compute any computable function on it provided that there is enough memory (spoiler: there really wasn’t). The problem is that programs end up being very wordy to do even simple things. For example, let’s say that you wanted to write a program to add three and four. It should look like this:

LDNEG Three
STORE Scratch
LDNEG Four
STORE Scratch1
LDNEG Scratch1
SUB Scratch
STORE Result
HALT
Three: 3
Four: 4
Scratch: 0
Scratch1: 0
Result: 0

What this does is read 3 from a memory location, which becomes -3 because LDNEG negates it. It puts this into Scratch. Then it reads 4 from a memory location, which becomes -4. It puts this into Scratch1, then it re-loads Scratch1 and it becomes 4. Then it subtracts -3, which is 4 - (-3) = 7, which we store in Result and halt. Ta-da!
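
The trick the program leans on is worth calling out: with only “load negated” and “subtract” available, addition has to be synthesized. Here it is as a small C rendering of the same steps, purely as an illustration of the identity a + b = a - (-b):

/* Add two numbers using only negation-on-load and subtraction,
   following the same steps as the program above. */
int add_via_subtract(int three, int four)
{
    int scratch  = -three;     /* LDNEG Three; STORE Scratch   */
    int scratch1 = -four;      /* LDNEG Four;  STORE Scratch1  */
    int acc      = -scratch1;  /* LDNEG Scratch1: back to +4   */
    return acc - scratch;      /* SUB Scratch: 4 - (-3) = 7    */
}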

And you immediately see the problem with this assembly language: it’s very chatty. It takes 2-3x the work of most other assembly languages. It does, however, break down the capabilities of a typical computer into these categories:

  • Move values from/to memory and/or registers
  • Perform (limited) arithmetic/logic
  • Change flow conditionally or unconditionally
  • Other

The last one is where HALT falls. In addition, many processors include an instruction called NOP or NOOP which stands for No Operation. It’s an instruction that does nothing except consume time. Some processors include dedicated instructions for input and output which could go into either “move values” or “other”.

And while that seems like a depressingly small number of things, that’s really it.

In the class where this was presented, the professor offered a prize of $1 to the student who could come up with the shortest (correct) program (including data) to calculate factorial. For a computer that didn’t have addition, let alone multiplication, this was a pain, but straightforward. And quite honestly, this is where most assembly language programming falls: it can be a pain, but it’s typically straightforward.

Learning Assembly Language, Introduction

One of my co-workers has a goal to learn assembly language. I thought I would write up my experiences with it.

My coworker asked me how I learned. I started with 6502 on the Apple ][. I knew BASIC, and I started learning 6502 when I was home sick from school. I sat down and wrote a program to print “HI ” in an infinite loop. It was a 3-instruction program. When I ran it, it was so incredibly fast compared to BASIC.

After that, I started writing programs with the Apple manual on my lap. Most of what I did was treat the Apple ROM routines as my API and I coded to that. Later, when I got more advanced, I wanted to do things that I saw in games, but I didn’t know how, so I disassembled the games and read their code. In addition to writing my own code, I got pretty good at reading other people’s assembly, which is something I still have to do.

As I wrote code, I built a model in my head of how it worked. It turned out to be wrong in some details, but for the most part it was close enough.

Do I still write assembly? From time to time. I have to read it frequently in my current job. For me, it’s been an indispensable tool to use when I can’t otherwise explain a particular behavior. There have also been cases where I have worked with a licensed library and found bugs in their code that I couldn’t work around. In addition to reporting the bug I could also tell them where the bug was. Finally, understanding how typical functions get implemented has been useful when I needed to write code that was especially performant without having to write assembly.

I’m Old, Part LXXXIII: Liberating Restriction

While working on Acrobat 2.0, I remember having a conversation with Gordon Dow about Lego kits. He said that he preferred the simpler bricks as he enjoyed the liberating restriction of having to work with all rectangular pieces. I love that phrase because of its inherent contradiction and because it is apt in so many contexts.

For example, I took a CS class at Oberlin in Automata Theory taught by Rhys Price Jones. Rhys covered a generalized state machine and gave us an assignment to implement code that generates the state machine code from a text description of the state machine. He wanted us to work in Scheme and narrowly defined the scope of the assignment: each state must itself be a process, where a process was really just a lambda expression that handled the input and picked another lambda expression to continue on.

At that point, Scheme was seriously on my shit list of languages and I really didn’t want to implement this assignment in Scheme. I pressed to do it in C, arguing that each state could be a function since, really, each of those lambda expressions was a function too. Rhys said no, they had to be processes.

Fine.

I still did the program in C. The top level C program read the machine description and for each state transition, wrote another C program. The main() of that program read a character from stdin, did a switch on it and forked a new process running the next program, piping in stdin and stdout. After all the state transition programs were written, the top level program compiled each in turn.

It took some juggling and debugging, but I made it work. And the whole process was fun for me, probably more fun, I suspect, than it was for the rest of the class, who were dutifully working in Scheme. I’m not saying that they weren’t enjoying the assignment, just that they weren’t giggling at the absurdity of the process.

And at this point in my career, I’m writing code that does static analysis on other code and writes more code in two different languages that can call across to each other, so I guess that was a nice little bit of prep for that.

I’m Old, Part LXXXII: Making Friends

At Atalasoft, we frequently tapped into local colleges to fill paid internships, and we ended up hiring a fair percentage of those students once they had finished their degrees. One of our first interns was Sean McKenna, who came to us from UMass Amherst. At the time he was brought on, the company’s headcount was barely in the single digits and we were crammed into an office in an old mill building in Northampton.

One day, we were having a conversation about beer pong and other drinking games. I discovered from Sean that at UMass, they called it ‘Beirut’ instead of beer pong. Sean and Dave Cilley started getting into a friendly argument over the point of drinking games in general and beer pong specifically. At one point, Sean said, “Beirut isn’t about drinking; it’s about making friends.” Dave immediately retorted, deadpan, “I don’t need friends.” That put the argument to bed. All of us doubled over in laughter except Dave, who walked away.

This event entered into our collective lore and would come up every year or so.

PDF Redaction

In January 2019, lawyers for Paul Manafort released a “redacted” PDF that allowed one to easily read the redacted text. I will not speculate as to how this could have happened beyond the chestnut, “Never ascribe to malice that which can adequately be explained by incompetence”. Instead, I will go into detail as to how difficult a problem redaction is to solve properly in PDF, beyond pilot error.

To begin, let’s look at the content elements of PDF. PDF was designed to represent all that could be represented on a printed page. There are three basic elements that can appear in a PDF: paths, bit-mapped images, and text. Paths can be either Bezier curves or rectangles (and in fact, rectangles are shorthand for paths). Paths can be stroked, filled, both, or used as a mask to clip out other elements.

These elements are represented in a postfix program within the file. One technique to display a page is to execute the program and collect all the elements into a display list which can then be rendered onto a screen or into print.

The problem here is that each of those three elements can contain something that is visually text. To be able to perfectly redact a PDF, there are two approaches. The first is to turn the page into just an image, paint over the text to be redacted, and create a new document with only the image on the page. This is a cheesy approach and it works, but there is a cost. Any of the text that was actual text is no longer text and can’t be selected or indexed. Furthermore, the final document is likely to be substantially larger than the original.

To do this in a non-cheesy way, you need to handle each of the elements. Why? Because any of the three elements could render as text that is readable to a human. For example, an image could contain text (or a face). A set of paths could be a logo or the shapes of letters. To handle text, you need to be able to iterate over all the text in a page and create an equivalent program on the page that no longer includes any text within the area to be redacted. This is a tricky problem on its own, especially when text elements span regions to be redacted. Next, for images you need to be able to decode all the possible image formats within the PDF: JPG, LZW, CCITT, RLE, JPEG2000, and JBIG. The latter two are non-trivial to decode. Then replace the image with a new one with the areas painted over. Finally, there are path items. Ideally, you would remove any paths that intersect redaction areas, but that gets tricky because changing paths that are only partially occluded is a very hard problem to do well.
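
As a tiny illustration of just the image part of that work: once an image has been decoded to raw pixels, painting over the redacted region is the easy step. Here is a minimal C sketch assuming an 8-bit grayscale buffer; the decoding and re-encoding, which are the hard parts, are not shown:

#include <stdint.h>
#include <string.h>

/* Fill a rectangle of an 8-bit grayscale image with solid black.
   The redaction rectangle is clamped to the image bounds first. */
void paint_over(uint8_t *pixels, int width, int height,
                int rx, int ry, int rw, int rh)
{
    if (rx < 0) { rw += rx; rx = 0; }
    if (ry < 0) { rh += ry; ry = 0; }
    if (rx + rw > width)  rw = width - rx;
    if (ry + rh > height) rh = height - ry;
    if (rw <= 0 || rh <= 0)
        return;

    for (int y = ry; y < ry + rh; y++)
        memset(pixels + (size_t)y * width + rx, 0x00, (size_t)rw);
}

Everything else in the list above – slicing text runs, decoding and re-encoding images, clipping paths – is where the real difficulty lives.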

In addition to these basic elements, there are others including composite objects, layers, annotations – any of which can be manipulated to appear as text or other information that should be redacted.

When I was working at Atalasoft on PDF tools, I considered the task of redaction and chose to pass on it. It lost us a sale or two, but I would rather lose a sale than get blamed for an incorrectly redacted document.

The commercial version of Adobe Acrobat has redaction tools built in and these do the job quite well. Best to depend on that for the time being.

I’m Old, Part LXXXI: Notes

When I worked for Newfire in the late 90’s, there were a number of things we did that were clearly designed to make the company more enticing for purchase. We had continuous integration, source code control, QA, marketing, etc.

In addition, there was the issue of intellectual property. We were implementing some really interesting things in terms of 3D code and a bunch of it was novel. We had patent attorneys come in and interview us and we applied for a stack of patents on our technology. In addition, my boss Marty gave every engineer a notebook with two very special criteria: they were bound and the pages were numbered. This would make it clear if the notebooks had been altered.

If we came up with any good ideas, we were encouraged to write them down with the date, and if they were particularly good, we had to put the book in front of another engineer who signed and dated the entry as a witness. Although it is possible to forge the books, this was supposed to provide a modicum of protection against intellectual property issues.

My book had a bunch of inventions in it that I really enjoyed. One was a process to turn a polygon into a minimal set of triangles, the goal being to turn text into meshes or indexed face sets. Another was a programming language built on finite-state automata for defining object behaviors in games. It was designed to be easy to read, interoperate with our game engine, and to be trivial to JIT compile.

Years later, at Atalasoft, I carried this over. I ordered a stack of notebooks and whenever we brought in a new engineer, I gave them a notebook with the same instructions I was given.

What I think I really liked about the process was seeing how each engineer used the notebooks. It was an interesting reflection on their working styles. Some were planners and very much filled their notebooks with everything. Some were journalers and just wrote down a few bullet points for the day. Some treated them like save-game slots: they wrote just enough to make it easier to pick up the next day. Some didn’t really use the notebooks at all. I was totally OK with all of those approaches.