Perils of Open Source Compilers

For my job, I work on Binding Tools for Swift, a suite of tools that performs static analysis of compiled Swift modules and generates adapters in Swift and bindings in C#. In short, I make a Swift library look like it’s written in C#.

When I started on the project, I needed the ability to reflect on the front-facing interface of a Swift module. I couldn’t do this in Swift, for two reasons. First, there was no useful reflection in Swift. Second, running reflection only gives you a view of the API from the point of view of the platform on which the reflection is being done.

Instead, I took advantage of the fact that Apple had kindly released their compiler as Open Source. I hooked into the command line arguments and wrote code that used their abstract syntax tree visitor to generate an XML document that represents the API.

There’s a problem with this, though. The compiler with reflection has to match what Apple ships. Why? Because if it doesn’t, then we can’t consume the modules, and if we use the custom compiler to compile Swift glue code, it won’t link with the original module. Fortunately, matching was easy enough to do, since there was a branch with the release compiler.

While bringing my tools up to Swift 5.3, I discovered that this had changed. The shipping compiler reports this for the version:

Apple Swift version 5.3 (swiftlang-1200.0.29.2 clang-1200.0.30.1)
Target: x86_64-apple-darwin19.6.0

When I look at the custom compiler built from the release/5.3 branch, I get this for the version:

Swift version 5.3-dev (LLVM 6811d14c28, Swift 9cfbe5490b)
Target: x86_64-apple-darwin19.6.0

They’re no longer the same and as a result, there is a problem: I can’t consume libraries compiled by the system compiler and it can’t consume libraries compiled by my version of the compiler.
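
To make the mismatch concrete, here is a minimal sketch that pulls the build identifier out of the two version banners above and compares them. The function names are mine and purely illustrative. The compiler’s real compatibility check lives inside its module loader and is not reproduced here. This only shows that the two banners disagree.

```swift
/// Extracts the text inside the parentheses on the first line of a
/// version banner, e.g. "swiftlang-1200.0.29.2 clang-1200.0.30.1".
/// Hypothetical helper for illustration only.
func buildIdentifier(fromVersionBanner banner: String) -> String? {
    // Only the first line carries the version; the second is the target.
    guard let firstLine = banner.split(separator: "\n").first,
          let open = firstLine.firstIndex(of: "("),
          let close = firstLine.lastIndex(of: ")"),
          open < close else {
        return nil
    }
    return String(firstLine[firstLine.index(after: open)..<close])
}

/// Treats two compilers as the same build only when their identifiers
/// match exactly. This rule is an assumption made for the sketch.
func sameBuild(_ a: String, _ b: String) -> Bool {
    guard let idA = buildIdentifier(fromVersionBanner: a),
          let idB = buildIdentifier(fromVersionBanner: b) else {
        return false
    }
    return idA == idB
}

let shipping = """
Apple Swift version 5.3 (swiftlang-1200.0.29.2 clang-1200.0.30.1)
Target: x86_64-apple-darwin19.6.0
"""
let custom = """
Swift version 5.3-dev (LLVM 6811d14c28, Swift 9cfbe5490b)
Target: x86_64-apple-darwin19.6.0
"""

print(sameBuild(shipping, custom)) // prints "false"
```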

I have a workaround, I think, but let’s talk about why this change happened, because it reflects a really interesting conflict of interest in Open Source, especially with compilers. To be clear, I don’t know the actual reason why, but I have a pretty good guess.

I’ll direct you to Ken Thompson’s Turing Award Lecture, “Reflections on Trusting Trust.” In the lecture he describes how you can create a Trojan Horse that can change the system compiler such that it injects a security flaw into the operating system on its next build.

Given this, it is actually pretty dangerous to be able to make a compiler that has the same signature as the system compiler. Since I have the source to the compiler, I could make one that injects security holes into compiled code without being detected. The next step is to make a Trojan Horse that delivers the modified compiler into the system. If that spreads to the systems of developers or to build servers, then there’s a big problem.

On the other side of the coin, if I release code into Open Source, shouldn’t it be possible to build a bit-identical version of the released product? And if I can’t, is this truly Open Source? Otherwise the compiler I built is really just an artist’s interpretation.

Apple clearly chose to err on the side of trust, and I get that. At the same time, I’ve been at best inconvenienced and at worst, I’m dead in the water. In a practical sense, it doesn’t matter since (1) I’m not really a customer of Apple, so it’s not their job to make me happy, and (2) their actual priorities are creating a solid compiler that efficiently generates correct code.

The fix on my end is to take advantage of the ABI stability promise and turn on the flags “-enable-library-evolution” and “-emit-module-interface”. These at least allow the two compilers to talk to each other. It’s not exactly ideal from the point of view of people who are going to use Binding Tools for Swift, but it’s not nothing.
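
As a rough sketch of what that looks like in practice, the snippet below assembles a swiftc argument list with both flags turned on. The helper name, the module name “Glue”, and the source file name are hypothetical, and a real build would likely need more options (target, SDK, library output) than shown here.

```swift
/// Builds a swiftc argument list that asks for a textual module
/// interface. Hypothetical helper; the module and file names passed
/// in below are placeholders, not names from Binding Tools for Swift.
func moduleInterfaceArguments(moduleName: String, sources: [String]) -> [String] {
    var args = [
        "-emit-module",               // produce a module for the sources
        "-enable-library-evolution",  // build in library evolution mode
        "-emit-module-interface",     // also write a .swiftinterface file
        "-module-name", moduleName,
    ]
    args += sources
    return args
}

let args = moduleInterfaceArguments(moduleName: "Glue", sources: ["Glue.swift"])

// Show the command line that would be run.
print((["swiftc"] + args).joined(separator: " "))
```

The point of the textual interface is that a different build of the compiler can read it, rather than relying on the binary module file that is tied to the compiler that produced it.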