Riddle me this, Batman: when is a Swift protocol not a protocol?
Answer: when it’s an Error.
The thing about Swift that bugs me the most is that its type system seems to be a raft of special cases where there are perfectly good alternatives. Swift exceptions are a perfect example of this.
Let’s start with a quote from the Swift documentation from Apple:
In Swift, errors are represented by values of types that conform to the Error protocol. This empty protocol indicates that a type can be used for error handling.
Spoiler: this is an outright lie. Let’s prove it.
To start, let’s look at how Swift represents a value of protocol type (in particular, this is the protocol list type):
---value--- ---value--- ---value--- -Metadata-- -PWitness0- (PWitness1)
It’s a block with 3 machine pointers set aside for the value represented by the protocol (if the value does not fit in 3 machine pointers, the block instead holds a pointer to allocated memory for the value). Then there’s a pointer to the type metadata for the value, and finally a list of pointers to protocol witness tables, one for each protocol that is represented. If Swift uses a protocol for the Error type, we should be able to test this side by side with another empty protocol and see what the compiler does with each.
public protocol EmptyProtocol { }

public enum MyError : Error, EmptyProtocol {
    case itsAnError
    case itsAHorribleError
    case itsTheTotalCollapseOfCivilization
}

public func doNothing(_ a: Error) { }
public func alsoDoNothing(_ a: EmptyProtocol) { }

let x = MyError.itsAHorribleError
alsoDoNothing(x)
doNothing(x)
I compiled this and dumped out the resulting assembly and this is what I got:
_main:
; these 3 instructions set up the stack frame
0000000100000e40 push rbp
0000000100000e41 mov rbp, rsp
0000000100000e44 sub rsp, 0x40
; rax is a pointer to stack memory for a protocol
0000000100000e48 lea rax, qword [rbp+var_28]
; this is the protocol witness table - it goes into rcx
0000000100000e4c lea rcx, qword [__TWPO5None17MyErrorS_13EmptyProtocolS_]
0000000100000e53 lea rdx, qword [__TMfO5None17MyError]
; after this instruction rdx will point to the class metadata MyError
0000000100000e5a add rdx, 0x8
; this is the value of itsAHorribleError, storing into x
0000000100000e5e mov byte [__Tv5None11xOS_7MyError], 0x1
0000000100000e65 mov qword [rbp+var_10], rdx
0000000100000e69 mov qword [rbp+var_8], rcx
; this stores x into the stack memory
0000000100000e6d mov r8b, byte [__Tv5None11xOS_7MyError]
0000000100000e74 mov byte [rbp+var_28], r8b
0000000100000e78 mov dword [rbp+var_2C], edi
0000000100000e7b mov rdi, rax
; rax points to stack memory which contains this:
;   0x0000000000000001 -> value word 0 (itsAHorribleError)
;   0x0000000000000000 -> value word 1
;   0x0000000000000000 -> value word 2
;   0x00000001000043f8 -> Metadata for MyError
;   0x0000000100003710 -> Protocol witness table for MyError.EmptyProtocol
; which is a protocol list type with 1 protocol
0000000100000e7e mov qword [rbp+var_38], rsi
0000000100000e82 call __TF5None113alsoDoNothingFPS_13EmptyProtocol_T_ ; None1.alsoDoNothing (None1.EmptyProtocol) -> ()
; now here's the set up for calling doNothing
0000000100000e87 lea rax, qword [__TMfO5None17MyError]
; after this rax will have the Metadata for MyError in rax
0000000100000e8e add rax, 0x8
; rsi will have the protocol witness table for MyError.Error
0000000100000e92 lea rsi, qword [__TWPO5None17MyErrors5ErrorS_]
0000000100000e99 xor r9d, r9d
0000000100000e9c mov edx, r9d
0000000100000e9f xor ecx, ecx
0000000100000ea1 mov rdi, rax
; now this is different...
0000000100000ea4 call imp___stubs__swift_allocError
0000000100000ea9 mov r8b, byte [__Tv5None11xOS_7MyError]
0000000100000eb0 mov byte [rdx], r8b
0000000100000eb3 mov rdi, rax
; rax/rdi point to heap memory which contains:
;   0x001d800100655179
;   0x0000000000000000
;   0x0000000000000000
;   0x0000000000000000
;   0x0000000000000000
;   0x00000001000043f8 -> Metadata for MyError
;   0x00000001000042a8 -> Protocol witness table for MyError.Error
;   0x0000000000000000
;   0x0000000000000000
;   0x0000000000000001 -> value of itsAHorribleError
0000000100000eb6 call __TF5None19doNothingFPs5Error_T_ ; None1.doNothing (Swift.Error) -> ()
So here’s the dirty little secret about Swift.Error: it acts like a protocol, but in use it is not a protocol whatsoever. It’s a heap-allocated container that has information about the encapsulated type. From digging around, it appears Apple’s rationale is that this representation lets an Objective-C NSError and a Swift Error co-exist in the same type. So if I call Objective-C code which reports an NSError, the Swift code can treat it like a throw (and vice versa).
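If you don’t feel like reading assembly, a rough way to see the same difference is to ask the compiler how big each protocol-typed value is. This is only a sketch: it assumes the layout described above, and the sizes are an implementation detail rather than anything the language promises. EmptyProtocol is the same empty protocol from the earlier example; the names word, ordinarySize and errorSize are mine.

```swift
public protocol EmptyProtocol { }

// Size of one machine pointer on this platform.
let word = MemoryLayout<UnsafeRawPointer>.size

// An ordinary protocol-typed value: 3 words of inline value buffer,
// 1 metadata pointer and 1 witness table pointer (assumed layout).
let ordinarySize = MemoryLayout<EmptyProtocol>.size

// An Error-typed value: a single pointer to the heap-allocated box.
let errorSize = MemoryLayout<Error>.size

print("EmptyProtocol value: \(ordinarySize) bytes (\(ordinarySize / word) words)")
print("Error value: \(errorSize) bytes (\(errorSize / word) words)")
```

If the layout holds, the first line should report five words and the second just one, which is the pointer handed back by swift_allocError.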
I consider this to be a mistake on Apple’s part on a number of fronts. To explain why, we need to understand Swift exceptions a little better.
If I have a function in Swift that throws an exception, in theory the function can be transformed to match Swift’s functional approach:
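// What you write: a function that can throw.
public func someFunc() throws -> SomeType { }

public enum ExceptionHolder<T> {
    case normalReturn(T)
    case exception(Error)
}

// The equivalent function, with the error folded into the return value.
public func someFunc() -> ExceptionHolder<SomeType> { }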
In this example, the second incarnation of someFunc is equivalent to the first, except that instead of throwing, we’ve transformed the result into a discriminated union of either a return value or a thrown exception. For Swift, this makes complete sense because exceptions are not exceptions in the sense of typical OOP languages like C++/Java/C#. Swift is a reference-counted language, and returning from a function abnormally wreaks absolute havoc on reference counting. The solution is that a thrown exception is actually a normal return, which allows Swift to clean up reference counts. The transformation above is more or less what Swift 2 used to do (at least I think so; I looked at it very briefly, put it aside, and haven’t looked back).
This is no longer the case. In Swift 3, Apple has deviated from the standard ABI yet again and has also broken from the functional approach of one argument, one return (see this previous blog). When a function can throw, it returns 2 values: the normal result in the standard registers (rax etc.) and an exception in r12. If there is no exception, r12 is 0 and the standard registers hold the return value. If r12 is non-zero, it contains a pointer to a heap-allocated error object.
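To make the discriminated-union form concrete, here is a small sketch that wraps a throwing function in the ExceptionHolder shape from above. ParseError, parse and parseHolder are hypothetical names I made up for illustration. This shows the source-level equivalence only and says nothing about what the compiler emits.

```swift
// Hypothetical error type for the illustration.
enum ParseError: Error {
    case notANumber(String)
}

// Same shape as the ExceptionHolder enum in the article.
enum ExceptionHolder<T> {
    case normalReturn(T)
    case exception(Error)
}

// The throwing form.
func parse(_ text: String) throws -> Int {
    guard let value = Int(text) else {
        throw ParseError.notANumber(text)
    }
    return value
}

// The transformed form: the error travels in the return value.
func parseHolder(_ text: String) -> ExceptionHolder<Int> {
    do {
        return .normalReturn(try parse(text))
    } catch {
        return .exception(error)
    }
}

// The caller switches on the union instead of using do/catch.
for input in ["42", "forty-two"] {
    switch parseHolder(input) {
    case .normalReturn(let value):
        print("parsed \(value)")
    case .exception(let error):
        print("failed: \(error)")
    }
}
```

Either way the caller has to check which of the two outcomes it got; the only question is whether that check lives in an enum tag or in a dedicated register.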
So why is Swift doing the wrong thing, in my oh-so-humble opinion? First, the calling convention breaks the ABI. Second, Apple has decided to use one format for Swift.Error, forcing all Swift code to suffer just to get interoperability with Objective-C. Finally, it’s inefficient: transformations should happen at the point of interface, not for everyone. The reason is that in Swift error handling code, the typical pattern matching ends up calling a library routine, swift_dynamicCast (bet you didn’t know that Swift even had dynamic casting like this), for every single case. The way swift_dynamicCast is written, this particular type gets forced through several cases that are guaranteed to fail and eventually ends up running through a recursive call. So much wasted time.
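For reference, this is the kind of error handling code I mean. It is a minimal sketch reusing the MyError cases from earlier; fail and describe are hypothetical helpers. Each catch clause has to get from the opaque Error box back to a concrete MyError before it can compare cases, and per the observation above that is where swift_dynamicCast comes in. I have not re-verified the generated code for this exact snippet.

```swift
enum MyError: Error {
    case itsAnError
    case itsAHorribleError
    case itsTheTotalCollapseOfCivilization
}

// Hypothetical helper that always throws.
func fail() throws {
    throw MyError.itsAHorribleError
}

// Hypothetical helper showing typical pattern matching on a thrown error.
func describe() -> String {
    do {
        try fail()
        return "no error"
    } catch MyError.itsAnError {
        // Needs the boxed value converted back to MyError first.
        return "an error"
    } catch MyError.itsAHorribleError {
        return "a horrible error"
    } catch let other as MyError {
        // An explicit cast from Error to the concrete type.
        return "some other MyError: \(other)"
    } catch {
        return "unknown error"
    }
}

print(describe())
```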
Now, the Swift ABI is not finalized yet and this may yet change, but jeez.
This doesn’t even touch on the problem that Swift exceptions are (still) fundamentally broken from a language point of view, but that’s a topic for another day.
So we see that Swift exceptions at first glance appear to be a protocol that acts similarly to exceptions in other languages, but in implementation they are a misfeature that breaks the language model and the ABI and hurts efficiency.