Functional Image Processing, Are You F#ing Kidding Me?

I’ve approached 3 or 4 Functional Programming conferences with a talk entitled, “Functional Image Processing, Are You F#ing Kidding Me?” I haven’t gotten a nod on that (yet).

The story behind it had to do with work I did at Atalasoft. We had an image processing library for .NET that was written largely in C++ with a friendly veneer in C#. This was to get performance. At one point, I spent a couple weeks doing measurements to find out what the cost would be to work completely in a managed language. My results came down to the magic number 1.5. Typically, image processing code written in C# ran 1.5 times slower than the equivalent C++, which is honestly pretty dang good, but we didn’t have a lot of incentive to do that.

At one point, it looked like we needed to, though. A lot of our business came from people who were doing server side image processing and we had customers who wanted to use Managed hosting, which meant no C++. Around the same time, Microsoft released Silverlight, which was a version of .NET that ran in the browser and required an even more limited set of .NET and certainly no C++.

In deciding how we were going to do this, Rick Minerich was getting heavily into F# and he suggested we look at that. I spent a week learning F# and then started doing some experiments. Rick seemed pretty surprised at how quickly I picked it up, but I had done some work in college in Miranda so this type of coding wasn’t precisely new to me, even though the syntax was brand new. The first thing I did was a port of an image processing tool we had called Dynamic Threshold. This was a tool that turned gray images into black and white and had a decent amount of smarts to give better results than the industry standard Adaptive Threshold. I chose this command specifically because it was the worst choice for a functional implementation since it had side-effects out the wazoo and was the antithesis of functional style. Still, with some effort, I got a functional version of it working that was a pixel-perfect match to the C++ code and it ran about 1.5 times slower.

If the worst possible command could work out, then I was sure that this would work with most of the typical image processing code. I threw out the F# implementation and redid it in C# instead, but then I went ahead with the rest of the library.

Since we already had a bunch of code that was just in C#, this created an interesting problem. We needed a hybrid assembly built by merging an F# assembly and a C# assembly. This involved some unusual build steps. One of my goals was that our resulting API should be 100% identical to our existing one. There was a problem, though. In a lot of my code, I like to use something akin to the GOF Template Method design pattern by building out a set of protected methods that implement reasonable default behaviors for each step in a process. Because they’re protected, an end user can sub-class the class and customize behaviors. For many things, this is a powerful tool. Unfortunately, F# (at the time) had a compiler bug where methods declared protected ended up public, which would give us an API that didn’t match and also violated the encapsulation. So for that, I did something truly grotesque. I put a .NET attribute on each method that was to be protected and wrote a tool that use the Mono tool Cecil to tear apart the compiled assembly and find all methods marked with the attribute and make them protected and then rebuild the assembly. Surprisingly, it worked quite well.

What really came in handy in the process was that F# has a proper way to declare functions as inline, which allowed me to make code read well, but still performant. This was used in code to set/get a 1-bit pixel from a scan line, for example or set/clear a segment of pixels in a scan line. I really wanted macros for a lot of my work, but that just wasn’t there. I did, however, make heavy use of partial function application. One issue we had was that we supported 12 different pixel formats and many commands needed to operate on each of them. It was especially bad when you had to combine two images in two different pixel formats because that represented 144 cases to handle. What I could do was put the general work in a nice little tail-recursive function with a function to apply on each scan line. The function was inevitably the result of an F# match statement that used partial application to build a function that had the signature I needed.

When all was said and done, most of the code ran in the 1.5 x slower range, but surprisingly, there was a bunch of code that ran faster than the C++. In those cases, there was inline code that the F# compiler could optimize the hell out of in a way that the C++ compiler could not. This is one of the big secrets of functional programming is that with proper attention to side-effect free code, much of the computational work can be done ahead of time to throw away cases that can’t possibly happen.

One of the big problems in doing this work was handling the problem of “how do I know I did this right?” Fortunately, a few years earlier I put in a suite of anti-unit tests. We had more than a hundred objects for performing various image processing tasks and each could work on all or a subset of the 12 pixel formats. I saw this as a pain point and wrote a suite of tests that ran every one of these tasks on every possible combination of pixel formats and checked to see that the results had not changed since the last time the test ran. These were the “Image Stability Tests” and I could leverage them to ensure that I was doing my job correctly. In the end, nearly all the tests ran unchanged from the C++ implementation to the F# equivalent. There were some exceptions that had to do with junk data in padding on the end of scan lines or differences in floating point rounding and I also found some bugs in the C++ or the C# wrapping code that had been undiscovered for years.

Of all that work, I think the code that I enjoyed doing the most was the Octree Color Quantization. This was particularly fun because the C++ code was incredibly grody for the whole purpose of being memory stingy and fast. I was able to undo a lot of that filth by using F# discriminated unions to represent the nodes and leaves in a tree and some nice tail-recursive code to handle the construction and reduction of the tree. It was no small feat, but in the end, it worked very well.

Not all of the code was pretty. There were times when I chose to play fast and loose with arrays passed for the sake of mutating state. There were tail-recursive functions that had entirely too many arguments and I shrugged hoping that the compiler would magically fix all that under the hood, but in the end it worked quite well.

We had two products that fell out of this – DotImage Pure (which would run in a ‘pure’ managed environment) and DotImage Sterling (which would run in a Silverlight environment).

Eventually, when we produced JoltImage, a version of DotImage for Java, I did the work by porting the F# to Java. I entertained the notion of working in Scala, but passed. But all of that is a story for another day.

In the end, there were several things that made this project work: an understanding of how to use the tools of a modern functional programming language; the ability to choose between doing something in F# or C#; a robust set of tests; and a solid expectation of the performance a priori to allow me to attend to performance hot spots as needed.


I’m Old, Part LX: The Humble Mouse

When I was at Adobe, they started me off with a Sun workstation for writing code. It was OK. It ran the compilers that I needed and was good enough for my day-to-day work. Most of the engineers in the department had similar machines. It came with Sun’s optical mouse that required a special metal mouse pad with a red and blue plaid pattern on it. This was a crappy misfeature. The mouse felt soupy and if a hair got into the optical sensor (which happened every couple of days), the mouse tracking went wonky.

In my second year, Computer Santa Claus came by and brought me a brand-spanking new DECStation 3100, which was a shit ton faster than the Sun and the monitor was in color and it had a better mouse. Well, in some respects. It had a unique mechanism which had two wheels, one canted around the X axis, the other canted around the Y axis. Unlike a ball mouse, no part of the rollers which come in contact with your work surface ever enters into the mouse itself. Unlike mechanical mice with balls, it rarely needs cleaning. Unfortunately, the form factor is terrible. It’s too wide and too low. A soap bar shape would have been much better.

A few months in, I was getting wrist pain, so I figured I would try to find a replacement for it. DEC didn’t supply the pin outs on the mouse connector. But hey – it’s a Hawley mouse – I know that name! I found that the mouse had been designed by Jack Hawley, who had also worked on the Xerox Alto and lived in Berkeley. I called his company, The Mouse House, and left a message. He called me back and we had a nice long chat about it. He didn’t have the pin outs either and suspected that it wouldn’t help if I did. The form factor and the communications were all DEC’s doing, the mechanism was his work. He made some suggestions on modifying the mouse itself.

We got to talking about family history and we couldn’t find any common relatives going back a few generations. Oh well. He asked me if I knew the Hawley coat of arms. My grandparents had a copy of one, but I never thought it was correct. He said he had a very old heraldry book and because the Hawley coat of arms in it was so simple, he suspected that it was the correct one. He asked for the company fax number so he could send me copies of the pages.

The cover sheet had his business card which was done up in a gilded age style with the words: “Jack Hawley – Berkeley’s Great, Though Humble, Inventor” Yes, he is definitely a Hawley. The pages that followed include the title page of the book and the page with the Hawley coat of arms. The book was over 300 years old! I couldn’t believe he put a 300 year old book onto a copier!

He was quite an interesting fellow and although he couldn’t help me directly, it was nice to talk to him and find out a little more about my family name.

I’m Old, Part LIX: It’s OK to Explore

Much in software engineering is based on what you make. What is your product? What did you ship? This is flawed thinking. Sometimes exploration is it’s own reward and the dividends get paid much later.

When I was in 9th grade, I convinced my dad to buy me a toolkit for the Apple II called Bill Budge’s 3D Graphics System. The manual is here. For the time and the capabilities of the Apple, this was an astounding piece of software. Budge provided an editor for 3D shapes, a library for displaying them, and interfacing for the library in Integer Basic, Applesoft Basic, and assembly language. Neat!

I had an idea for a game called virus. It involved a virus that lands on a cell wall and you controlled a turret that shot antibodies at the virus. If you broke down the virus in time,  no problem. If you didn’t the top would crack open and create new viruses. I never finished it. And that was probably for the best. The game concept was missing a lot. It was not particularly a playable idea, although I imagine that it might work as a framework for a game in the tower defense genre.

I was shaken that I couldn’t finish this game and that I couldn’t think of another game that fit in this genre. I was also frustrated at the performance in Basic and I had a rough time getting anything solid in assembly language. Really, I was overly hard on myself.

What I didn’t know at the time was how my experience was going to affect me later. 4 years later in college, I created a 3D graphics application in MacPascal that ran on the original Mac. It turned out to be a stretch for the system. I discovered bugs and limitations in MacPascal, I learned about issues with clipping and problems with divide by zero in perspective transformations. Still, it worked.

While I didn’t know it at the time, Budge’s library was based on a display list – this is a cool concept that allows either a self-drawing entity or a structure that can be ripped with a fairly simple state machine. I’ve used display lists in a number of projects. DotPDF uses display lists internally to represent page contents.

So yes, in my youth, it was easy to give myself grief for not finishing a little game, but that was only because I didn’t know the true value of the learning experience that I would carry forward. So I don’t dwell long on abandoned projects, but I do celebrate what I have learned.