Functional Image Processing, Are You F#ing Kidding Me?

I’ve approached 3 or 4 Functional Programming conferences with a talk entitled, “Functional Image Processing, Are You F#ing Kidding Me?” I haven’t gotten a nod on that (yet).

The story behind it had to do with work I did at Atalasoft. We had an image processing library for .NET that was written largely in C++ with a friendly veneer in C#. This was to get performance. At one point, I spent a couple weeks doing measurements to find out what the cost would be to work completely in a managed language. My results came down to the magic number 1.5. Typically, image processing code written in C# ran 1.5 times slower than the equivalent C++, which is honestly pretty dang good, but we didn’t have a lot of incentive to do that.

At one point, it looked like we needed to, though. A lot of our business came from people who were doing server side image processing and we had customers who wanted to use Managed hosting, which meant no C++. Around the same time, Microsoft released Silverlight, which was a version of .NET that ran in the browser and required an even more limited set of .NET and certainly no C++.

In deciding how we were going to do this, Rick Minerich was getting heavily into F# and he suggested we look at that. I spent a week learning F# and then started doing some experiments. Rick seemed pretty surprised at how quickly I picked it up, but I had done some work in college in Miranda so this type of coding wasn’t precisely new to me, even though the syntax was brand new. The first thing I did was a port of an image processing tool we had called Dynamic Threshold. This was a tool that turned gray images into black and white and had a decent amount of smarts to give better results than the industry standard Adaptive Threshold. I chose this command specifically because it was the worst choice for a functional implementation since it had side-effects out the wazoo and was the antithesis of functional style. Still, with some effort, I got a functional version of it working that was a pixel-perfect match to the C++ code and it ran about 1.5 times slower.

If the worst possible command could work out, then I was sure that this would work with most of the typical image processing code. I threw out the F# implementation and redid it in C# instead, but then I went ahead with the rest of the library.

Since we already had a bunch of code that was just in C#, this created an interesting problem. We needed a hybrid assembly built by merging an F# assembly and a C# assembly. This involved some unusual build steps. One of my goals was that our resulting API should be 100% identical to our existing one. There was a problem, though. In a lot of my code, I like to use something akin to the GOF Template Method design pattern by building out a set of protected methods that implement reasonable default behaviors for each step in a process. Because they’re protected, an end user can sub-class the class and customize behaviors. For many things, this is a powerful tool. Unfortunately, F# (at the time) had a compiler bug where methods declared protected ended up public, which would give us an API that didn’t match and also violated the encapsulation. So for that, I did something truly grotesque. I put a .NET attribute on each method that was to be protected and wrote a tool that use the Mono tool Cecil to tear apart the compiled assembly and find all methods marked with the attribute and make them protected and then rebuild the assembly. Surprisingly, it worked quite well.

What really came in handy in the process was that F# has a proper way to declare functions as inline, which allowed me to make code read well, but still performant. This was used in code to set/get a 1-bit pixel from a scan line, for example or set/clear a segment of pixels in a scan line. I really wanted macros for a lot of my work, but that just wasn’t there. I did, however, make heavy use of partial function application. One issue we had was that we supported 12 different pixel formats and many commands needed to operate on each of them. It was especially bad when you had to combine two images in two different pixel formats because that represented 144 cases to handle. What I could do was put the general work in a nice little tail-recursive function with a function to apply on each scan line. The function was inevitably the result of an F# match statement that used partial application to build a function that had the signature I needed.

When all was said and done, most of the code ran in the 1.5 x slower range, but surprisingly, there was a bunch of code that ran faster than the C++. In those cases, there was inline code that the F# compiler could optimize the hell out of in a way that the C++ compiler could not. This is one of the big secrets of functional programming is that with proper attention to side-effect free code, much of the computational work can be done ahead of time to throw away cases that can’t possibly happen.

One of the big problems in doing this work was handling the problem of “how do I know I did this right?” Fortunately, a few years earlier I put in a suite of anti-unit tests. We had more than a hundred objects for performing various image processing tasks and each could work on all or a subset of the 12 pixel formats. I saw this as a pain point and wrote a suite of tests that ran every one of these tasks on every possible combination of pixel formats and checked to see that the results had not changed since the last time the test ran. These were the “Image Stability Tests” and I could leverage them to ensure that I was doing my job correctly. In the end, nearly all the tests ran unchanged from the C++ implementation to the F# equivalent. There were some exceptions that had to do with junk data in padding on the end of scan lines or differences in floating point rounding and I also found some bugs in the C++ or the C# wrapping code that had been undiscovered for years.

Of all that work, I think the code that I enjoyed doing the most was the Octree Color Quantization. This was particularly fun because the C++ code was incredibly grody for the whole purpose of being memory stingy and fast. I was able to undo a lot of that filth by using F# discriminated unions to represent the nodes and leaves in a tree and some nice tail-recursive code to handle the construction and reduction of the tree. It was no small feat, but in the end, it worked very well.

Not all of the code was pretty. There were times when I chose to play fast and loose with arrays passed for the sake of mutating state. There were tail-recursive functions that had entirely too many arguments and I shrugged hoping that the compiler would magically fix all that under the hood, but in the end it worked quite well.

We had two products that fell out of this – DotImage Pure (which would run in a ‘pure’ managed environment) and DotImage Sterling (which would run in a Silverlight environment).

Eventually, when we produced JoltImage, a version of DotImage for Java, I did the work by porting the F# to Java. I entertained the notion of working in Scala, but passed. But all of that is a story for another day.

In the end, there were several things that made this project work: an understanding of how to use the tools of a modern functional programming language; the ability to choose between doing something in F# or C#; a robust set of tests; and a solid expectation of the performance a priori to allow me to attend to performance hot spots as needed.


Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.