I’m Old, Part LVIII: The Evolution of DotPdf

When I started at Atalasoft, there were a number of features that were in broad use but needed love and attention to grow. We had a small PdfEncoder class written by Glenn. It could take a single image and turn it into a single-page PDF. It had some very simple options that let you set the page size, center the image on the page, fill the entire page with the image, and so on.

It had so many places where it needed to grow. Since I had worked for Adobe on Acrobat, I was selected to implement these features. Fine. Time to dust off a copy of the PDF spec and get the knowledge out of my long-term swap space. I looked at the code that implemented the PdfEncoder and I was impralled. This is a combination of impressed and appalled. Glenn had done a fine job matching the spec, but the code was essentially just a set of print calls to the output stream, with some code to grab the current output stream position to later build the cross reference table. It did what it needed to do in a terse, prosaic way, but it was fragile as hell. PDF is a sort of object-oriented file format that depends very heavily on the absolute location of objects in the file. The definition of the page is an object. The contents of the page is an object. The tree that refers to all the pages is an object. Each time Glenn started to write an object, he would store the object number in a private member variable so that he could point to it when he wrote the cross reference table. Every time a new feature was added that required more objects, the code just got worse and worse. I think I did one or two features this way, then I asked Bill for some time to do this “right”.
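
To make the fragility concrete, here is a rough sketch (mine, not Glenn’s actual code) of what that style of encoder looks like: every object is written with raw print calls, and the byte offset of each one has to be captured by hand so the cross reference table can be built at the end.

```csharp
// Hypothetical sketch of the "print calls plus remembered offsets" style.
// Every new feature means more hand-tracked object numbers and offsets.
using System.Collections.Generic;
using System.IO;
using System.Text;

class TinyPdfWriter
{
    readonly StreamWriter _w;
    readonly List<long> _offsets = new List<long>();

    public TinyPdfWriter(Stream s)
    {
        _w = new StreamWriter(s, Encoding.ASCII);
    }

    int BeginObject()
    {
        _w.Flush();                              // offsets must be exact, so flush buffered text first
        _offsets.Add(_w.BaseStream.Position);    // remember where this object starts for the xref table
        int objectNumber = _offsets.Count;       // PDF object numbers are 1-based
        _w.Write($"{objectNumber} 0 obj\n");
        return objectNumber;
    }

    void EndObject() => _w.Write("endobj\n");

    // Every object that refers to another one needs that object's number threaded
    // through by hand, which is exactly what made this style of code so fragile.
    public void WritePage(int parentPagesObject, int contentsObject)
    {
        BeginObject();
        _w.Write($"<< /Type /Page /Parent {parentPagesObject} 0 R /Contents {contentsObject} 0 R >>\n");
        EndObject();
    }
}
```

One object written out of order, or one forgotten flush, and the cross reference table points into the weeds.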

I started creating a set of objects that represented PDF entities. I did this in two primary ways. The first was to create an interface, IPdfStreamable, which, if an object implemented it, let that object write itself to a stream as PDF. Then, since C# doesn’t let you apply interfaces to existing types (see extension protocols in Swift), I wrote a general purpose Write method that took an object and a Stream. If the object was IPdfStreamable, it went to that. If it was one of the built-in types (string, double, float, int, bool, etc.), it would vector to methods that streamed those types. I had code to automatically handle the PDF way of referencing objects. You let the cross reference table know that an object needed to be a reference and it would automatically be written as a reference and not inline.
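
Here is a sketch of that dispatching writer. IPdfStreamable is the interface named above; the method signatures and the PdfWriter helper are my guesses for illustration, not the shipping Atalasoft API.

```csharp
// Sketch only: a single general-purpose Write that either lets an object stream
// itself or vectors to a handler for the built-in types.
using System;
using System.Globalization;
using System.IO;
using System.Text;

public interface IPdfStreamable
{
    void WriteToStream(Stream stm);
}

public static class PdfWriter
{
    public static void Put(Stream stm, string s)
    {
        byte[] b = Encoding.ASCII.GetBytes(s);
        stm.Write(b, 0, b.Length);
    }

    public static void Write(Stream stm, object value)
    {
        switch (value)
        {
            case IPdfStreamable streamable: streamable.WriteToStream(stm); break;
            case null: Put(stm, "null"); break;
            case bool b: Put(stm, b ? "true" : "false"); break;
            case int i: Put(stm, i.ToString(CultureInfo.InvariantCulture)); break;
            case float f: Put(stm, f.ToString("0.####", CultureInfo.InvariantCulture)); break;
            case double d: Put(stm, d.ToString("0.####", CultureInfo.InvariantCulture)); break;
            case string s: Put(stm, "(" + s + ")"); break;   // real code must escape ( ) and \
            default: throw new ArgumentException($"Don't know how to stream {value.GetType()}");
        }
    }
}
```

The nice property is that everything, from a lone bool to an entire page dictionary, goes through the same Write call.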

I made a base class to represent the PDF dictionary type and I did something super sneaky: I made it mirror C# classes. I put an attribute on every object property which signaled that the property was part of the PDF specification. The attribute encapsulated information from the spec, including things like what version of PDF it was part of, whether or not it was required, what its default value was (if any), and whether it had to be a reference to an object. Then I implemented IPdfStreamable on the base class and put in code to handle most of the reasonable defaults. Now I could practically type the PDF spec in as C# code and it was automatically written for me. Sweet.
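
A hypothetical sketch of that idea, building on the IPdfStreamable/PdfWriter sketch above: an attribute carries the spec metadata for each key, and the dictionary base class reflects over its own properties to write itself. The names and members here are illustrative, not the real library.

```csharp
// Sketch: "type the spec in as C#" via a metadata attribute and reflection.
using System;
using System.IO;
using System.Reflection;

[AttributeUsage(AttributeTargets.Property)]
public class PdfKeyAttribute : Attribute
{
    public string Name { get; set; }              // key name as it appears in the spec
    public string MinimumVersion { get; set; }    // e.g. "1.4"
    public bool Required { get; set; }
    public bool MustBeReference { get; set; }     // write as "n 0 R" instead of inline
    public object DefaultValue { get; set; }
}

public abstract class PdfDictionary : IPdfStreamable
{
    public void WriteToStream(Stream stm)
    {
        PdfWriter.Put(stm, "<< ");
        foreach (PropertyInfo prop in GetType().GetProperties())
        {
            var key = prop.GetCustomAttribute<PdfKeyAttribute>();
            if (key == null) continue;                    // not part of the PDF spec
            object value = prop.GetValue(this);
            if (value == null)
            {
                if (key.Required) throw new InvalidOperationException($"/{key.Name} is required");
                continue;
            }
            if (Equals(value, key.DefaultValue))
                continue;                                 // values at their default don't need writing
            PdfWriter.Put(stm, $"/{key.Name} ");
            PdfWriter.Write(stm, value);                  // reference handling would hook in here
            PdfWriter.Put(stm, " ");
        }
        PdfWriter.Put(stm, ">>");
    }
}

// With that in place, a dictionary type reads almost like the table in the spec:
public class PdfPageSketch : PdfDictionary
{
    [PdfKey(Name = "Rotate", DefaultValue = 0, MinimumVersion = "1.0")]
    public int Rotate { get; set; }

    [PdfKey(Name = "UserUnit", DefaultValue = 1.0, MinimumVersion = "1.6")]
    public double UserUnit { get; set; } = 1.0;
}
```

Adding a new dictionary type then really is close to transcribing the spec.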

After 3 weeks of this, I rewrote the existing PdfEncoder in terms of the new objects. Adding new features was way easier. Encode multiple images into a PDF? Easy. Support for more esoteric image types? Done. Full support for document metadata? Got it. ICC color profiles? Tricky, but doable. Pretty soon we had some really nice example code – for example, taking a directory full of images and turning it into a single PDF was now trivial. Or how about reading directly from a scanner to PDF? Not that much different, to tell the truth.
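
For flavor, the directory-to-PDF case looked something like this; PdfEncoder is the real class name, but the overload and parameters below are stand-ins, not the documented API.

```csharp
// Hypothetical sketch only: assumes an overload that writes one page per input file.
using System.IO;

static void FolderToPdf(string folder, string outputPath)
{
    var encoder = new PdfEncoder();                    // real class; settings omitted
    using (Stream outStream = File.Create(outputPath))
    {
        encoder.Save(outStream, Directory.GetFiles(folder, "*.tif"), null);   // assumed overload
    }
}
```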

The OCR engine we worked with had an option for PDF output. It worked, but it looked cruddy. Part of the issue was that the PDF encoding didn’t happen until late in the process, so the image was always 1 bit per pixel no matter what it was scanned at. And they charged a lot for it.

I wrote a new PDF output module for our veneer over the OCR engine, and since we owned the image pipeline, I could make sure that the output image in the PDF matched the quality that the end user wanted. And it was all based on the same code used for the PdfEncoder.

Over the years, I grew the functionality of the underlying code to include consuming PDF. My initial cut of public functionality was an object that represented a PDF document in a very coarse sense. You got a collection of flyweight pages, document metadata and not much else. But you could rip pages out of the document and put them into another document. You could combine multiple documents. You could reorder pages. These were all things that our customers wanted and we made them easy to use, easy to understand and performant. It was hell to get right. When you executed the “save” method, the work under the hood was akin to picking up a huge fishing net at one knot, clipping some sections, sewing in others and setting it down somewhere else perfectly folded. There were bugs. There were problems with crappy PDF files from well-intentioned people. But it worked pretty damn well. I think that release also included support for PDF/A (archival PDF) thanks to Rick Minerich.
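
In use, the coarse model felt roughly like the sketch below. The class and member names are illustrative stand-ins, not the shipping DotPdf API.

```csharp
// Hypothetical sketch of the coarse document model: flyweight pages you can rip out,
// reorder, and stitch into another document.
static void CombineAndReorder(string pathA, string pathB, string outputPath)
{
    var docA = new PdfDocument(pathA);     // assumed: opens the file, pages stay flyweight
    var docB = new PdfDocument(pathB);

    // Rip the pages out of B and append them to A.
    foreach (var page in docB.Pages)
        docA.Pages.Add(page);

    // Reordering is just list manipulation at this level.
    docA.Pages.Reverse();

    // Save is where the "fishing net" work happens under the hood.
    docA.Save(outputPath);
}
```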

Our customers were pleased with this. How do I know? Because they kept asking for more features. I prioritized them and added them in. Because I had meta information about the document structure from the PDF spec, I put in code that could find, classify, and automatically repair shit PDF.

All this time, I hid my library from our users. We could have made it all public, which is what iText does, but when you do that you require your customers to understand the PDF spec, and trust me when I say that that is always a mistake. The burden on support alone would have been egregious.

As the toolset grew, I added annotation support, and we now had hooks into our DotAnnotate product. We could read and write annotations to and from PDF files and let users view, create, and edit them. Nice. I put a veneer over that code onto the PDF document object as well.

One thing missing from the picture was generating and editing PDF. This is not a small task, because now we had to give our customers a model of PDF that they could work with while still not knowing the PDF spec. I put cushy abstractions onto the core types and created a means of making documents from whole cloth with just about all the PDF elements available. I created a shape abstraction and a bunch of default shapes to make it easy to plop down rectangles, ellipses, Bezier curves and so on. Kevin Hulse wrote text objects onto these abstractions, including a text formatting engine and code that could run text along an arbitrary path.
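
The flavor of that generation layer was something like this; every name in the sketch is illustrative, not the actual API.

```csharp
// Hypothetical sketch: a page, a couple of shapes dropped onto it, no PDF operators in sight.
static void MakeSampleCard(string outputPath)
{
    var doc = new GeneratedPdfDocument();
    var page = doc.AddPage(width: 252, height: 144);   // points: 3.5in x 2in

    page.Shapes.Add(new RectangleShape(x: 6, y: 6, width: 240, height: 132)
    {
        StrokeColor = PdfColor.Black,
        StrokeWidth = 1.5
    });
    page.Shapes.Add(new EllipseShape(x: 20, y: 40, width: 64, height: 64)
    {
        FillColor = PdfColor.FromRgb(0.9, 0.3, 0.2)
    });

    doc.Save(outputPath);
}
```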

To check on performance, I wrote some code that took the text from Moby Dick from Project Gutenberg and formatted it into a book complete with drop caps at the chapters and page numbers. The code was not complicated and it formatted and wrote the book in about a second, if I recall correctly.

I put in better support for annotations, including custom appearances and actions. For grins, I wrote some code that created a document with a button annotation labeled “Pull My Finger” that played a fart noise when you clicked it. Probably the best use for PDF ever. I sent the document to Elaine and waited for the laughter across the office. I also wrote some code that took an image and created a colored rectangle annotation on the page for every pixel. It was tens of thousands of annotations. It took a few seconds to write and my code could round trip it in similar time. Acrobat took minutes to open the file.

I made it so it was possible to round trip documents created by this code, so now we had limited editing. If a page had been made by my tool, I could recreate the nice, fluffy objects for you. I started putting in place the infrastructure to make all PDF editable, and started exposing that through some internal tools that could now do text extraction. I think it took me a few weeks to do. The original Wordy algorithm by Daryoush Paknad took significantly longer to create, but he was inventing it and he had to work in C. I stood on his shoulders and did the work in C#. For grins, I built the code using a page quadtree subdivision algorithm that had automatic annealing of split words based on font position and similarity. And again, it ran very well.
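
A rough sketch of what that quadtree-plus-annealing approach can look like, under the assumption that the content stream has already been reduced to word fragments with bounding boxes and font names (none of these types are the real internals): subdivide the page for locality, emit each leaf in reading order, then merge fragments that sit on the same baseline, nearly touch, and share a font.

```csharp
// Illustrative sketch, not the shipping code.
using System;
using System.Collections.Generic;
using System.Drawing;   // RectangleF/PointF; any rectangle type would do

public record WordFragment(string Text, RectangleF Box, string Font);

public static class QuadtreeTextSketch
{
    public static List<WordFragment> Extract(RectangleF page, List<WordFragment> fragments)
    {
        var ordered = new List<WordFragment>();
        Subdivide(page, fragments, ordered);
        return Anneal(ordered);
    }

    static void Subdivide(RectangleF region, List<WordFragment> fragments, List<WordFragment> output)
    {
        // Assign each fragment to the quadrant containing its center so nothing is emitted twice.
        var local = fragments.FindAll(f => region.Contains(Center(f.Box)));
        if (local.Count <= 8 || region.Width < 36)
        {
            // Leaf: emit in top-to-bottom, left-to-right order.
            local.Sort((a, b) => a.Box.Top != b.Box.Top
                ? a.Box.Top.CompareTo(b.Box.Top)
                : a.Box.Left.CompareTo(b.Box.Left));
            output.AddRange(local);
            return;
        }
        float hw = region.Width / 2, hh = region.Height / 2;
        Subdivide(new RectangleF(region.Left,      region.Top,      hw, hh), local, output);
        Subdivide(new RectangleF(region.Left + hw, region.Top,      hw, hh), local, output);
        Subdivide(new RectangleF(region.Left,      region.Top + hh, hw, hh), local, output);
        Subdivide(new RectangleF(region.Left + hw, region.Top + hh, hw, hh), local, output);
    }

    static List<WordFragment> Anneal(List<WordFragment> words)
    {
        var merged = new List<WordFragment>();
        foreach (var w in words)
        {
            WordFragment prev = merged.Count > 0 ? merged[merged.Count - 1] : null;
            bool sameBaseline = prev != null && Math.Abs(prev.Box.Bottom - w.Box.Bottom) < 1.0f;
            bool adjacent = prev != null && w.Box.Left >= prev.Box.Left
                                         && w.Box.Left - prev.Box.Right < 1.5f;
            if (prev != null && sameBaseline && adjacent && prev.Font == w.Font)
                merged[merged.Count - 1] = new WordFragment(prev.Text + w.Text,
                                                            RectangleF.Union(prev.Box, w.Box), w.Font);
            else
                merged.Add(w);
        }
        return merged;
    }

    static PointF Center(RectangleF r) => new PointF(r.Left + r.Width / 2, r.Top + r.Height / 2);
}
```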

One internal demo I did was to write an interactive drawing app. It gave you a view of a page and let you sketch out shapes on the page and then write them out. And I wrote it in the worst possible way: when you started to draw a shape, it would translate the mouse movements into PDF, write a PDF document, then use our PDF renderer to draw it in the window. It could do this and the UI felt perfectly fluid until you got about a dozen objects on the page, at which point it started to feel mushy. Again, I was impralled.

The last bit of technology that went into the code was digital signatures. I had put off that bit of code for a very long time because I was trying to take my own advice to never write time management code, never get in a land war in Asia, and never write digital signature code.

Honestly, it was a hard decision to walk away from this product. It was reasonably easy to use; at the high level it was nearly impossible to write a bad PDF (and there was some preflight code that let you catch errors); it wasn’t particularly huge; and it ran well even though it was all in C#.

In many ways, it felt like the Master Control Program from Tron, but you know, without all the evil overlord stuff. DotPdf started out as a small tool to make my life easier, but grew and invaded the rest of our product line. At the end of my tenure, there were 4 or 5 products that all had significant features rooted in DotPdf or its underlying library. I will always be proud of that code.

I’m Old, Part LVII: API Chess

When I started at Atalasoft, I had some initial tasks that were good for getting me used to C# and to their code base. For the latter, I went over the code to get a sense of the organization. Much of it was pretty good, but some of it needed help. The problem with making sweeping changes is that even if it’s for the best, your established customers will suffer. In order to make sure that this pain was minimized, I set up a couple of strategies to prepare our customers for the changes.

The first approach is the honey pot: make something irresistible compared to what is already there. For example, there was a class called ImageCollection, which is exactly what it sounds like. The problem with it was that it required every image in the collection to be in memory at the same time. A naive developer could exhaust the memory on the target system very quickly. My first strategy was to offer something much more tasty than ImageCollection. ImageCollection was very simple, which is hard to compete with. I created an abstract class called ImageSource that represented a theoretically infinite generator of images. I tried to make it as easy as possible to use, modeling it off UNIX pipes but with an added layer of Acquire/Release semantics for images. I built a more specific version called RandomAccessImageSource for circumstances when you knew how many images you had a priori and could load any arbitrary image at any time. From there I built an even more specific version called FileSystemImageSource, which let you point at a directory of files and it would give you all the images. This is a very common task and I had made it trivial compared to doing it with ImageCollection. I encouraged Glenn and Dave to use this and to shed ImageCollection in their tools.
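
From memory, the shape of ImageSource was roughly the following; the member names are approximations for illustration, not the documented signatures. AtalaImage is the library’s image class.

```csharp
// Sketch of the pull model: Acquire/Release means only the image being worked on
// needs to be in memory.
using System;

public abstract class ImageSourceSketch
{
    public abstract bool HasMoreImages();
    public abstract AtalaImage AcquireNext();       // hand the caller the next image
    public abstract void Release(AtalaImage image); // the caller gives it back when done

    // Typical consumption loop: works the same whether the source is a directory of
    // files, a multi-page TIFF, or a scanner feeding pages one at a time.
    public void ForEach(Action<AtalaImage> body)
    {
        while (HasMoreImages())
        {
            AtalaImage image = AcquireNext();
            try { body(image); }
            finally { Release(image); }
        }
    }
}
```

A FileSystemImageSource then just enumerates a directory and decodes each file as it is acquired, so a thousand-image folder never costs more memory than a single image.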

Next is “this is for your own good”. An example of this: the main image class, AtalaImage, was a relatively thin veneer over a big honking block of memory referenced by a public property of type IntPtr. This needed to go. I was fairly certain that our customers rarely, if ever, used it. So I put an Obsolete attribute on it with a warning to the effect that it was going to go away. In its place, I put an abstraction called PixelMemory, which had a factory method for getting a PixelAccessor object to get at the actual data in the image. I rewrote all of the codecs and image manipulators in terms of these two classes. This took a great deal of time and spanned multiple releases, but when I was done and changed the Obsolete warning to a compile failure when you touched the old IntPtr property, our customers had been aware of the upcoming change for close to two years. When I made the final change, the total number of support cases we got was 0.
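
The migration path looks roughly like this sketch. PixelMemory and PixelAccessor are the class names from the post; the members shown are illustrative.

```csharp
// Sketch: first a warning, later an error, with the replacement abstraction alongside.
using System;

public class AtalaImageSketch
{
    // Early releases: the old property still works but nags at compile time.
    // Final release: switching the second argument to true turns every use into a compile failure.
    [Obsolete("Use PixelMemory/PixelAccessor instead; this property will be removed.", false)]
    public IntPtr ImageData { get; private set; }

    public PixelMemory PixelMemory { get; private set; }
}

public abstract class PixelMemory
{
    // Factory method: the accessor scopes access to the raw pixels and can lock,
    // pin, or marshal underneath without the caller caring.
    public abstract PixelAccessor AcquirePixelAccessor();
}

public abstract class PixelAccessor : IDisposable
{
    public abstract IntPtr GetRowPointer(int row);   // illustrative member
    public abstract void Dispose();
}
```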

The next approach I used to improve the code base was Ninja Interfacing. One example was the image manipulation classes. There were something like 143 commands, each of which was very similar to the others but shared very little actual code, so it was difficult to work with them generically. So I took stock of every step in an image transformation and rewrote them as a series of consistent, coherent steps. I made sure that common tasks operated with reasonable defaults and that anything that was screwy for a particular command could be overridden without affecting the other commands. When I was done, not only were the commands much more consistent, a lot of common work was centralized in one place instead of 20 places (sometimes the same bug was distributed 20 or more times, I discovered). In order to make sure that I didn’t introduce problems, I wrote the antithesis of a unit test. This was a test that queried every single image processing command available, monkeyed with it away from the default settings, ran it on every pixel format available, hashed the image data, and compared that against a baseline that I had made before the change. When the dust settled, we had passing tests showing output equivalent to the previous version, but the code was now architected much more cleanly. One thing that came from this was that a large number of these image processing commands could be extended to run across multiple threads for a performance boost, and that could be turned on transparently. All of this was done Ninja style – our customers were never aware that the change happened, but they could reap the benefits.
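
The sweep itself can be sketched like this, with assumed delegate shapes standing in for whatever the real harness used: every command, already pushed away from its defaults, runs over an image in every pixel format, and the hash of the output is compared against the pre-refactor baseline.

```csharp
// Sketch of the "antithesis of a unit test" regression sweep.
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public static class RegressionSweep
{
    public static List<string> Run(
        IReadOnlyList<Func<AtalaImage, AtalaImage>> commands,  // one entry per image processing command
        IReadOnlyList<AtalaImage> testImages,                  // one image per supported pixel format
        IReadOnlyDictionary<string, string> baseline,          // "command:format" -> hash captured before the change
        Func<AtalaImage, byte[]> getPixelBytes)                // assumed helper to pull out raw pixel data
    {
        var failures = new List<string>();
        using (var md5 = MD5.Create())
        {
            for (int c = 0; c < commands.Count; c++)
            {
                for (int i = 0; i < testImages.Count; i++)
                {
                    AtalaImage result = commands[c](testImages[i]);
                    string hash = Convert.ToBase64String(md5.ComputeHash(getPixelBytes(result)));
                    string key = $"{c}:{i}";
                    if (!baseline.TryGetValue(key, out string expected) || expected != hash)
                        failures.Add(key);   // either no baseline exists or the pixels changed
                }
            }
        }
        return failures;
    }
}
```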

All of these changes made it easier for us to integrate 3rd party tools or codecs as well as making it easier to port the code base to a completely managed environment or to Silverlight or to Java.

How is this a chess game? The amount of work described here was several person-years. At the time, Atalasoft was an engineer-heavy company, but it was tiny. So the chess game was to figure out which of these behind-the-scenes features should go in, and when, without damaging the experience of our existing customers while still allowing new products and features to be released. It was both daunting and exciting: when I set out to do this, I knew that this was the right approach, but the sheer amount of work was scary, and even though the benefits of the work were apparent to us, it was a long road to get there and the benefits were invisible to our customers.

At the end of the changes, I had restructured most of the code base in a way that made the overall organization more logical and useful, but at the same time I had minimized how this affected our customers. We honestly had more problems when we moved the code base from .NET 1.0 to .NET 2.0, and that is no small feat.

I’m Old, Part LVI: Dithering

On one of the original disks for the Apple II, there was a digitized image of Hopalong Cassidy. In one of the manuals, there was a description of the process that was used to create the image as well as to convert it from grayscale to a 1-bit per pixel image. The process was called Ordered Dithering. I thought it was cool and it was something I stowed away in my head for the future (like a great many things that are in the attic of my brain).

Ordered Dithering is a very simple, very efficient way to do a particular type of half-toning. It works by carefully building a kind of variable mesh into a 2-dimensional array and comparing gray values with what’s in the array, using the pixel coordinates as the indices. If the gray value is greater (or less) than the entry, the pixel gets set to black, and white otherwise. The filtering code ends up being 1 line of C.
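
Here is a minimal version, written in C# to match the other sketches in this collection, using the classic 4×4 Bayer matrix; the comparison in the inner loop is the “one line” in question.

```csharp
// Minimal ordered (Bayer) dithering over an 8-bit grayscale buffer in row-major order.
static byte[] OrderedDither(byte[] gray, int width, int height)
{
    // Classic 4x4 Bayer matrix; entry n becomes a threshold of roughly n/16 of full scale.
    int[,] bayer =
    {
        {  0,  8,  2, 10 },
        { 12,  4, 14,  6 },
        {  3, 11,  1,  9 },
        { 15,  7, 13,  5 },
    };

    var output = new byte[width * height];   // 0 = black, 255 = white (1 bit of information per pixel)
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
        {
            int threshold = (bayer[y & 3, x & 3] * 255 + 128) / 16;
            // This is the heart of it, the "one line":
            output[y * width + x] = (byte)(gray[y * width + x] > threshold ? 255 : 0);
        }
    return output;
}
```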

In 1985, I had the opportunity to do some coding for the Macintosh and I wrote a program that simulated energy being deposited on a rectangular wall from a point source, using an inverse-square falloff for the amount of energy hitting a point on the rectangle. I thought that I was inventing ray tracing (I was, but the math was wrong, something corrected a year later in college by Amnon Silverstein one evening in a dining hall after dinner). To display it on the black and white Macintosh, I used Ordered Dithering.

In researching more about dithering, I stumbled upon references to a book, Digital Halftoning, by Bob Ulichney of DEC. In the world of the ’80s, on my budget, I couldn’t get a copy of the book, but I had seen bits and pieces about his version of error diffusion dithering (Floyd-Steinberg being the most common variety).

In 1989, Andrew Glassner posted to a netnews group that he was looking for people to write chapters for a collaborative book called Graphics Gems. I wrote up a small chapter explaining Ordered Dithering and included some code by Jim Blandy that parametrically generated ordered dithering matrices.

At Adobe, I worked on PostScript printers for DEC. They wanted to ship a low cost PostScript printer with a metric butt-load of features that were usually found only in higher end printers. One of them was DECImage (pronounced Dess-ih-maj, not to be confused with DECImage, a different DEC product pronounced DECK-Image), which was an implementation of Bob Ulichney’s halftoning algorithm. It ported easily into the PostScript engine. Unfortunately, the algorithm was very floating point dependent and this printer was running on a 68K with no native floating point. It ran like molasses in January. I rewrote the code to use fixed point. It ran way faster, and while the output was not identical, it was indistinguishable from arm’s length. We sent the changes off to Bob along with the performance difference. Sadly, he rejected the changes because the output wasn’t identical to the pixel. I was told that this was necessary in order to meet the requirements of the patent.
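
This is not Ulichney’s algorithm (which I won’t reproduce), just an illustration of the kind of conversion involved: the fractional weights in error-diffusion-style arithmetic become 16.16 fixed-point values, and each per-pixel multiply becomes an integer multiply and a shift.

```csharp
// 16.16 fixed point: 16 bits of integer, 16 bits of fraction, all in a 32-bit int.
using System;

public static class Fixed16
{
    public const int One = 1 << 16;

    public static int FromDouble(double v) => (int)Math.Round(v * One);
    public static int Mul(int a, int b) => (int)(((long)a * b) >> 16);
    public static double ToDouble(int f) => f / (double)One;
}

// Floating point version of one diffusion step:
//     neighbor += error * (7.0 / 16.0);
// Fixed point version, with the weight converted once up front:
//     int w = Fixed16.FromDouble(7.0 / 16.0);
//     neighbor += Fixed16.Mul(error, w);
// The results differ by a bit or two per pixel, which is why the output was
// indistinguishable at arm's length but not bit-for-bit identical.
```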

The printer’s preliminary start page included a small gray scale image about the size of a postage stamp. Using the built-in halftoning, the output was OK, but blurry. With DECImage turned on, the output was far sharper and recognizable.

It was disappointing that my optimizations didn’t go in, but it was still great to have a chance to work with code created by the man who wrote the actual book on halftoning.

I’m Old, Part LV: The Holy Trinity of Late Software

I posted the following to Twitter not too long ago:

The holy trinity of managing late projects:

  1. Lose features
  2. Lose quality
  3. Ship later

Pick one. If you don’t pick, prepare to get all 3.

I’m going to expound on this and give a couple of examples.

The trinity comes from experience working on projects with “aggressive” schedules. In many cases, the management solution was “work longer hours” or “add more engineering” or both. Working longer hours leads to lower quality because engineers make more mistakes when they’re tired. Working longer hours leads to burn-out, which means less productivity, which means shipping later (or dropping an incomplete/broken feature). Adding engineering leads to more time spent bringing the new people up to speed, which can lead to shipping later. New engineers don’t always see the big picture, which leads to reduced quality. In addition, longer hours and burn-out lead to disgruntled engineers, and you will lose disgruntled engineers.

One main solution is to avoid the problem entirely by estimating project scope better up front. I’m pretty good at this, and at Atalasoft I did this for several of our releases: we’d have a list of things that we wanted to do and I would fill a white board with the time each would take, generously rounded up to the next week: if something would take 2 days, it got estimated as a week. I was usually pretty accurate. If a feature had a ton of unknowns, I put in SWAG padding for the time. If a feature had one or more dependencies on external libraries, I put in more SWAG. At one point, Bill Bither and Lou Franco realized that I had become a crutch for the company – nobody else was doing estimates since I was mostly right. We changed that. We had the assigned engineer do the estimates and I would help them correct them, with explanations for any adjustments.

At one point, we were tight on one release and Bill wanted to add in a feature. Both Lou and I were against it, so we presented Bill with the trinity and let him pick what to do from the 3. Bill was adamant about putting in the feature, but he also wasn’t willing to accept any of the 3. Eventually he relented. We shipped on time and with good quality.

Later, when the company was purchased by Kofax, we had new management. I was working on a feature to add digital signatures to our PDF tools. This is no small feat (and kids, if you can avoid it, never write date/time calculation code or encryption code), and there were a number of challenges, including the fact that digital signatures are not well specified in the PDF reference and the tools available to examine a produced signature only tell you that it’s valid or invalid. It’s a horrible black box. Since I was the only one on this, I had to let them know that I was stuck and not making progress, nor did I know when I would be unstuck. This would affect the schedule. I had a conference call where I got raked over the coals (as if that would motivate me). I suggested the trinity, and I had a feature on my plate that I recommended dropping. I was told in no uncertain terms that was a no go. I ended up in a shouting match with someone a couple of levels higher than me. Ultimately, the product shipped, but it was of lesser quality than I wanted and I still wasn’t confident that I hadn’t made mistakes (for example, a security vulnerability such as being able to read private keys out of my address space). The feature I suggested dropping was a port of the code from C# to Java. Ken Walpurgis, who was doing engineering management (not product management), stepped up to do the lion’s share of the port, and when I had finished the C# work, I finished the port and tested/debugged it with him.

Management did the wrong thing here and shortly thereafter, Ken left. Immediately after the conference call with the shouting match, I updated my resume and engaged a head hunter and in time I left too. As a result, they lost product growth and innovation. I just looked at the current product features for that product and they haven’t changed in any way that I can see in the past several years. Too bad.

Of course, I’m simplifying the circumstances but the essence is still true: I presented what was going on, the consequences and options. I was shut down against my better judgement and without sufficient evidence that they knew better. Any more detail would be against my ethical standards for this blog – I’m happy to praise or kindly needle the people that I’ve worked with in the past, but for the things that might be damaging to someone’s reputation, I will mention neither names nor sufficient detail to identify them directly.

So in sum, estimate your projects accurately, staff appropriately, and if you come upon schedule issues let the trinity guide your decisions. Trust and support your engineers and you will produce products that your customers will trust.

I’m Old, Part LIV: Finding Your Thing

I’ve always found a deep joy in software engineering. In looking at the roots of it, I find myself thinking about playing the original Adventure game. My dad had brought home a TI Silent 700 terminal and we were able to log in to a Bell Labs account that gave us access to games. Adventure was so engaging – getting hooked by trying to figure out how to get past the traps and puzzles: a hollow voice says, “plugh”. The essence of coding was like spell casting. We put pieces together and now we had a new magic word: plugh or xyzzy, which unwrapped new possibilities.

The problem is that software work fulfills a particular set of intellectual needs, but it isn’t comprehensive. People need more breadth of experience to fill the empty spaces. Working on Acrobat was ruinous in that the hours spent left precious little time for other things.

I tried a number of things. One of my co-workers was tired of endless tech talk, so she started a salon of sorts where a group of us met weekly at the British Bankers Club in Menlo Park and tried, but ultimately failed, to avoid talking about work. I tried looking into the art scene in the area, but it didn’t work for me. On the east coast, I had worked as a volunteer on a local rescue squad, so I looked to see if there was something similar in Silicon Valley – nope: all the ambulance crews were paid professionals. The answer was ultimately in my closet gathering dust: my trumpet.

I had played since I was 10 through college, but had all but given up the instrument for lack of time. I took it out and spent a few weeks doing lip drills and scale exercises to get my lip back. In looking around, I found that there was a local blues bar that did an open mic, so I started going and sitting in. Blues is not my wheelhouse. I was classically trained and especially love Baroque which is about as far from blues as you can get, but I was playing again and it felt like I had a missing limb reattached.

After the open mic, the bar usually had a band come in for the evening. Most of them were guitar heavy, but every now and then there were some groups that had small horn sections.  One night Chris Cain and his band were playing and he had a good horn player – Modesto Briseno – who was playing pretty well. Between phrases, I saw him turn his back and go through some motions that were all too familiar: sticky valves and he wasn’t going for valve oil. Uh oh. I pulled a bottle out of my case and snuck it up to the stage and set it down at his feet. He had a look of utter relief and quickly remedied the problem. Between sets he returned the bottle and thanked me.

I played a couple of weddings and through word of mouth I found a really good brass shop in South San Francisco where I could take my horns for service. I brought in my old piccolo trumpet to get some major work done on it. I had bought the instrument from a pawn shop when I was in high school, and at this point the bell needed straightening and the finish had some pitting, so I had them fix it up, strip the lacquer, and plate it in silver. While they looked over the horn, I saw a couple of Eb/D trumpets that were an unfamiliar design. What I was used to in this type of horn was either a horn that played well in Eb but went out of tune on the D side when you swapped the 1st and 3rd slides, or vice versa, or one where both sides were bad. These horns had Eb and D tuning slides as well as separate bells that could be swapped in. Both sides played in tune! I tried both horns, one by Schilke and one by Yamaha. The Schilke horn was good – clearly a professional instrument, but it felt wrong in my hands – just not comfortable. The Yamaha felt fantastic and in playing it, the horn fell in love with me. Crap. Time to save up. This pattern ended up happening two other times.

I ended up joining a big band in the area and that worked really well for me. We played weekly in an old warehouse and I was really able to build my lip back up. I was able to play through a whole rehearsal and still be ready for more. More to the point, one of the big deficits in my life was getting filled. Since that time, I’ve made sure that making music is a part of my life. It has helped me feel more balanced and relaxed even on the nights when my mind is stuck on other things.

This has been my own particular journey, but if you are in engineering like me, I encourage you to look at things that are outside of your specific vein of geekery and find other activities that balance out the time spent being a code jockey. Taking that time away will help you as a person as well as improve your skill as an engineer by forcing you to take your blinders off for a while.

I’m Old, Part LIII: Losing a Colleague

This is not a light subject and if you are triggered by suicide, read no further.

When I started at Atalasoft, there were 3 other full-time engineers:

Dave – who specialized in web controls

Glen – who specialized in .NET Win controls and TWAIN/ISIS scanning

Seungyeon – who specialized in image analysis algorithms

I came in with a lot of PDF expertise as well as a great deal of architecture and bit-banging/optimization experience.

It was a great time settling in with the group. Each of us had our own specialties as well as weaknesses, but between us we covered our product space very well. I got along with all of them pretty well, although I think I got along best with Dave.

Glen was an interesting bird. He was a transplant from Texas, complete with a very wry sense of humor colored by his southern accent. He was generally very quiet and reserved.

Glen had been responsible for a set of tools for creating annotations on documents for Windows applications. Dave did similar work in Javascript/asp.net. The problem was that Glen would do something and then Dave would have to replicate the work in parallel. This was not an effective way to work.

At one point, our CEO Bill decided that it was time to come out with a new version of the annotation product and I stepped in to try to fix the Glen/Dave problem. What I wanted was a greater amount of shared code and data for the annotations and, at the same time, to raise the bar for their feature set.

I decided that we should have a strict Model-View-Controller design so that the data layer could be shared between the two products. Glen was not happy with this, since it was a very different approach from the existing architecture (and incompatible with it). I also added the ability to have arbitrarily nested layers of annotations, arbitrary appearances, extensible data, a strong rendering model and so on. Glen was left with the task of doing the reference implementation of my design on Windows. He would come to me with questions on the implementation, hoping for a shortcut, and I would have to give him the bad news that it was going to be harder. Glen would say, “ohhhh-kay” in his dry Texas accent and head out to do what I had asked.

The design paid off, though. It was way easier for Dave to use the model code on the host and serialize it as JSON to his web controls. When Glen did a WPF version of the annotations, the model code went unchanged.

We moved into our new office and had been there for several years when Glen started having some medical issues. He was having neck pain and ended up having surgery to correct it. Glen also suffered from migraine headaches. There were days when he had to go home early and lie down. There were other days when he worked from home in the dark or just stayed in bed.

I emailed him on these days and offered to pick up food for him or run errands for him, but he never accepted my offers.

One day, Christina, our office manager, came into my office very agitated. She said curtly, “Steve, you need to come to Bill’s office. Now.” Christina turned and headed to Bill’s office with me in tow. I had no idea what was going on. Had I done something wrong? Had I pissed off a customer? When I got there Bill told me that they had gotten a call from the police and that Glen had died by suicide. I was struck dumb.

I remember taking Christina aside and admonishing her for taking me in blind, a reaction driven by my own emotions, and I regret dumping on her like I did. She was suffering too – no need to make her suffer more.

As an aside, there is a very common reaction to problems in software engineering: “It’s my fault.” This stems from years of making (and correcting) mistakes. More often than not, it is my fault – my fault for not understanding an API, my fault for missing the key element in the documentation, my fault for not managing my side-effects or memory, my fault for using a weak model out of laziness, and so on.

To me, it was my fault that we lost one of our core engineers. I emailed instead of calling. I should’ve stopped by. I should’ve set up something where people checked in on him. I should’ve been more supportive. I should’ve known.

But I didn’t know. And I couldn’t have known. There was not enough feedback from Glen that he was suffering and reaching the end of his rope.

I miss Glen. I miss walking to lunch with him.

It hurt every time I passed by his condo on the way home from picking up my son at day care. I saw his white Pontiac in the lot, wondering when it too was going to be gone.

If you are reading this and you feel like Glen did, know that there are people who would gladly help you out. We can’t read your mind, but we can reach back if you reach out. No one is an island. It’s scary to open up about your pain, but trust that we will listen.