I’m Old, Part LXXVI: Trying Crazy Things

When working at Atalasoft, we did most of our work in managed languages: C#, F# or JavaScript. We often had 3rd party libraries written in C or C++ that we exposed as C#, which was part of our value add: binding C to a decent C# API is not always easy and the most obvious solution isn’t always the best one.

One chunk of technology we built in house was a library with tools to consume and generate PDF files. We were our own customer in that many of our separate PDF tools were written to our own API. For example, we had tools to generate PDF documents from images, render the output from OCR engines, and to generate/manipulate annotations that were all written to our own API.

To be clear though, the API we used was a private API. There was a reason for that: it cleaved very close to the PDF standard and had very few safeguards against your own stupidity in terms of creating spec-violating documents, and we felt that it was not suitable for our customers. We learned from doing that wrong with the TIFF spec. Exposing the dangerous API helps very few but creates a support nightmare.

When we released a friendly API on top of the internal PDF library we tried to make it near impossible to generate bad PDF.

At this point, I tried doing the things that were on the border of unreasonable if not fully within crazy-pants territory, just to see how the code would do.

I grabbed the text of Moby Dick from Project Gutenberg and wrote a C# app to render it. This is nearly 600 pages. I included special casing for chapter headers with chapter numbers and drop caps, page numbering and so on. It rendered in a few seconds with no special considerations about memory or buffering. Most PDF print drivers can’t do it that quickly.

I decided to put annotations through the wringer. I wrote code that took a sample image that we had, resampled it, and rendered blocks of 8×8 pixels as colored rectangle annotations onto a PDF page.
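
To make the block idea concrete, here is a minimal sketch in C of the averaging step: it walks a tightly packed RGB image in square cells and emits one colored rectangle per cell. This is not the Atalasoft code (that was C# against a private API); `RectAnnot` and `blocks_to_rects` are hypothetical names invented for illustration, and turning each rectangle into a real PDF annotation is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a colored rectangle annotation. */
typedef struct {
    int x, y, width, height;  /* position and size in pixels */
    uint8_t r, g, b;          /* average color of the covered pixels */
} RectAnnot;

/*
 * Average each block-by-block cell of a tightly packed 8-bit RGB image
 * and write one rectangle per cell into out. Cells on the right and
 * bottom edges are clipped to the image. Returns the number of
 * rectangles written, or 0 if out_cap is too small to hold them all.
 */
size_t blocks_to_rects(const uint8_t *rgb, int width, int height, int block,
                       RectAnnot *out, size_t out_cap)
{
    assert(rgb != NULL && out != NULL);
    assert(width > 0 && height > 0 && block > 0);

    size_t cols = ((size_t)width + (size_t)block - 1) / (size_t)block;
    size_t rows = ((size_t)height + (size_t)block - 1) / (size_t)block;
    if (cols * rows > out_cap) {
        return 0;
    }

    size_t n = 0;
    for (int by = 0; by < height; by += block) {
        for (int bx = 0; bx < width; bx += block) {
            int bw = (width - bx < block) ? width - bx : block;
            int bh = (height - by < block) ? height - by : block;
            unsigned long sum_r = 0, sum_g = 0, sum_b = 0;

            for (int y = by; y < by + bh; y++) {
                for (int x = bx; x < bx + bw; x++) {
                    const uint8_t *p = rgb + 3 * ((size_t)y * (size_t)width + (size_t)x);
                    sum_r += p[0];
                    sum_g += p[1];
                    sum_b += p[2];
                }
            }

            unsigned long count = (unsigned long)bw * (unsigned long)bh;
            out[n++] = (RectAnnot){
                .x = bx, .y = by, .width = bw, .height = bh,
                .r = (uint8_t)(sum_r / count),
                .g = (uint8_t)(sum_g / count),
                .b = (uint8_t)(sum_b / count),
            };
        }
    }
    return n;
}
```

Calling it with `block` set to 8 gives one rectangle per 8×8 cell, so the annotation count grows with the image area divided by 64.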

The tools put down 1400+ annotations and saved the document in under 1.5 seconds. It took Acrobat more than 5 minutes just to open the file and render it. My code could open it in slightly longer than the time to create it. I should point out that originally, I was rendering the image in annotations at a substantially higher resolution – more than 4K of annotations – and that just plain hung Acrobat.

For fun, I decided to do an unthinkable pet project. I wrote a PDF sketch app in F#. For the design, I adopted a fairly traditional Model-View-Controller and I tried to keep things strict: the model and view model were totally distinct. The model was in F# – a nice discriminated union to describe various shapes. The view model was my PDF toolkit being treated as write only. The view was our PDF rasterizer (which was written by FoxIt and was in C).

So what happened was that when the user drew a shape on the page, the F# code would render the view as PDF, render UI artifacts (shape handles, bezier control points, etc.) as PDF on top of that, write it all to a stream, and send it to FoxIt to render to an image, which I then stamped blindly onto the display window.

All this happened at UI speed. The app was creating, rasterizing, and then throwing away 40 PDF documents per second. I did a company demo of this to help drive home the performance of our tools so that sales staff could internalize the selling point. I was also ready for the next question – why don’t we ship this demo? I started putting lots of shapes on the page and the app got visibly slower. At around 20 shapes it entered unusable territory, which I knew would happen all along: FoxIt couldn’t keep up. It was still quick, but as the file grew in complexity, so did the rendering time, which is why most real drawing apps have a special-case renderer for this kind of work and do things like render the parts that don’t change and cache the result, render the changing parts separately, and then composite the two into the display.
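
The caching approach described above can be sketched in a few lines of C. This is a toy under stated assumptions, not how any particular drawing app or the demo worked: the `Canvas` type, the 8×8 canvas size, and the pixel values 1 (committed shapes) and 9 (drag handle) are all made up for illustration. The point is only that the expensive layer is re-rendered when it is marked dirty, while every frame just copies the cache and draws the moving part on top.

```c
#include <assert.h>
#include <string.h>

enum { CANVAS_W = 8, CANVAS_H = 8, CANVAS_PIXELS = CANVAS_W * CANVAS_H };

/* Hypothetical toy canvas: one cached layer plus the composited frame. */
typedef struct {
    unsigned char static_layer[CANVAS_PIXELS]; /* cached render of committed shapes */
    unsigned char frame[CANVAS_PIXELS];        /* what would be shown on screen */
    int static_dirty;                          /* 1 when the cache must be rebuilt */
    int static_renders;                        /* how many expensive renders happened */
} Canvas;

void canvas_init(Canvas *c)
{
    memset(c, 0, sizeof *c);
    c->static_dirty = 1; /* nothing cached yet */
}

/* Call when a shape is committed, moved, or deleted. */
void canvas_invalidate(Canvas *c)
{
    c->static_dirty = 1;
}

/* Stand-in for the expensive full render of every committed shape. */
static void render_static(Canvas *c)
{
    memset(c->static_layer, 1, CANVAS_PIXELS);
    c->static_renders++;
    c->static_dirty = 0;
}

/* Build one frame: reuse the cached layer, then draw the handle on top. */
void canvas_draw_frame(Canvas *c, int handle_x, int handle_y)
{
    assert(handle_x >= 0 && handle_x < CANVAS_W);
    assert(handle_y >= 0 && handle_y < CANVAS_H);

    if (c->static_dirty) {
        render_static(c);
    }
    memcpy(c->frame, c->static_layer, CANVAS_PIXELS); /* composite step 1: cache */
    c->frame[handle_y * CANVAS_W + handle_x] = 9;     /* composite step 2: overlay */
}
```

Dragging a handle across many frames then costs one memcpy and one overlay draw per frame, and the expensive render runs again only after `canvas_invalidate`.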

What’s the point?

No. It’s not the Jurassic Park lesson “Your scientists were so preoccupied with whether they could, they didn’t stop to think if they should.” The lesson is parallel, though. Go ahead and figure out the boundary of where the “could we” line is. The “should we” line will be obvious afterwards. In other words: know your strengths and know your limits.

None of the crazy samples made it into our documentation or into our usual stable of sample code that we shipped with the toolkit, but it was nice to know that if our customers tried something crazy (and trust me, they did), we were there to back them up.

For the New Fathers Out There

I worked with Rick Minerich at Atalasoft. It was great having Rick on our team. I learned a lot from Rick and I hope he learned a lot from us. Rick was on board around the time my second child was born so he got to witness the many mornings that I referred to as “Dawn of the Dad”.

Rick and I were scheduled to go to a conference in Vegas one year and our flight out of Hartford got canceled due to a thunderstorm that came in. Rick and I were put on a flight at the crack of dawn but we stayed up late getting some truly horrific food. We set the alarm for 4:00 and went to sleep at 10ish (IIRC). 6 hours of sleep? What a windfall for me! I woke up before the alarm, showered and got dressed while Rick was still struggling with the coffee maker. He seemed totally shocked that not only was I awake and moving, I was entirely awake and functional without caffeine.

Many parents will talk about sleep deprivation and its effects. I will simply say that the ability of the human body to adjust to what is needed is a remarkable thing.

But parenthood is one of the most challenging journeys in front of you. If you are attending a birth, be in the moment. Take stock of what’s happening. Etch it in your mind. You only become a parent once – afterwards, you are a parent for the rest of your life.

Human development is fascinating. Enjoy the stages of your child’s development. Some are frustrating, but as new things come on line, every month will be better than the last. Look for things like raising head, discovering hands, rolling over, means-to-an-end, stacking, dump-and-fill, and so on. Don’t worry about when they happen – they’ll happen in their own time.

Much of marketing towards parents is based on fear. You don’t have to buy into it if you don’t want to. It’s not worth the time. Read up. Ask your pediatrician. Educate yourself. There are a few things that I think are worth doing: take a CPR course. I have performed the Heimlich maneuver on my daughter three times in her life. Also consider sign language for the first few years. Speech is not language. Speech is an expression of language. You can bootstrap communication earlier with sign language than speech. Some people claim that having sign language virtually eliminates the terrible twos. Not in my experience, but still having a picture of what’s going on in your child’s mind is a wonderful thing. We used Signing Time to do that.

The addition of a child in your lives creates a new set of relationship permutations. Make sure you find time to nurture the relationship with your spouse. A good babysitter is worth his/her weight in gold for that very reason. When my kids were young, I took them out on “dates”. It might have been just a simple meal or a trip to the park, but it was one-on-one time.

Write down the shit your kid says in preschool. It is unbelievable and you probably won’t remember it. Here are some examples from my son:

“I want a rat pet, but I need to wait until I grow up and marry someone new.”

“clowns can juggle coats and hats and shoes. And people’s skeletons.”

“Dada, when I eat this [oatmeal], I’m going to be strong like Bumblebee”

“You’re sad because your mommy is dead….daddy? I’m not going to kill you.”

“When I was a little honey bee, I make honey in a honey home and all my bee friends come to visit and they eat chocolate and peanut butter and we have a picnic in my honey home and that’s the end of my story.”

Take all the pictures you can. Make an album annually from the best – paper keeps better than bits and requires no power.

You’re going to be a great dad, Rick.

I’m Old, Part LXXV: Hiring Interns

One of the things that I really liked about Atalasoft was that we could come up with creative solutions to staffing problems. One of our solutions was to tap colleges for interns. We had interns working for us from WPI, Smith, Oberlin, and UMass Amherst. Sometimes we put them on projects that we didn’t have the staffing to do. Sometimes we had them working on our build systems. Sometimes we had them working on support. All of these things were speculative investments. We were also training our interns in our products and we ended up hiring several of our interns after they had finished their degree. Each of these people had shown that they understood our code and systems and showed that they were good engineers.

After Atalasoft was acquired by Kofax, this program went away, which is a pity because it was such an easy gamble.

Still, I was very happy with every intern that we hired and we tried to treat them well.

Except that one time.

See, I was having a discussion with some of my co-workers about fair food and personally, I have two different forms of fair food kryptonite: smoked turkey legs and deep fried Oreos. In this discussion, Kevin Hulse and I got into a disagreement about whether Hydrox or Oreos were better in deep fried Oreos. Unable to reach agreement, we decided to settle it scientifically and set up something I referred to as Fry-day.

I brought in a fryer, pancake mix, eggs, sugar, a package of Oreos and a package of…well…problem: I couldn’t find Hydrox at my usual store, so I bought a different generic chocolate sandwich cookie. Close enough. We set up a double blind test and fried cookies, letting everyone vote on which they liked best. Kevin tallied the results and the winner was Oreo.

So where’s the intern in this?

One of our sales staff, John Casanova, was talking about how wonderful deep fried pickles are and why didn’t we have those as long as we had a fryer going? So we sent out Julia Burch to the local grocery store to pick up some pickles. I felt really bad about doing this, but Julia was a trooper and made the trip out.

Julia ended up working for us after she got her degree and I was very happy to have her on board.

Really, the point here is that paid internships (and ours were always paid internships) are a valuable investment. If you cultivate your interns, the investment will pay off for years to come.

I’m Old, Part LXXIV: Explain The Death March

Before I dig in to this, I want to talk about the C programming language. C is a truly impressive language in terms of how it is both a high level language and cleaves very closely to CPU architectures. If you’re writing system code, this is a very good thing because you can predict fairly accurately what assembly language will get generated from any given block of C. It is not without its issues.

As the saying goes, “C combines the power and performance of assembly language with the flexibility and ease-of-use of assembly language,” which means that you need to be extra careful about a lot of things, especially memory and pointers.

It is, unfortunately, easy to create all kinds of nasty bugs involving memory misuse and pointer misuse because, simple as the concepts may be, we’re human and we make mistakes. The most common mistakes are reading or writing a pointer that has not been initialized or reading or writing beyond the bounds of a particular block of memory. If you’re lucky, such an error will cause an immediate and reproducible failure. Unfortunately, these types of bugs can also create issues that can go undetected for a very long time.

This type of bug is sometimes called shotgunning the heap (or stack) or a heap smasher. If you can’t catch it at the point of inception, tracking it down is very challenging. I have tracked down many of these bugs in my career and none of them have been pleasant experiences. There are a number of tools that you can use, but most of them are shots in the dark.

Brian Fitzpatrick posted this excerpt about handling corrections for engineers with issues called “Shit Sandwich”. It reminds me of an old UNIX fortune which was something like “Life is like a shit sandwich, the more bread you’ve got, the less shit you have to eat.” But that’s not what he talks about. You should read his essay; it’s good.

I will add to it the importance of communicating when you know that you are sending an engineer on a death march. In the past, I’ve been sent on these, where there is a heap smasher that needs to be tracked down and precious little to go on. It’s a process that can take weeks. And it’s weeks of trying things – trying to find the magic incantation to make the bug happen consistently, to try to narrow it down to the point of inception, to try to find the actual cause. It can be weeks of doing similar things repeatedly and making little progress.

The worst part of being sent on one of these, besides the frustration and drudgery, is when it’s not even been acknowledged. I’ve had several of those. Resented every single one. When I worked on Acrobat Catalog for the Mac, we had a heap smasher that only happened on PowerPC Macs when indexing files on network shares. I was sent on this one with no acknowledgement of the impending pain. Ultimately, it turned out to be not my code, but Apple’s implementation of TCP/IP. It took weeks to narrow it down.

When I was at Atalasoft, we had some awful bugs in that category. A few of them, I tracked down. Others, for various reasons, had to be handed to junior engineers. For example, we had a long-standing bug that we suspected was in our JPEG2000 decoder, which we licensed from an outside company. The product generally worked fine, except that once in a while our automated unit tests would show failures. Most of the time, the test would pass the next time around. Unfortunate. Sometimes they would fail consistently, at which point we would put someone on it to try to replicate the conditions so that we could contact the manufacturer with a test case.

When I had to assign an engineer to this kind of job, I tried to do a few things:

  1. I tried to explain that I knew what kind of task this was and its scope and that I understood all too well what they were in for.
  2. I tried to offer as much support as I could to help out as a reference: how to use gflags, for example, or how to create memory pools to detect illegal writes, or how to work with WinDBG.
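
As one illustration of the "detect illegal writes" idea, here is a minimal sketch in C of guard bytes (sometimes called canaries) wrapped around each allocation. It is a simpler cousin of a checking memory pool, not the technique we actually used, and `guarded_alloc`, `guarded_check`, and `guarded_free` are hypothetical names. The assumption is that most heap smashers run just past the start or end of a block, so a known byte pattern on both sides will be disturbed and can be checked later.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Block layout: [header: 16 bytes][front guard: 16][user bytes][back guard: 16]
 * The 32-byte prefix keeps the user pointer 16-byte aligned on typical platforms.
 */
enum { HEADER_SIZE = 16, GUARD_SIZE = 16, GUARD_BYTE = 0xA5 };

static_assert(sizeof(size_t) <= HEADER_SIZE, "header must hold the block size");

/* Allocate size bytes with a guard region on each side. NULL on failure. */
void *guarded_alloc(size_t size)
{
    if (size > SIZE_MAX - (HEADER_SIZE + 2 * GUARD_SIZE)) {
        return NULL; /* total size would overflow */
    }
    unsigned char *raw = malloc(HEADER_SIZE + GUARD_SIZE + size + GUARD_SIZE);
    if (raw == NULL) {
        return NULL;
    }
    memcpy(raw, &size, sizeof size);                     /* remember user size */
    memset(raw + HEADER_SIZE, GUARD_BYTE, GUARD_SIZE);   /* front guard */
    memset(raw + HEADER_SIZE + GUARD_SIZE + size,
           GUARD_BYTE, GUARD_SIZE);                      /* back guard */
    return raw + HEADER_SIZE + GUARD_SIZE;
}

/* Returns 1 if both guards are intact, 0 if either has been overwritten. */
int guarded_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *raw = user - HEADER_SIZE - GUARD_SIZE;
    size_t size;
    memcpy(&size, raw, sizeof size);

    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (raw[HEADER_SIZE + i] != GUARD_BYTE) {
            return 0; /* something wrote before the block */
        }
        if (user[size + i] != GUARD_BYTE) {
            return 0; /* something wrote past the block */
        }
    }
    return 1;
}

/* Release a block obtained from guarded_alloc. NULL is ignored. */
void guarded_free(void *ptr)
{
    if (ptr != NULL) {
        free((unsigned char *)ptr - HEADER_SIZE - GUARD_SIZE);
    }
}
```

Sprinkling `guarded_check` calls between suspect operations narrows down where the damage happens, which is most of the battle. It will not catch wild writes that land far from the block, and it does nothing for reads; tools like gflags page heap cover more ground.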

The point being, when you know you are sending an engineer off on a particularly challenging and/or unpleasant task, I think it’s important to communicate that you understand how awful it is and do your best to help out.