I was very lucky on a number of fronts in my career. In many things, it was a “right place, right time” confluence that worked out in my favor. For example, I knew 6502 assembly language as my second programming language and as a result had a fairly deep understanding of how software works at the lowest levels. I got involved in a program at Bell Labs in high school where I learned C one night a week. By the time I started college, I was already familiar with many of the concepts before I encountered them. And for no good reason, I did a private study in college on the language FORTH and did a reference implementation for the Macintosh.
Similarly, while I was in college, I worked at Bell Communications Research, doing work on Macintosh machines and Sun workstations before I decided to see what other opportunities lay before me. I managed to wrangle interviews at Apple and Adobe and thanks to some pull I had with an employee at Adobe (hey Curtis!), I landed a job there where, to my great relief, I discovered that development was being done on Sun workstations. Of course, I had no idea what PostScript was or how to do embedded software development, but I grabbed copies of all the PostScript manuals on my first day and read up over the weekend. Oh hey – FORTH and PostScript share a lot in common. I can deal with this.
I took over on a project that was owned by DEC to work on their DECLaser 2200 series of printers. I was taking over the project from a woman who was going on maternity leave (hi Kathie!). On the day I started, she was in the middle of an early release and had no time to do much information transfer.
I took over, bouncing between my cube and her cube, trying to learn the ins and outs of not only PostScript, but also the sets of printers (there were two models that were going to be built from one code base), and their controllers. The underlying hardware was running on a National Semiconductor NS32CG16 processor which was sold to DEC as being a much better choice than the competing Motorola 68K series. National was lying through their teeth on a number of fronts, not the least of which was that the processor, which had an optional floating point unit, was designed in such a way that when the FP unit was installed, half of the machine registers (IIRC) magically went away, being dedicated to the FP unit. This meant that if you had to compile code to run with and without the FP unit, you had to assume that you had half as many registers as you might have. It made the code nearly impossible for a compiler to optimize, so the end result ran OK. Just OK.
In the middle of the project, I was given an awkward assignment to work with engineers from National to try and address the problems DEC was seeing. It involved a lot of benchmarks. We were stuck using GCC as the main compiler, and the National engineers kept telling us that we’d get better performance with their compiler, but its output file format was incompatible with the rest of our tool chain for making code that would embed on this controller. I was required to die on that hill.
Still, in the process, I learned a lot about development and the benefit of having a solid QA department who was on your side. To keep organized, I kept a binder where I put copies of the bug reports. I had them in sections for done, in progress, and needs to be done.
One of the problems with printer bugs is that sometimes it takes a long time just to reliably reproduce a bug. For example, there was a bug that was reported that would occur once in a while with only a certain set of serial settings when the printer was printing in duplex (both sides of the paper). I think I ended up using close to an entire case of paper finding and fixing that bug. It turned out to be a bug in the low-level code that handled serial input through a ring buffer. The stars had to align just right in terms of rate of input and interrupts for that one to happen. I found the bug and made sure that it made it into the main code base so that future printers wouldn’t have the problem. This was the “right thing”, but it turned out that it didn’t really matter because most of development at that point was focused on the next version of PostScript, which had none of the old I/O code.
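For anyone who hasn't chased one of these, here is a minimal sketch in C of the kind of interrupt-fed ring buffer involved. It is not the Adobe code, and the actual bug isn't described above, so this only shows a common way to keep the interrupt side and the main-loop side from stepping on each other: each index has exactly one writer. The names (`ring_buffer`, `rb_put`, `rb_get`) and the 16-slot size are assumptions made up for illustration, and the sketch assumes a single CPU with one producer and one consumer.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 16u  /* hypothetical capacity; must be a power of two */

typedef struct {
    volatile uint8_t head;   /* next slot to write; only the producer changes it */
    volatile uint8_t tail;   /* next slot to read; only the consumer changes it */
    uint8_t data[RB_SIZE];
} ring_buffer;

/* Producer side: what a serial receive interrupt handler would call. */
bool rb_put(ring_buffer *rb, uint8_t byte)
{
    uint8_t next = (uint8_t)((rb->head + 1u) & (RB_SIZE - 1u));
    if (next == rb->tail)
        return false;             /* full: refuse rather than overwrite unread data */
    rb->data[rb->head] = byte;    /* store the byte first... */
    rb->head = next;              /* ...then publish it by moving head */
    return true;
}

/* Consumer side: what the main loop would call to drain input. */
bool rb_get(ring_buffer *rb, uint8_t *out)
{
    if (rb->tail == rb->head)
        return false;             /* empty */
    *out = rb->data[rb->tail];
    rb->tail = (uint8_t)((rb->tail + 1u) & (RB_SIZE - 1u));
    return true;
}
```

One slot is always left unused so that "full" and "empty" can be told apart without a shared counter, which means this sketch holds at most 15 bytes. A shared count that both sides increment and decrement is a classic source of bugs that only show up when an interrupt lands at exactly the wrong moment.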
Working on that printer was really entertaining because it was my first exposure to an ICE (In Circuit Emulator), which was a chunk of hardware that plugged into the CPU socket and acted like the main CPU, but let me set instruction level breakpoints, look at registers and memory, and generally figure out what was going on.
In the process of working on this printer, I learned a lot about corporate/political processes. There was a bug that opened up on the printer wherein the PostScript code would draw a solid 50% gray box on the page then render an image of a teeny-tiny helicopter scaled up to fit the box. When rendered on the page, the helicopter was supposed to cover the entire gray box, but on my printer (and in fact on all PostScript printers), there was a gray hairline on two edges – either the top or bottom and the left or right. Math said it shouldn’t happen, but it did.
I spent a week on this bug. After a lot of hacking and fiddling, I was able to determine that it was due to floating point error and that the location of the hairlines was predictable based on the floating point implementation and the margin settings of the print engine. DEC accepted my analysis and closed the bug. My boss was shocked – DEC opened this bug on every printer they worked on with Adobe. It was a tradition. Oh, we’re working with Adobe on a printer? Let’s just open this bug up on it. Suck it DEC, I have the answer!
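To make the floating point part concrete, here is a small sketch in C of how two edges that are equal on paper can land on different device pixels. It is an illustration only: the numbers (10 pixels per unit, a 0.1 offset standing in for a margin, a 0.7 width) are invented, the function names are hypothetical, and it uses IEEE 754 doubles rather than whatever the printer's floating point actually did.

```c
/* Map a user-space coordinate to a device pixel index.
   The cast truncates, which matches floor() for the non-negative
   coordinates assumed in this sketch. */
long to_device_pixel(double user_coord, double pixels_per_unit)
{
    return (long)(user_coord * pixels_per_unit);
}

/* The box's right edge is given directly as a single coordinate. */
long box_right_edge(double right, double pixels_per_unit)
{
    return to_device_pixel(right, pixels_per_unit);
}

/* The image's right edge is built up from an origin (think: margin)
   plus a width, so it picks up one extra rounding step. */
long image_right_edge(double origin, double width, double pixels_per_unit)
{
    return to_device_pixel(origin + width, pixels_per_unit);
}
```

With IEEE 754 doubles, `box_right_edge(0.8, 10.0)` comes out as pixel 8, while `image_right_edge(0.1, 0.7, 10.0)` comes out as pixel 7, because 0.1 + 0.7 rounds to a value just under 0.8. The image stops one pixel short of the box and the gray shows through as a hairline. Change the offset and the error moves, which is the sense in which the location was predictable from the margins.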
At one point, as this project progressed, I had gotten a little behind and DEC was getting unhappy. They had a number of high priority bugs that they wanted fixed and were worried about the schedule. In total, there were around 40 bugs open on the printer. I managed to sort through their high priority bugs fairly quickly and then pounded through the rest. We had a conference call with them where they were assuming that I hadn’t fixed the high priority bugs and were ready to do some intense horse trading on them. When they asked about the status, I said, “I fixed all the bugs.” There was a long pause. “All the high priority bugs?” “No,” I replied, “all the bugs.” There was another long pause. “Is that OK? I mean, I’m pretty sure I can put some of them back if you’d like…” Peals of laughter. They faxed me a thank you card later in the day.
Unfortunately, there was one solid fuck-up in the printer and it was all on me. The printer had a tiny amount of non-volatile RAM that was used to store settings like the number of pages printed, the serial settings and so on. Through many changes and revisions, I had managed to subtly break the NVRAM code such that it worked great as long as the NVRAM had been written to once. If it had never been written to, the printer ended up going to a dreaded procedure named “CantHappen”, which would make the printer reboot. So when a brand new printer had the PostScript cartridge installed, it would boot, then try to read NVRAM, fail and reboot. Lather, rinse, repeat. DEC was in a panic about this because the printer was in production and the ROMs that went into it were already “masked”, which means that instead of burning each ROM individually, they were making the chips directly with my bug in them and they had already paid for thousands of them. I came up with a solution which was a chunk of code that would initialize the NVRAM and print a happy message on the front panel. Adobe footed the bill to have this code put onto a one-time-programmable cartridge and shipped with all the version 1.0 printers. You put in the OTP cartridge, it formatted the NVRAM and made it so that the version 1.0 printer would work. I wasn’t proud that that bug had made it into final release, but DEC and Adobe were amicable about the solution.
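In hindsight, the defensive version of that NVRAM read looks something like the sketch below, written in C. This is not the shipped code. The layout, the magic value, the checksum, the 9600 default and every name here (`nvram_settings`, `nvram_load` and friends) are assumptions for illustration. The idea is simply that never-written NVRAM gets treated as a normal case that triggers formatting, rather than as something that can't happen.

```c
#include <stdbool.h>
#include <stdint.h>

#define NVRAM_MAGIC 0x4E56u  /* hypothetical "has been written" marker */

typedef struct {
    uint16_t magic;       /* NVRAM_MAGIC once the block has been formatted */
    uint32_t page_count;  /* pages printed so far */
    uint32_t baud_rate;   /* stands in for the serial settings */
    uint16_t checksum;    /* XOR of the 16-bit halves of the fields above */
} nvram_settings;

static uint16_t nvram_checksum(const nvram_settings *s)
{
    return (uint16_t)(s->magic
                      ^ (s->page_count & 0xFFFFu) ^ (s->page_count >> 16)
                      ^ (s->baud_rate & 0xFFFFu) ^ (s->baud_rate >> 16));
}

bool nvram_is_valid(const nvram_settings *s)
{
    return s->magic == NVRAM_MAGIC && s->checksum == nvram_checksum(s);
}

/* Write factory defaults; roughly what the fix-up cartridge had to do. */
void nvram_format(nvram_settings *s)
{
    s->magic = NVRAM_MAGIC;
    s->page_count = 0;
    s->baud_rate = 9600;  /* hypothetical default */
    s->checksum = nvram_checksum(s);
}

/* Returns true if the contents were unusable and had to be formatted. */
bool nvram_load(nvram_settings *s)
{
    if (nvram_is_valid(s))
        return false;
    nvram_format(s);      /* blank or corrupt: recover instead of rebooting */
    return true;
}
```

On a brand new part, whether it powers up as all zeros or all ones, `nvram_load` formats the block and returns true on the first boot, then returns false on every boot after that.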