Today, I fixed an issue in my code base. This, in and of itself, is not unusual. I do that most every day. What was different was that I analyzed the process and took mental notes on what I did and how to explain it.
The issue was that my code base has a number of sample projects and while they build, several of them crash hard. This is suboptimal because chances are most clients of my code are going to ignore the copious documentation and go right for the samples. It’s best if they work.
How do you solve this problem?
Step 1: analyze the output and the errors and see what you can find
I did this and it was puzzling. I was getting a runtime error on a symbol reference while attempting to load a dynamic library and it was a symbol that I never used. I put together a mock program that allegedly did the same steps as my host program to load that library with the same arguments. It succeeded. I plunged into despair.
Step 2: Walk Away – No, Just Walk Away
Seriously – this is a good thing. I rarely let go of a problem completely. Instead, I shift it to the back burner of my brain, and very often the solution just occurs to me later.
Step 3: Find a Place With Better Light
Here’s a joke: a person is out under a street light at night looking for something. An observer walks up and says, “hey, did you lose something?” The person says, “Yeah! I lost my car keys in that alley over there.” pointing at an unlit alley across the street. “Why are you looking over here?” “Well, the light’s better over here.”
After my time away from the machine, here was the conclusion I came to: I have close to 1600 unit tests that pass. Most of them do more complicated things than this sample, but the process is identical. If I can hoist this sample into a failing unit test, then I can debug it a lot more easily. In other words, move the problem into a place with more light. I did this. I pulled the entire sample into a unit test. It turned up an issue in my code, but it was not the crash I was seeing in the sample. I fixed that issue and had a passing unit test.
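Here is a minimal sketch of what hoisting a sample into a unit test can look like. It is written in Python purely for illustration; my own code base is not shown here, and `run_sample` is a hypothetical stand-in for whatever the sample's entry point does.

```python
import unittest


def run_sample(args):
    """Hypothetical stand-in for the sample's entry point.

    In a real project this would be the sample's own main routine,
    called from the test rather than redefined here.
    """
    if not args:
        raise ValueError("sample needs at least one argument")
    return 0  # conventional "success" status


class SampleSmokeTest(unittest.TestCase):
    def test_sample_runs_to_completion(self):
        # The same steps the standalone sample performs, now running
        # under the test runner where assertions and a debugger are handy.
        self.assertEqual(run_sample(["input.txt"]), 0)

    def test_sample_rejects_missing_arguments(self):
        with self.assertRaises(ValueError):
            run_sample([])
```

Saved in a file, this can be run with `python -m unittest`. The point is not the test itself but that the sample's steps now live somewhere that is easy to poke at.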
Step 4: Enter Elmo’s World
Sesame Street is a wonderful program for teaching the young. There are so many simple games they play to help build basic skills. One of them is called “One of These Things Is Not Like the Others”. This is a game where they put up multiple scenes and all are identical except one and you need to pick out the one that isn’t the same as the rest. This was exactly the game that I played with my code. I had a unit test that passed and a sample that failed. All I had to do was make the sample’s build and environment match the unit test’s. So I tracked compiler flags and environment variables from the unit test and shoved them blindly into the makefile for my sample. Success! The sample ran without a hitch – and in the process, I saw exactly what was missing. Was I done? Not quite.
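The "one of these things is not like the others" game can be played mechanically. The sketch below, in Python for illustration, compares two sets of settings and reports what differs. The variable names and values are invented for the example and are not the actual flags or environment variables from my project.

```python
def diff_settings(passing, failing):
    """Compare two name -> value mappings and report how they differ."""
    # Present in the passing setup but absent from the failing one.
    missing = {k: v for k, v in passing.items() if k not in failing}
    # Present in the failing setup but absent from the passing one.
    extra = {k: v for k, v in failing.items() if k not in passing}
    # Present in both, but with different values: (passing, failing).
    changed = {
        k: (passing[k], failing[k])
        for k in passing.keys() & failing.keys()
        if passing[k] != failing[k]
    }
    return missing, extra, changed


# Hypothetical settings, invented for illustration only.
unit_test_env = {"LIB_PATH": "/opt/demo/lib", "MODE": "debug", "VERBOSE": "1"}
sample_env = {"MODE": "release", "VERBOSE": "1"}

missing, extra, changed = diff_settings(unit_test_env, sample_env)
print("only in passing setup:", missing)
print("only in failing setup:", extra)
print("different values:", changed)
```

Anything that shows up in the first or third group is a candidate for the thing that is not like the others.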
Step 5: Cut Away Everything that Isn’t an Elephant
There’s an old saw about how you carve a statue of an elephant – you start with a big block of stone and remove all the stone that doesn’t look like an elephant. Easy!
This is what I did – I methodically went through the makefile and removed, one piece at a time, everything I had added that didn’t affect the outcome. Why? Because this is sample code. If it were me, I would build the sample, run it, read the source code, then read the makefile. The makefile should be as simple as possible (but no simpler).
It also reminds me of a guy I know who as a teen had a transistor radio and he took it apart and saw a circuit board covered with resistors, capacitors, coils, and transistors. While the radio played, he clipped out individual components until the radio stopped playing, then soldered back the component that killed it. Eventually, he removed everything that wasn’t strictly necessary.
Same thing here. It was also a good way to make sure that I understood everything that was going on in that issue so that it (hopefully) won’t happen again.
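The clip-a-component-and-listen approach can be sketched as a small loop. This is a Python illustration under stated assumptions: the settings list and the `sample_runs` check are hypothetical, and in practice "does it still work" means rebuilding and rerunning the sample. A single greedy pass like this is a simplification; if settings interact with each other, it may take more than one pass to reach the smallest working set.

```python
def minimize(items, still_works):
    """Drop items one at a time, keeping only those whose removal breaks things."""
    kept = list(items)
    for item in list(items):
        trial = [x for x in kept if x != item]
        if still_works(trial):
            # The radio kept playing: this component was not needed.
            kept = trial
        # Otherwise the radio went silent: solder it back (leave kept as is).
    return kept


def sample_runs(settings):
    """Hypothetical check standing in for "rebuild and run the sample"."""
    return "--needed-flag" in settings and "NEEDED_VAR=1" in settings


# Hypothetical additions copied over from the passing setup.
added = ["--extra-a", "--needed-flag", "EXTRA_VAR=1", "NEEDED_VAR=1", "--extra-b"]
print(minimize(added, sample_runs))  # ['--needed-flag', 'NEEDED_VAR=1']
```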
Step 6: Document What You Did
I still need to do this. It would be nice if the makefile included not only the steps to build and run a given sample but information as to why it has each of the parts. I’ll do that tomorrow.
Can you find and fix any issue with these steps? No. But this is one approach you can take. I spoke to my spouse about this and she said, “wait, you made an algorithm for debugging an algorithm?” Essentially, yes.