Singularities in Software Design

Please note that all blog posts before 8 April 2007 were automatically imported from LiveJournal.  To see the comments and any LiveJournal-specific extras such as polls and user icons, please find the source posting at http://brianenigma.livejournal.com/2006/11/

Even with all of the work I’ve done with the Linux kernel and kernel drivers, there still are a number of large mysteries floating around out there. First, there is a large amount of code. It’s difficult to keep it all in your head, which is okay if you are working with a single module but sucks when you have to consider the interactions between modules. Second, there are things that are not defined in readily apparent ways.

Lately, I have taken to thinking about the whole thing as a black hole with an event horizon, to borrow some astronomical terms. Standing on the periphery, you can only see so much of the code. Because of its extreme size and (accidentally or intentionally) obfuscated syntax, you just can’t see past a certain point. You know that something is going on inside that singularity, and some of the results reach you, but you can’t see the inner workings of the mechanism. In the case of black holes, gravity is so strong that light can’t escape, and without the light, you can’t see the core. In the case of the Linux kernel, the code is so obscured in places (by indirection and macros) that you can’t see what’s going on.

For instance, there is a particular function used throughout the PCI subsystem (pci_read_config_word(), if I remember correctly). Because of a wiring error on our PCI data bus, this function was locking up (blocking on a data read) for us. As best as any of us could tell, this function just isn’t defined. It’s called, yes. It’s just not defined in any C code, assembly code, or compiler macros that we could find. Grepping for it only returns the places that call it, not a definition. This one remains a mystery, though no longer an important one now that we have fixed our data lines. Grepping doesn’t always work, as in the next example.
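One mechanism that produces exactly this symptom is the preprocessor's ## token-pasting operator, which builds a function name out of pieces so that the full name never appears in the source as a definition. Whether that is what hid pci_read_config_word() from us is an assumption, not something verified here. The sketch below only shows the general effect, and every name in it (DEMO_OP_READ, demo_read_config_*, demo_config_space) is made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device's configuration space. */
static const uint8_t demo_config_space[4] = { 0x86, 0x80, 0x34, 0x12 };

/*
 * One macro stamps out a whole family of reader functions.
 * The ## operator pastes "demo_read_config_" and the size token
 * together, so the text "demo_read_config_word" never appears
 * anywhere as a definition that grep could match.
 */
#define DEMO_OP_READ(size, type)                                        \
int demo_read_config_##size(size_t pos, type *value)                    \
{                                                                       \
    type result = 0;                                                    \
    size_t i;                                                           \
                                                                        \
    if (pos + sizeof(type) > sizeof(demo_config_space))                 \
        return -1; /* read would run off the end of the table */        \
                                                                        \
    /* Assemble the value from bytes, least significant first. */       \
    for (i = 0; i < sizeof(type); i++)                                  \
        result = (type)(result |                                        \
                 ((type)demo_config_space[pos + i] << (8 * i)));        \
    *value = result;                                                    \
    return 0;                                                           \
}

/* These three lines are the only "definitions" in the file. */
DEMO_OP_READ(byte, uint8_t)
DEMO_OP_READ(word, uint16_t)
DEMO_OP_READ(dword, uint32_t)
```

A caller can use demo_read_config_word(0, &vendor) like any ordinary function, but grepping this file for demo_read_config_word finds nothing. Grepping for the fragment "_config_##", or reading the preprocessor output from gcc -E, does turn it up.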

Another example is with the Flash chip drivers. There is effectively a variable that is defined and used by several pieces of code but is never assigned. Well, it is. But it isn’t. How about a pseudocode example?

#define SOME_CHIP_ADDRESS 0x0C000000
struct SomeStruct {
    uint16 value1;
    uint16 value2;
};
// A common structure used everywhere. malloc() leaves both fields uninitialized.
struct SomeStruct *myStruct = malloc(sizeof(*myStruct));

// ...lots of lines of code removed...

if (myStruct->value2)
    printk("Dilithium crystal buffer delay set to %d\n", myStruct->value2);

And that’s pretty much it. Keep in mind that the above code is simplified. The real code spans thousands of lines over several dozen files. myStruct->value2 is not assigned or referenced by name anywhere else, yet it always holds a good, expected constant value. Where does it get set? Nobody knows. Grep for it in the entire kernel tree, and it’s just not assigned anywhere. Well, that’s not exactly true. It does get set, but indirectly, from the contents of a chip at a hardware address:
memcpy((uint8 *) myStruct, (uint8 *) SOME_CHIP_ADDRESS, 4);
Unfortunately, you can’t just grep for all instances of myStruct because very nearly every line of code uses it. You might as well manually look through all of the code.

In retrospect, that memcpy is efficient and probably the best way to set the variables, but when you are looking for an assignment, it becomes a layer of obfuscation and indirection that works just like gravity in a black hole. Get enough of those and you have a singularity with an event horizon. You assume an assignment is happening somewhere. All empirical observations show that it is. None of the tools you have at your disposal can “see” it, because it’s happening just outside of your visibility. You’re used to the 99% of the code in the rest of the kernel that uses the equal sign to put values into structures.
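To make the effect concrete, here is a minimal userspace sketch of the same trick. It is not the real driver code: the chip is faked with an ordinary array (demo_chip_contents, a made-up name with made-up values) instead of a hardware address, and the helper demo_load_from_chip is hypothetical. The point is only that value2 ends up holding a sensible number even though no line contains "value2 =":

```c
#include <stdint.h>
#include <string.h>

struct SomeStruct {
    uint16_t value1;
    uint16_t value2;
};

/*
 * Stand-in for the chip's memory. In the real code this data lives
 * at a fixed hardware address; here it is ordinary RAM with
 * invented contents.
 */
static const uint16_t demo_chip_contents[2] = { 0x1234, 42 };

/* The copy below is only valid if the struct and the source match in size. */
_Static_assert(sizeof(struct SomeStruct) == sizeof(demo_chip_contents),
               "struct layout must match the chip data");

/*
 * Fills in both fields with one block copy. Searching for an
 * assignment to value1 or value2 will never lead you here.
 */
void demo_load_from_chip(struct SomeStruct *s)
{
    memcpy(s, demo_chip_contents, sizeof(*s));
}
```

After calling demo_load_from_chip(&s), s.value2 holds whatever the second 16-bit word of the source was. Note that the copy depends on the struct layout matching the chip's data byte for byte, which is one more thing you can't learn from grep.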

Fortunately, with the kernel and development tools, we can build better telescopes. With experience, we can learn that “grep -r myStruct . | grep memcpy” is a new way of searching for some of the more obscure stuff, now that we realize that memcpy on the containing structure is used. Maybe we’ll find a better telescope for looking at black holes, too.

