I have been writing C and C++ code for decades. Now and then I run across something new, typically an addition to the language: new libraries such as Boost, the standardization of the STL, and most recently C++11 with its cool stuff like smart pointers. It is very rare for something fundamental about the language itself to surprise me, but that is exactly what happened last week.
In C++, there are two ways of passing a variable to a function that you’d like the function to change. You can pass by pointer or by reference. Examples:
    typedef struct {
        int monkey;
        int donkey;
    } MyStruct;

    void MyFunctionByPointer(MyStruct *parameter)
    {
        if (parameter) {
            parameter->monkey = 1;
            parameter->donkey = 2;
        }
    }

    void MyFunctionByReference(MyStruct &parameter)
    {
        parameter.monkey = 1;
        parameter.donkey = 2;
    }
There are advantages to each method. Passing by pointer gives you the option of a “don’t care” route. A function can, for instance, take some action, the result of which can be optionally returned. Maybe the function returns a true/false value to convey success/fail and has an optional parameter for a more specific error string. If the caller doesn’t care about the specifics of the error, it can just pass a NULL. Passing by reference sometimes simplifies the code by having less pointer indirection. Until recently, I (and many coworkers) assumed it also assured you that the function always had a valid parameter. There was no NULL case, as in the pointer method. Surprisingly, this turns out to be wrong.
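As an aside, the optional error string idea might look something like the sketch below. The function HalveEvenNumber and its parameters are made up purely for illustration; the point is only that the caller may pass NULL for the error parameter when it doesn't care about the details.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example: halves an even number.
// Returns true on success and false on failure.
// 'result' is passed by reference, so the caller must supply it.
// 'error' is passed by pointer, so the caller may pass NULL to ignore it.
bool HalveEvenNumber(int input, int &result, std::string *error)
{
    if (input % 2 != 0) {
        // Only fill in the detail if the caller asked for it.
        if (error) {
            *error = "input is odd";
        }
        return false;
    }
    result = input / 2;
    return true;
}
```

A caller that only wants success or failure would write HalveEvenNumber(10, result, NULL), while one that wants the reason would pass the address of a std::string.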
Let’s start with this function:
    MyStruct SomeStaticThing;

    MyStruct *GetThing()
    {
        // 'today' and 'Monday' stand in for any condition that sometimes fails
        if (today == Monday)
            return NULL;
        else
            return &SomeStaticThing;
    }
Sometimes it returns a pointer to something valid. Sometimes it returns a NULL. It’s not an uncommon way for a function to behave.
Let’s then inspect this piece of code:
MyFunctionByReference(*GetThing());
This compiles without complaint in C++, although dereferencing a null pointer is formally undefined behavior. You might not know or remember that GetThing() can return NULL in some cases. Oddly, when it does return NULL, you get no immediate assertion, error, or segfault. Your code effectively becomes:
MyFunctionByReference(*(MyStruct *)NULL);
With g++, this doesn’t actually cause an error at this line. A reference is typically implemented as a pointer under the hood, so binding it just passes the address along without reading any memory. The flow of control successfully drops into MyFunctionByReference(). If it is a large and complex enough function, it may successfully run a dozen or two lines of code. It may even call sub-functions. It won’t actually segfault until you attempt to use the bad parameter.
So how do you protect against this sort of situation? I guess there are two ways. One is to protect the called function, similar to the by-pointer variant:
    void MyFunctionByReference(MyStruct &parameter)
    {
        if (NULL != &parameter) {
            parameter.monkey = 1;
            parameter.donkey = 2;
        }
    }
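This method seems kind of dumb, though. The ampersand is really meant to convey a non-NULL value, whether or not it actually evaluates down to NULL because of a glitch higher up in the call stack. It is also fragile: since a NULL reference can only come from undefined behavior, the compiler is allowed to assume the address of a reference is never NULL, and an optimizing build may remove the check entirely. A far better method would be to protect within the caller, such as this: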
    MyStruct *value = GetThing();
    if (value)
        MyFunctionByReference(*value);
This lets you handle the anomaly at the point where it actually occurs, at the same layer of abstraction, rather than down in a subordinate function.
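If the same check shows up at many call sites, one option is to fold it into a small helper that sits at the caller's layer. The following is only a sketch: CallIfValid is a hypothetical name I made up for illustration, and the struct and by-reference function are repeated here just so the snippet stands on its own.

```cpp
#include <cstddef>

typedef struct {
    int monkey;
    int donkey;
} MyStruct;

// Same by-reference function as before; it assumes a valid object.
void MyFunctionByReference(MyStruct &parameter)
{
    parameter.monkey = 1;
    parameter.donkey = 2;
}

// Hypothetical helper: checks the pointer BEFORE dereferencing it, so a
// NULL reference is never created. Returns false if the call was skipped.
bool CallIfValid(MyStruct *value)
{
    if (!value) {
        return false;
    }
    MyFunctionByReference(*value);
    return true;
}
```

With this in place, CallIfValid(GetThing()) either updates the struct and returns true, or returns false and leaves it to the caller to decide what a missing thing means.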