On pointers and references

I have been writing C and C++ code for decades. I still run across new things now and then, typically additions to the language or its ecosystem: new libraries like Boost, the standardization of the STL, and most recently C++11 with its cool stuff like the auto keyword and smart pointers. It is very rare that something fundamental about the language itself surprises me, but something did just last week.

In C++, there are two ways of passing a variable to a function that you’d like the function to change: by pointer or by reference. Examples:

typedef struct {
    int monkey;
    int donkey;
} MyStruct;

void MyFunctionByPointer(MyStruct *parameter)
{
    if (parameter)
    {
        parameter->monkey = 1;
        parameter->donkey = 2;
    }
}

void MyFunctionByReference(MyStruct &parameter)
{
    parameter.monkey = 1;
    parameter.donkey = 2;
}

There are advantages to each method. Passing by pointer gives you the option of a “don’t care” route. A function can, for instance, take some action, the result of which can be optionally returned. Maybe the function returns a true/false value to convey success/fail and has an optional parameter for a more specific error string. If the caller doesn’t care about the specifics of the error, it can just pass a NULL. Passing by reference sometimes simplifies the code by having less pointer indirection. Until recently, I (and many coworkers) assumed it also assured you that the function always had a valid parameter. There was no NULL case, as in the pointer method. Surprisingly, this turns out to be wrong.
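
As a minimal sketch of that “don’t care” route, here is a made-up function (Divide is my own illustration, not from any library) that reports success or failure through its return value, takes the required result by reference, and takes the optional error string by pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical function for illustration only: divides a by b.
// The quotient is required, so it is passed by reference. The error
// text is optional, so it is a pointer that the caller may leave NULL.
bool Divide(int a, int b, int &quotient, std::string *error)
{
    if (b == 0)
    {
        if (error)  // only fill in the details if the caller asked for them
            *error = "division by zero";
        return false;
    }
    quotient = a / b;
    return true;
}

// Both calling styles: one ignores the details, one asks for them.
void ExampleUsage()
{
    int quotient = 0;
    std::string message;

    assert(!Divide(1, 0, quotient, NULL));      // the "don't care" route
    assert(!Divide(1, 0, quotient, &message));  // wants the specifics
    assert(message == "division by zero");
}
```

The caller that passes NULL never pays for building the message, and the function only has to guard the one parameter that is allowed to be missing.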

Let’s start with this function:

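// Placeholders standing in for whatever runtime condition applies.
enum Day { Monday, Tuesday };
Day today = Monday;

MyStruct SomeStaticThing;

MyStruct *GetThing()
{
    if (today == Monday)
        return NULL;
    else
        return &SomeStaticThing;
}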

Sometimes it returns a pointer to something valid. Sometimes it returns a NULL. It’s not an uncommon way for a function to behave.

Let’s then inspect this piece of code:

MyFunctionByReference(*GetThing());

This compiles without complaint in C++. You might not know or remember that GetThing() can return NULL in some cases. Strictly speaking, dereferencing a null pointer is undefined behavior, so the language promises nothing about what happens next. In practice, when it does return NULL, you typically get no immediate assertion, error, or segfault. Your code effectively becomes:

MyFunctionByReference(*(MyStruct *)NULL);

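With g++, this doesn’t actually cause an error at this line. Under the hood, a reference is usually passed as an address, so nothing has read from or written to address zero yet. The flow of control successfully drops into the MyFunctionByReference() function. If it is a large and complex enough function, it may successfully run a dozen or two lines of code. It may even call sub-functions. It typically won’t segfault until you actually read or write the bad parameter, and because the behavior is undefined, an optimizing compiler is not obliged to behave even that predictably.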

So how do you protect against this sort of situation? I can think of two ways. One is to protect the called function, similar to the by-pointer variant:

void MyFunctionByReference(MyStruct &parameter)
{
    if (NULL != &parameter)
    {
        parameter.monkey = 1;
        parameter.donkey = 2;
    }
}

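This method seems kind of dumb, though. A reference is meant to convey a non-NULL value, whether or not it actually evaluates down to NULL because of a glitch higher up in the call stack. The check is also unreliable: a null reference cannot exist in a well-defined program, so the compiler is free to assume the comparison is always true and remove it, and both g++ and clang can warn about exactly this. A far better method is to protect within the caller, like this: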

MyStruct *value = GetThing();
if (value)
    MyFunctionByReference(*value);

This lets you handle the anomaly at the point where it actually occurs, at the same layer of abstraction, rather than down in a subordinate function.
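
If the same caller-side check shows up in many places, one option is to wrap it in a small helper. The sketch below is only an illustration: CheckedDeref is a name I made up, and throwing std::invalid_argument is just one possible policy (an assert or an error return would work too). MyStruct and MyFunctionByReference are repeated so the snippet stands on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

typedef struct {
    int monkey;
    int donkey;
} MyStruct;

void MyFunctionByReference(MyStruct &parameter)
{
    parameter.monkey = 1;
    parameter.donkey = 2;
}

// Hypothetical helper: turns a possibly-NULL pointer into a reference,
// failing loudly at the call site instead of somewhere deep inside the
// called function.
template <typename T>
T &CheckedDeref(T *pointer)
{
    if (pointer == NULL)
        throw std::invalid_argument("CheckedDeref: null pointer");
    return *pointer;
}

// The valid case goes through; the NULL case is caught before any
// reference is ever formed.
void ExampleUsage()
{
    MyStruct thing = {0, 0};
    MyFunctionByReference(CheckedDeref(&thing));
    assert(thing.monkey == 1 && thing.donkey == 2);

    bool caught = false;
    try
    {
        MyFunctionByReference(CheckedDeref(static_cast<MyStruct *>(NULL)));
    }
    catch (const std::invalid_argument &)
    {
        caught = true;
    }
    assert(caught);
}
```

The pointer is tested before it is dereferenced, so no null reference is ever created and the compiler has no license to drop the check.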

Published by

Brian Enigma
