Monday, June 22, 2009

Taking out the trash

While working on a hobby project in one of my all-time favorite programming languages (standard C), I once again encountered an old problem: memory leakage. I hunted down the problem and fixed it, but it bothered me that I made this mistake, and it bothered me that it was so easy to make this mistake. In my second favorite programming language (Python), this would never have happened. Python features a built-in garbage collector. Why can't C have a garbage collector? The answer is, it can, but you just have to implement it.

Description of The Problem, The Pitfall
The common mistake happens when doing things like this:
int foo() {
char *p;
struct some_struct s;

p = (char *)malloc(NUMBER_OF_BYTES);
bar(&s, p);

if (...) {
if (... && ...) {
} else {
return -1;
return 0;

Clever folks will see that there's a memory leak at the "return -1" (forgot to call free(p)), but did you spot the other memory leak? That's right ... what I didn't tell you, is that function bar() also allocated some memory to store into a member of the struct, and it's not being freed anywhere because it's not obvious that it's using memory.

Use another language: C++
So, I can hear you say, use C++. C++ will call the destructor when the local variable goes out of scope. This only partly solves the problem. In practice, I've learned to hate C++ because it makes things even less obvious than before, causing more problems than solving them.

Reference counting
How does Python pull it off? In essence, it's simple — it uses reference counting. Reference counting means that when you point a pointer variable at a piece of memory, you are referencing this object, and so a counter is incremented. When you stop using this object, the counter is decreased. Once the counter hits zero, no pointer variable is tracking this space anymore, meaning that this object has become useless, and therefore OK to clean up. Hence, the object is being garbage collected. Note how the object can be cleaned up immediately, and that there is absolutely no need for a separate thread to actively scan memory to collect bits of unused memory. I have no idea where this myth stems from (the Java world?).

Mimicking the interpreted language
Reference counting can be done in standard C too, but it's actually quite hard to do it right. Every single time you assign a pointer variable, you have to increment the associated reference count. This is doable, and you can even write a set_xxx() wrapper function or macro to do it for you. What's much harder, is to unreference the object when you let go of it. You are bound to forget to do this correctly just the same as pairing malloc()/free() calls was too hard to get right. The only reason why (byte-code) interpreted languages like Perl, Python, PHP, Java, etc. do get it right, is because the outer layer that really drives the language is doing all assignments and keeping track of the references, rather than the programmer trying to track things from the inside. (Or am I mixing up "inside" and "outside" here, are you still with me?!?!)
I think it was Guido himself who once said that if you would get all the reference counting with objects right in C, you'd practically already be programming in Python.

Tracking mallocs
What if you'd register all malloc()ed memory to a Memory Manager, and replace C's standard "return" statement with a macro "Return" that first visits the Memory Manager to see if there's anything left behind to clean up.
Sounds great, but how can it see the difference between a temporarily allocated buffer, and an object that was allocated to live for the rest of the lifetime of the process? This is exactly the core of the problem — we use the same malloc() for two different things; we want temp memory that is somehow magically destroyed at the end of the function, and we also would like to allocate memory because our program needs it. (NB. There may be a situation where an object may be temporarily, but not quite ... although I can't say I can think of such a situation right now). The solution to this problem would be to implement a tmp_malloc() function that is used for temp variables (like local variables), and have it track this memory so that a "Return" statement can clean it up. This actually works, but in practice, a lot of duplicate code is needed because you want to able to call malloc() in one situation, and tmp_malloc() in another. A real slick trick to solve this, is to make Malloc a pointer to a memory allocating function, and have it point to either malloc() or tmp_malloc(). All the programmer needs to do now is decide whether he wants to temporarily use memory, or assign it for a longer period of time.

Implementing the solution
The solution with the Temp Memory Manager and the Malloc pointer to the allocator function is so exotic that I have not implemented it yet. Do I really want to write this kind of crazed code, just to keep from making small mistakes? One could just as easily label local variables as such, and alert the programmer to double check that this space needs to be free()d.

#define LOCAL

int foo() {
int a;
LOCAL char *p; /* <-- indicator, just to alert the programmer */

p = (char *)malloc(SOME_BYTES);

Here, the empty, seemingly meaningless #define alerts the programmer to double check this situation, and the mistake of forgetting to call free() is easily avoided. No need for in-program memory managers and self-inspection, only one simple empty #define and your common sense.

We've taken a shot at trying to find a way to do garbage collection in standard C. We've found reference counting, and thought up a way to free all 'locally' allocated memory. In the end, we're probably better off tagging potential leaks as such and leaving it up to the programmer to do his job right. While this non-technical solution may not appeal to everyone, at least we're not going for the technical solution for the sake of having a technical solution. Sometimes, "good is better than optimal".

I may still implement the technical solution, just for fun, of course.