Sunday, November 11, 2012

A Pool Allocator or The (un)importance of Code Optimization

In school I learned to use dynamic memory allocation, and to use linked lists to keep track of dynamically allocated objects. The idea was that arrays are static and fixed-size, while linked lists can virtually limitless continue to grow dynamically. The linked list also doesn't use memory that it doesn't need, and there are some nice performance characteristics when inserting and removing items from a linked list.

The malloc() library function takes memory from the heap, which is a region of memory. Whenever malloc() runs out of heap space, it will call sbrk() to grow the heap by asking the operating system kernel for extra memory pages. Modern operating systems actually use mmap() instead, which works by virtue of virtual memory and demand paging. It's all pretty sophisticated.

How different were things for classic arcade game machines, tiny computers with very small system memory. There was practically no operating system at all, it was just the game code itself running on the bare metal. Arcade games like Space Invaders and Galaga did not have the luxury of a malloc() library function. They were not even written in C, but in the machine's assembly language. The way to allocate monsters and bullets was to use an array in the bss segment—an area of system RAM memory sitting just outside where the program code was loaded, or rather, copied from eeprom to RAM when the machine was switched on.

Things were so much simpler then, but even today I like allocating monsters and bullets from a ‘free’ list backed by a statically allocated array. Such a pool allocator gives a very clear picture of what is going on and how things work.

Even though the pool allocator has a hard maximum on how many objects may be allocated, it will never deny you memory if you stay within that limit. No other task in a multitasking operating system can steal this memory away from you, making it a very robust memory allocator.
In terms of performance, the pool allocator always completes in a fixed time, which is important for games that need constant performance.

Discussions about memory usage and performance are becoming pointless in this day and age, when we have massive amounts of memory and CPU power at our disposal even on mobile devices. As computers before were frustratingly slow machines, you had to push the limits to get the most out of it. Writing efficient code was the challenge, and every time you managed to squeeze out a few cycles more it was an achievement well earned.

Nowadays this hardly applies anymore. Although I have to say, lately I had some scripting code for doing large data processing. I rewrote it in C++ with multi-threading, and boy, did that make a difference in run time.