Sunday, November 30, 2014

Memory Management: Pool Allocator

I want to get back to game programming, so I had another look at the quad-tree. I did a post on quad-trees in August last year. Remember that it's a technique for grouping objects together by chopping up 2D spaces into four quarters. Each interesting quarter is again divided into four parts. This basically goes on until max_depth is reached. The purpose of this space partitioning is grouping objects for the sake of collision testing; objects that are near each other may collide and are worth testing for collisions. Objects that are not near each other can not possibly collide, so we skip those when collision testing.

Having lots of moving objects, it's easiest to construct the quad-tree on every frame, or game tick. Now imagine that they are slow moving objects. If the objects are moving slowly, chances are that the quad-tree will look exactly the same at every frame. So why would you spend all that time building up and tearing down the same quad-tree on every frame? Unfortunately, we have to, and trying to bend an existing quad-tree into shape is much more difficult than simply rebuilding it from scratch. But we can optimize in another way.
The process of rebuilding the quad-tree takes a lot of malloc() and free() calls, and they are not cheap functions. If you want 60 fps, you will want to spend your time better than calling malloc() and free() all the time.
Rather than trying to reuse the entire quad-tree, let's reuse the timber that it's made of. Let's grab the objects that represent the quad-tree nodes and leafs from a pool allocator.

A pool allocator is simply a stack of pointers to objects in memory. Rather than freeing the memory, you push the object onto a stack. Next, when you need to allocate an object, you pop it from the stack. If the stack runs out of memory, you simply allocate a new object using malloc().
A pool allocator only allocates objects for a specific data type. All the objects in the pool are of the same type and size. By no means try to make it a generic allocator, or you'll be writing you own fragmenting memory manager.

In C, you might have:
static QuadTree *qt_pool[MAX_QT_POOL];
static int qt_pool_idx = 0;

QuadTree *qt_alloc(void);
void qt_free(QuadTree *);

Things get quite interesting when you move to C++. First of all, C++ has a std::vector class that you can use as a dynamic array, meaning that you can grow the pool as large as is necessary without having to worry too much about what the exact limit should be.
Secondly, C++ has template classes for generic programming. A templated Pool class can work with all kinds of types: QuadTrees, Monsters, Bullets, Explosions, etcetera.
Thirdly, C++ classes have constructors and destructors. In order to do this right, the pool allocator must properly construct and destruct objects when they are allocated and freed. In C++, placement new takes some already allocated memory, and calls the constructor for it. And in case you're wondering, manually calling a destructor requires no special trickery.
template <typename T>
class Pool {
public:
    // insert boilerplate here ...

    T* alloc(void) {
        if (v_.empty()) {
            // allocate new object
            return new T();
        }

        // take last object in pool
        size_t idx = v_.size() - 1;

        // use placement new to construct
        T* t = new (v_[idx]) T();

        // remove item from vector
        v_.erase(v_.begin() + idx);

        // return object pointer
        return t;
    }

    void free(T* t) {
        v_.push_back(t);
        // destruct t
        t->~T();
    }

private:
    std::vector<T*> v_;
};
This code works with raw pointers, and you can get philosophical and argue that these should be std::unique_ptr<T> objects, that are std::move()d around. And with C++14, you can use std::make_unique<T>().
[I didn't bother because personally, I'm no fan of std::unique_ptr. I think the syntax is horrid, and I wish they would have invented a new keyword instead for strong referencing pointers.]

Growing the vector means doing implicit memory allocations, something that we were trying to avoid in the first place..! Luckily, this will only happen the first time around; the pool will grow to a certain size and then stay that size as objects start being reused. An optimization would be to reserve chunks of space in advance, so that the vector need not be resized as often.
You can avoid working with pool arrays altogether, by chain linking objects to a free-list. This requires the objects to have a pointer member though (to make the chain link), and this breaks the generic nature of the Pool template class that is shown above. It would be an elegant solution for plain C however.