Monday, July 30, 2012

oolib: link with -loo

Just when it started to look like July was becoming the ‘month of Go’, I made a dramatic shift to C++. I don't even like C++ very much, but in fact Go made me go back to defeat an old enemy. Once upon a time I decided that programming in C and C++ should be easier [than it is] and set out to develop a library that gives you safe strings, automatically growing and bounds-checked arrays, and dictionaries (associative arrays) in C. The holy grail of programming: the simplicity of Python but with the power of C. It turned out to be harder than I imagined, and resulted in thousands of lines of library code that were never meaningfully used. At some point I realized that such a beast is better implemented in C++ than in C, and continued the holy mission, only to fail. And then came the intermezzo with Go.

Go is a funny language. There is just something about it that makes you think differently about programming. Go doesn't care about object orientation; it only matters whether a struct (or any other type) implements the right methods to satisfy an interface. Another thing Go does right is multithreading and communication using channels. It makes things easy. A lot of things in Go are easy, yet powerful.
It does have its quirks: I don't find Go code as easy to read as it should be. Go code typically has a lot of typecasts (or rather, type conversions) because of its strict typing. And not totally unimportant: my colleagues don't like Go because they don't know it and stubbornly don't want to know it. That may be their problem, but it's also a problem of Go.

Back to libobjects, which by now had three implementations in plain C and two others in C++. But something was wrong with it, and that something was the complexity of loose typing. The whole library leaned on the Objective-C-ish idea that everything is a reference-counted Object, and that arrays and dictionaries should be able to hold any type, just like they do in Python and in Objective-C. But looking at Go, it seems you can do perfectly well without loose typing. In Go, arrays, maps, and channels are all defined explicitly for a single type. And that's okay; things only get better with strict typing.

I took the golden advice of a co-worker that the library should be implemented using the STL. I don't like writing large programs using the STL, but it's chock-full of convenient standard templates that are a perfect fit for the job.
So for oolib, the new incarnation of the object library, I settled on C++'s template syntax: Array<int> is an array of integers and Dict<String> is a dictionary of strings. Add the Python string interface and stir in a bit of goroutines and channels, and oolib looks pretty nice. Just build with a C++ compiler and link with -loo.
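
To give an idea of what "one container, one element type, built on STL" can look like, here is a minimal sketch. It is not oolib's actual code: the class bodies, the method names (append, len, has_key) and the use of std::string instead of oolib's String are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch, NOT oolib's real code: a growable, bounds-checked
// array of exactly one element type, built on std::vector.
template <typename T>
class Array {
public:
    void append(const T& value) { items_.push_back(value); }
    std::size_t len() const { return items_.size(); }

    // vector::at() throws std::out_of_range instead of reading past the end
    T& operator[](std::size_t idx) { return items_.at(idx); }
    const T& operator[](std::size_t idx) const { return items_.at(idx); }

private:
    std::vector<T> items_;
};

// Hypothetical sketch: a dictionary from string keys to one value type,
// built on std::map.
template <typename T>
class Dict {
public:
    // Like std::map, indexing a missing key creates a default value.
    T& operator[](const std::string& key) { return items_[key]; }
    bool has_key(const std::string& key) const { return items_.count(key) != 0; }
    std::size_t len() const { return items_.size(); }

private:
    std::map<std::string, T> items_;
};

// Small demonstration: returns true if an out-of-range index was caught.
bool demo_bounds_check() {
    Array<int> a;
    a.append(10);
    a.append(20);
    assert(a.len() == 2 && a[1] == 20);
    try {
        a[2] = 30;  // one past the end
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

The strict typing comes for free here: an Array<int> simply cannot hold a string, and the compiler says so, which is the Go-like behaviour described above.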

Behind the scenes, oolib leans heavily on shared_ptr. The shared_ptr is a shortcut for having reference-counted objects, but what's really strange is that you can't write efficient C++ code without it. Now consider that shared_ptr did not exist when C++ first entered the scene; it only became part of the standard library with C++11.
A consequence is that when you assign an array variable to another array, they both point at the same backing store. That's exactly what happens in Python too, so for now I'll leave it that way.
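
The aliasing described above can be sketched as follows. This is a hypothetical illustration and not how oolib is actually written: SharedArray, refcount() and copy() are names I made up here, and the only assumption taken from the text is that the elements live behind a shared_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical sketch, not oolib's actual implementation: the array object
// is only a handle; the elements live in a std::vector owned by a shared_ptr.
template <typename T>
class SharedArray {
public:
    SharedArray() : store_(std::make_shared<std::vector<T>>()) {}

    void append(const T& value) { store_->push_back(value); }
    std::size_t len() const { return store_->size(); }
    T& operator[](std::size_t idx) { return store_->at(idx); }

    // Number of handles currently sharing the backing store.
    long refcount() const { return store_.use_count(); }

    // An explicit deep copy, for when aliasing is not wanted.
    SharedArray copy() const {
        SharedArray result;
        *result.store_ = *store_;  // copies the elements, not the pointer
        return result;
    }

private:
    std::shared_ptr<std::vector<T>> store_;
};

// Returns the value seen through `a` after writing through its alias `b`.
int demo_alias() {
    SharedArray<int> a;
    a.append(1);
    SharedArray<int> b = a;  // copies the handle, not the elements
    b[0] = 42;               // visible through a as well
    assert(a.refcount() == 2);
    return a[0];
}
```

Assignment is cheap because only the pointer is copied and the reference count goes up by one. The price is that a write through one variable shows up in the other, exactly as in Python, so a separate copy() is needed whenever you want an independent array.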

Performance-wise, oolib delivers what you expect it to: it's faster than Python, faster than Go, and a fraction slower than pure C code.

A snippet of oolib code (that shows only a few features):
#include "oolib.h"

using namespace oo;

int main(int argc, char *argv[]) {
    if (argc <= 1) {
        print("usage: listdir <directory>");
        return 1;
    }

    Array<String> a;

    if (listdir(argv[1], a) < 0) {
        perror("listdir");
        return -1;
    }

    print("%v", &a);

    foreach(i, a)
        print("%v", &a[i]);

    return 0;
}

Finally, it raises the question of whether this lib will be meaningfully used. Only time will tell. There are probably a zillion personal libs like this one out there, and I'm glad to have mine. Mission accomplished.