Sunday, November 23, 2008

Networking, routing, and bit masks

Last week I got a funny little assignment through a colleague from the Networking department. They have a huge set of routes configured on a machine at a major internet exchange in Amsterdam, and they wanted a program to optimize the routes. Optimizing the routes is not only important for performance reasons, but also because of money: some connections are more expensive than others.

This assignment is interesting for a number of reasons:
  • I feel like I haven't coded anything useful for ages
  • I will not be writing the code, but designing it together with a co-worker, who will do the actual implementation
  • Someone will actually be using this
  • This will save the boss money (we could've been rich ...!)
  • The internet might benefit from this, implicitly pleasing many people (the good folks at home)
  • It involves some good old-fashioned bit-mucking about (and I love it!)
So, what does an internet route look like? Well, you've got the network address in an IPv4 decimal dotted quad notation, and the destination to route to also in IPv4 decimal dotted quad notation. So:

Network address Destination
10.0.1.0/24 172.16.3.91
10.0.2.0/22 172.16.4.1
10.0.0.0/16 192.168.2.2

Let's call these three routes A, B, and C. Now, let's pretend that using route A is more expensive than using route C. Anyone can see in this (pretty bad) example, that route C is such a large network, that it encompasses both A and B — the network address ranges of A and B both fit in C. Therefore, route A and B can be eliminated and we can cheaply use route C.

To be able to do this programmatically, one must realize that IPv4 addresses are 4 bytes. The decimal dotted quad 10.0.1.0 translates to 4 bytes (in hexadecimal notation): 0A 00 01 00, or the 32-bit word 0A000100.
I like writing this in hexadecimal because the network range '/24' translates to a bit mask; '/24' means the first 24 bits of the 32-bit word are 1s, giving a bit mask of FFFFFF00. Likewise, a /22 gives FFFFFC00 and a /16 gives FFFF0000.
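In C, packing the four octets into a 32-bit word is a one-liner. A minimal sketch (the function name is my own invention):

#include <stdint.h>

uint32_t quad_to_word(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
}

Calling quad_to_word(10, 0, 1, 0) gives 0x0A000100.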

Why is this important? Well, it shows how the internals of the internet work, and why it is so fast. For example, a packet destined for 10.0.1.18 has its address bitwise ANDed with route A's netmask (FFFFFF00); the result, 0A000100, equals route A's network address, so the packet matches route A in the routing table. The packet is consequently sent to the configured destination (in this case, 172.16.3.91).

Finding the cheapest route now only breaks down to this:
  • Find a route that has a smaller bitmask, but still matches the address
  • The route has to be cheaper in terms of cost, we don't like spending more money
We can see that route C fits these criteria for A and B:

10.0.1.0/24 => 0A000100 AND FFFFFF00
10.0.2.0/22 => 0A000200 AND FFFFFC00
10.0.0.0/16 => 0A000000 AND FFFF0000

0A000100 AND FFFF0000 == 0A000000, so route A fits in C. Likewise, 0A000200 AND FFFF0000 == 0A000000, so route B fits in C as well.

If we had a route D like 10.10.0.0/16, it would not fit in C:

10.10.0.0/16 => 0A0A0000
0A0A0000 != 0A000000, so route D does not fit in C.

As you can see, routing is simply bitmasking and a lookup table (the "routing table"). If there is one thing that hardware can do fast, it's bitwise AND operations.
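To give an impression, here is a minimal sketch in C of the containment check described above. This is not the program we are designing; the struct, the function names, and the cost values are all made up:

#include <stdint.h>
#include <stdio.h>

struct route {
    uint32_t network;               /* e.g. 0x0A000100 for 10.0.1.0 */
    uint32_t mask;                  /* e.g. 0xFFFFFF00 for /24 */
    int cost;                       /* made-up cost value */
};

/* build the bitmask from the /numbits slash-notation; numbits must be 1..32 */
uint32_t make_mask(int numbits) {
    return (uint32_t)(0xFFFFFFFFu << (32 - numbits));
}

/* route 'small' fits in route 'big' if big's (shorter) mask covers small's network */
int fits_in(const struct route *small, const struct route *big) {
    return big->mask <= small->mask &&
        (small->network & big->mask) == big->network;
}

int main(void) {
    struct route A = { 0x0A000100, 0, 10 };     /* 10.0.1.0/24 */
    struct route C = { 0x0A000000, 0, 5 };      /* 10.0.0.0/16 */

    A.mask = make_mask(24);
    C.mask = make_mask(16);

    if (fits_in(&A, &C) && C.cost <= A.cost)
        printf("route A can be dropped in favor of route C\n");
    return 0;
}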

I came up with a funny statement that creates the bitmask from the slash-notation:

((1 SHIFTLEFT (32 - numbits)) - 1) XOR FFFFFFFF

This is overly complicated. It is possible to look at it in another way; e.g. with a /16 all the left-hand side bits are set, and all the right-hand side bits are clear. So it is possible to start with all bits set, and shift the clear bits in from right to left:

FFFFFFFF SHIFTLEFT (32 - numbits)

Be mindful that this works only for 32-bit registers, otherwise the mask will overflow. You can prevent this by explicitly selecting the lowest 32 bits:

(FFFFFFFF SHIFTLEFT (32 - numbits)) AND FFFFFFFF

We can do even better than this. There is a cool operation named arithmetic shift right which is ideal for the job, but it is not present in all programming languages. The arithmetic shift right keeps shifting copies of the highest (sign) bit in from the left, which is exactly what we want:

80000000 ARITHMETICSHIFTRIGHT (numbits - 1)

Just for the fun of it, I thought about implementing this in assembly language. There are many possible implementations, depending on which CPU you are programming for; I picked a 32-bit ARM processor. You have to be a bit clever with the bit operations, especially on this architecture.
@
@ first solution, using (-1 SHL (32 - numbits))
@
LDR R0, numbits
MOV R1, #32
SUB R1, R1, R0 @ R1 is (32 - numbits)
MVN R0, #0 @ R0 is FFFFFFFF
MOV R0, R0, LSL R1 @ R0 becomes the bitmask

@
@ second solution, using an arithmetic shift right
@
MOV R0, #1
MOV R0, R0, LSL #31 @ R0 has only highest bit set
LDR R1, numbits
SUB R1, R1, #1 @ R1 is (numbits - 1)
MOV R0, R0, ASR R1 @ R0 becomes the bitmask
Note that the routing theory holds true for IPv6 as well, except that for IPv6, you have to work with 128-bit addresses rather than with 32-bit addresses.

Tuesday, October 21, 2008

Parallel programming with pthreads

Today, we had a little conversation about pthreads programming in the office. Every now and then the topic of pthreads seems to come up. Surprisingly, I'm one of the very few who actually have some hands-on experience with it. I remember that learning pthreads was difficult because there were no easy tutorials around, and I didn't really know where to start. There is a certain learning curve to it, especially when you don't really know what you're doing or what it is you'd want to be doing with this pthread library.

A thread is a "light-weight process", and that doesn't really explain what it is, especially when they say that the Linux kernel's threads are just as "fat" as the processes are.
You should consider a thread a single (serialized) flow of instructions that is executed by the CPU. A process is a logical context in the operating system that says what memory is allocated for a running program, what files are open, and what threads are running on behalf of this process. Now, not only can the operating system run multiple processes simultaneously (1), a single process can also be multi-threaded.

The cool thing about this multi-threadedness is that the threads within a process share the same memory, as the memory is allocated to the process that the threads belong to. Having shared memory between threads means that you can have a global variable and access and manipulate that variable from multiple threads simultaneously. Or, for example, you can allocate an array and have multiple threads operate on segments of the array.
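To sketch what I mean by dividing an array into segments for multiple threads, here is a minimal example of my own (not production code; compile with -pthread):

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4
#define N 1000

static int data[N];                     /* shared by all threads */

struct segment {
    int start, end;                     /* half-open range [start, end) */
};

static void *worker(void *arg) {
    struct segment *seg = (struct segment *)arg;
    int i;

    for(i = seg->start; i < seg->end; i++)
        data[i] *= 2;                   /* work on this thread's slice only */
    return NULL;
}

int main(void) {
    pthread_t tid[NUM_THREADS];
    struct segment seg[NUM_THREADS];
    int i;

    for(i = 0; i < N; i++)
        data[i] = i;

    for(i = 0; i < NUM_THREADS; i++) {
        seg[i].start = i * (N / NUM_THREADS);
        seg[i].end = (i+1) * (N / NUM_THREADS);
        pthread_create(&tid[i], NULL, worker, &seg[i]);
    }
    for(i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);

    printf("data[%d] = %d\n", N-1, data[N-1]);      /* prints 1998 */
    return 0;
}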

Programming with pthreads can be hard. The pthread library is a fairly low-level library, and it usually requires some technical insight to be able to use it effectively. After having used pthreads for a while, you are likely to sense an urge to write wrapper-functions to make the API somewhat more high-level.
While there are some synchronization primitives like mutexes and condition variables available in the library, it is really up to the programmer to take proper advantage of these — as is often the case with powerful low-level APIs, the API itself doesn't do anything for you; it is you who has to make it all work.

Programming pthreads is often also hard for another reason; the library enables you to write programs that preempt themselves all the time, drawing you into a trap of writing operating system-like code. This kind of parallelism in code is incredibly hard to follow, and therefore also incredibly hard to debug and develop. The programmer, or developer, if you will, should definitely put some effort into making a decent design beforehand, preferably including a schematic of the communication flow that should occur between the threads. The easiest parallel programs do not communicate at all; they simply divide the data to be processed among the threads, and take off.
It should be clear that the pthread library is powerful, and that the code's complexity is really all the programmer's fault.

While the shared memory ability of pthreads is powerful, it does have the drawback that when the programmer cocks up and a single thread generates a dreadful SIGSEGV, the whole process bombs (2).
Also, as already described above, pthreads has the tendency of luring you into a trap of creating parallelism that is not based on communication, creating overly complex code flows.
The automatic shared memory has the drawback that you may not always want to share data among all threads, and that the code is not thread-safe unless you put mutex locks in crucial spots. It is entirely up to the programmer to correctly identify these spots and make sure that the routines are fully thread-safe.
It is for these reasons that communication libraries like MPI, PVM, and even fork()+socket based code are still immensely popular. The latest well-known example is the Google Chrome browser, which forks off separate processes rather than threads; a crashed child process will not take the entire application down.
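To show what I mean by putting mutex locks in crucial spots, here is a minimal sketch of protecting a shared counter; without the lock, concurrent increments could get lost:

#include <pthread.h>

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

void increment_counter(void) {
    pthread_mutex_lock(&counter_lock);
    counter++;                          /* the critical section */
    pthread_mutex_unlock(&counter_lock);
}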

This blog entry has gotten too long to include some useful code examples. Therefore I will provide you with a useful link:

Skip right to the examples, unless of course you wish to learn some more on the theory of threads...  Note how this tutorial covers all you need to know, and cleverly stays away from the more advanced features of the pthread library, that you are not likely to need to know anyway.

If the C programming language is not your thing, try using the Python threading module. Although Python threads are not truly concurrent (the global interpreter lock keeps them from executing Python code in parallel), they appear very similar in use and behavior to pthreads.

Another interesting topic related to parallel programming is that of deadlock and starvation, and the "dining philosophers" problem. See this online computer science class material for more.


  1. Multiple processes or threads of execution can run concurrently on multi-processor or multi-core machines. On uniprocessor machines, they are scheduled one after another in a time-shared fashion.
  2. There are (or have been, in the past?) implementations of UNIX in which it's unclear which thread receives a UNIX signal. I believe Linux sends it to the main thread, i.e. the first thread that was started when the process was created. For this reason, it's wise to keep the main thread around for the lifetime of the process.

Monday, October 13, 2008

Why C++ is not my favorite programming language

Eric S. Raymond, a famous figure in the open source movement (1), wrote on his blog that he is working on a paper with the colorful title "Why C++ Is Not Our Favorite Programming Language". The title alone struck a chord with me, having had a general aversion to C++ for a long time now.

C++ is the object-oriented dialect of the C programming language, and was originally meant to be a "better C". C++ offers full "object-orientation", as it boasts classes, inheritance, exceptions, operator overloading, references, constructors, destructors, accessors, virtual member functions, templates, and more. All of this is then mangled by the compiler and eventually a binary comes out, just like when you compile and link regular C code.

I have personally written quite a lot of C++ code, but my latest C++ code dates back to the late 1990s. The reason: I was completely fed up with programming in C++ (and Java, that other object-oriented language), and switched back to plain C.

The C programming language is simple, and therefore very straightforward. C translates back to assembly language and machine code relatively easily, and because of this, if you know how a computer works, you will know how to write in C.
This is entirely not the case with object-oriented languages — object-oriented languages are on a higher level of abstraction and take the low-levelness away from the programmer. While this seems nice when you are working on a high level of abstraction, it makes things annoyingly difficult for the programmer who likes/wants/needs to understand what is really going on at a lower level.
Technically, this is a non-issue because in C++ the developer decides what objects look like, and when there is a need to do so, you can take a debugger and trace into those classes to see what is going on. In reality, C++ is making things harder rather than simplifying them.

Back in the day, C++ code always produced a binary that performed less well than a compiled and linked plain C program. The reason? The C++ program is calling constructor and destructor functions all the time, often consuming cpu for no real reason. Moreover, C++ has runtime type-checking that costs performance.
More than a decade later, compilers have advanced, and there are plenty of programs for which the degraded performance does not matter. Being also an assembly programmer (speed freak!), it annoyed the hell out of me to actually feel the C++ apps being sluggish on my old 80486 cpu.

For certain codes, like games, graphical user interfaces, or 3D modelers, it appears to be a good idea to use an object-oriented language, because you are already thinking in objects. Now replace that last word "objects" with "structures", and you've already taken the first step to implementing the same thing in plain, good old C.
While classes appear as useful "structures-on-steroids" (i.e. regular C structs with member functions), their C counterparts (i.e. structs with separate functions that operate on them) are easier to comprehend, if only because of the explicit use of pointers. While pointers are usually hard for a novice to comprehend, they are a must-have for the experienced programmer.
While C++ does not hide pointers, the preferred way of handling objects is by reference. A reference is like a hidden pointer that only confuses the journeyman programmer. In machine language, there is no such thing as a reference; there are only pointers.
I can already hear the crowd recite the old saying "pointers are the root of all evil", but in reality, "bugs are the root of all evil". The pointer is a very powerful asset that can prevent unnecessary copying of data.

As a master in C, I've actually managed to write object-oriented-like code in plain C. This is not surprising, because C allows you to create anything you like, even weird programs like compilers and operating system kernels. In fact, the first C++ compilers spit out mangled C code rather than object code.
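To give an idea of that object-oriented-like style, here is a minimal sketch of my own (the names are made up): a plain struct carrying a function pointer that acts as a "virtual member function".

#include <stdio.h>

struct shape;
typedef void (*draw_func)(struct shape *);

struct shape {
    const char *name;
    draw_func draw;                 /* the "virtual member function" */
};

static void draw_circle(struct shape *s) {
    printf("drawing a circle named %s\n", s->name);
}

int main(void) {
    struct shape circle = { "c1", draw_circle };

    circle.draw(&circle);           /* the C way of calling a "method" */
    return 0;
}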

They say C++ is good for teaching object-oriented programming. This was only true until Java arrived. I have experience with Java, and it's awful, but I guess it can be used to teach object-oriented programming.
What strikes me as odd is that C++ was not ready at the time when I learned it. I actually have C++ code that does not compile today without making some necessary changes, while it was all correct when I wrote it. This also holds true for very old C code (2), but I've never encountered this kind of issue with my own C code. (C does evolve too, as there is also C99 and the like. However, C99 is largely backwards compatible with older C).
Later, templates were added, and while this black magic was presented as the Next Big Thing, it never did anything useful for me. Templates are best used for implementing generic containers like lists and stacks, although it appears better to me to have a list of structs with pointers pointing to the object data (3).

Implementing good C++ classes is hard because they are supposed to be generic, abstract implementations. Often, there comes a time when the class turns against you and needs redesigning, leading to a major overhaul of the complete code. Proper design should prevent this, but in practice, implementing code is different from designing code and there will be frequent times when you curse at C++ for putting this on your head.

Even if a clever C++ supporter manages to undermine every statement I have made against C++ in this blog entry, I will still have the final word: I just don't like the C++ syntax.
Honestly, tell me which code fragment looks nicer:
std::cout << "Hello, world!\n"; // in C++
or
printf("Hello, world!\n"); /* in C */
In the C++ code fragment above, intuition would tell you that the direction brackets point in the wrong direction (!) Also, no one seems to know how to pronounce "cout" (4).
The "cout" function is actually being called without having instantiated an object of the "std" class, making this a very questionable example of "object-oriented" programming.
In fact, "cout" is not a function; it is more like the "stdout" global variable in standard C. The output is being produced via an overloaded operator function, for the shift-left operator. Funky? Yes. Fancy? Yes. But as to why a shift-left operator would have to print text, totally eludes me. The fact that printf() is a library function that, in the end, enters the write() system call, that I can understand.

My favorite object-oriented language is Python. But then again, I don't use its object-oriented capabilities much. The best things about Python are its clear, readable syntax, and its brilliant memory management (to the developer, there appears to be none, because all objects are reference counted and garbage collected). This costs performance, so for the more cpu-intensive stuff, I use C.
Either way, I prefer a procedural language over a weird, code obfuscating, mind bending object-bloated language that attempts to mask the natural program flow, any time.


  1. Eric S. Raymond is the author of various well-known open source programs, and he is the author of the essay The Cathedral and the Bazaar.
  2. The Lions' Book lists ancient UNIX kernel source code that contains old C statements that are no longer valid today.
  3. Actually, my old, now famous, bbs100 code uses a dirty (but clever and efficient) typecast trick to turn structs into linkable or stackable items.
  4. No one seems to know how to pronounce "Stroustrup", either.

Sunday, August 31, 2008

Fullscreen GL ...

Programming games is fun. Games typically run in fullscreen mode, so in today's world, it's important to have a portable way of setting a full screen mode. It looks like this is more of a hassle than you'd think it is.

I'm fond of SDL because it's quite a powerful library. There is an SDL_WM_ToggleFullScreen function that lets you switch between full screen and windowed mode. I noticed, however, that on the Mac the full screen mode of SDL would actually resize the application window to maximum size rather than take over the whole screen. The problem is that the Apple menu bar remains visible at the top of the screen, and the dock still pops up when you move the mouse to the bottom of the screen.
In Windows, I found that it helps to resize the window to the maximum resolution before trying to toggle to full screen.
The man page of SDL_WM_ToggleFullScreen does say that toggling to full screen is not supported on all platforms.
To my astonishment, there is a rather simple fix for this, but first I'd like to tell you about the little adventure I had with GLUT.

Because of the full screen problem I had with SDL, I decided to switch to GLUT and try that out. GLUT is real easy to use, because it feels much more high-level than SDL and you quickly get results with little effort. GLUT features a so called 'GameMode' in which it switches to a full screen mode. While this seems to work across platforms, I'm having an incredibly pesky issue in that the screen is no longer being updated — all animations and movement appear frozen although the display function is being called. It's like glutSwapBuffers is not swapping buffers anymore or so?
Switch back to windowed mode, and everything works perfectly again. It's not the obvious beginner's mistake of not re-initializing GL ... I don't know what it is, and I spent way too much time trying to get this to work. Searching the web also doesn't turn up much information, other than people saying that GLUT is buggy and you should use SDL.
What's also really odd, is that GLUT allowed me to set a 10000x3000 video mode on my 1440x900 monitor. SDL doesn't let you do weird things like that.

Exit GLUT, it's nice for trying out a quick scene, but for anything larger, I'm through with it.
Getting back to SDL, it turns out to be amazingly simple to enter full screen mode. Forget about SDL_WM_ToggleFullScreen, as it's useless. Use SDL_SetVideoMode instead:
SDL_SetVideoMode(max_xres, max_yres, bpp, SDL_OPENGL | SDL_FULLSCREEN);
The maximum resolution can be obtained by using SDL_ListModes. You should always use it, as a good program knows which video modes are supported and which are not.
You can switch back to windowed mode by omitting the SDL_FULLSCREEN flag, and mind to set the window back to its original size.
The beauty of all of this is that it appears to be portable, too. As to why anyone ever thought up a broken SDL_WM_ToggleFullScreen function, we'll probably never know.
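For the record, the fullscreen switch could look roughly like this; a minimal sketch, in which the bpp value and the fallback resolution are assumptions, and error handling is kept short:

#include <SDL.h>

SDL_Surface *go_fullscreen(int bpp) {
    SDL_Rect **modes;
    int max_xres, max_yres;

    /* ask SDL which fullscreen modes are available */
    modes = SDL_ListModes(NULL, SDL_OPENGL | SDL_FULLSCREEN);
    if (modes == NULL)
        return NULL;                    /* no fullscreen modes at all */

    if (modes == (SDL_Rect **)-1) {
        max_xres = 1440;                /* all resolutions allowed; pick one */
        max_yres = 900;
    } else {
        max_xres = modes[0]->w;         /* modes are sorted, largest first */
        max_yres = modes[0]->h;
    }
    return SDL_SetVideoMode(max_xres, max_yres, bpp,
                            SDL_OPENGL | SDL_FULLSCREEN);
}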

Tuesday, August 26, 2008

Doomed to code

I read an interesting interview with John Carmack, lead programmer at id Software, famous (1) for creating the game DOOM. In the interview, he talked a little about his game engines with the cool-sounding code names "Tech 4", "Tech 5", "Tech 6", etc. To my surprise, there were some very negative comments on this article: people complaining that Carmack is not doing anything innovative at all, that he is recreating the same old game over and over again, and that he is obsessed with his engine and not creating fun games any more.

This made me realize that Carmack is, above all, a programmer, and he is making his code evolve. This is what technical programmers do and what they live for.
In the early 1990s he had a lot of success with his DOOM project, and all he wants is to impress people with a better, faster, cleaner, leaner, more technologically advanced version. Likewise, in the early 1990s, I had a lot of success (although never as much :-) by doing a lot of programming on a BBS (2) and I developed several iterations of this BBS. You wouldn't believe the amount of time spent thinking up new ways of doing the internals and tinkering with the code.
Not only is programming addictive, so is having success. Trying to recreate that past success can be a great driving force. Deep inside of me there may always be the wish to create a new implementation of the BBS. John Carmack has this dream too, and he's earning a living with it.

To my surprise, Carmack is not merely striving for visual perfection. He has recognized that today's popular gaming console (read: the XBox 360) contains two-year-old hardware and therefore needs simpler graphics (like, for instance, tricked shadows rather than compute-intensive real shadowing effects) to get more performance.
Kiddies that are complaining that Carmack is only talking about performance, simply don't get it. The performance of the rendering engine is a very important aspect of a game.
The "60 Hz versus 30 Hz" debate is hyped, but not entirely unimportant; having a 60 Hz engine sounds more technologically advanced, however, it sounds to me that a 30 Hz engine would have tons more time per cycle left for doing other things, like AI. Good AI is what makes games fun to play.

One last thing that raised my interest was what Carmack said about mega-textures, a technique he developed that makes it possible to define the world in one giant unique texture, rather than to build the world from repeating textures.
The interesting thing is that he said "it is only two pages of code"; he is now talking casually about what was presented a year ago as his big new thing. Now that he's been working with it for many months, it's not that special to him any more. This behavior is typical for programmers (I do it all the time).
Funnily enough, a lot of cool code snippets are only one or two pages.
It is also a jab at other game developers; the mega-texture technique is apparently not that hard to implement and other knuckleheads should have been able to find out how by now, too.

John Carmack is still a hero in my book, even if id Software's games are not as shockingly groundbreaking as the original DOOM. It's not like the guy gathered fame for no reason at all, (Paris Hilton comes to mind ...) and as long as he keeps revealing technical details about the inner workings of his engines, he's got my attention.
As for the complaining kids out there ... it's just like what I always told the users of the BBS: anyone who dares complain should go and write their own. And one day, I even followed my own advice.


  1. Carmack is like a celebrity, lately doing more interviews than a movie star, and spending more time writing keynote presentations than writing code.
  2. BBS: Some kind of (now ancient) online service featuring a message board with forums, and interactive chat.

Saturday, May 31, 2008

Optimizing an OpenGL star map

It's time for some fun stuff and to write about programming. A while back, I was working on a way cool 3D OpenGL space thing. It had lots of stars, the orbits of all major planets (and their moons!), and you could fly around through this 3D space. The only drawback was that I'm still relatively new to 3D programming and it was *hard* to get things right. So, after a break of a couple of months, I took some of the data and turned it into a 2D star map.

Graphics programming is fun, but even in 2D it is not always easy. As always when programming, there are many ways to accomplish what you want, but you should try and find the most optimal way. Even with great NVIDIA hardware in your machine, your program will be slow if you don't optimize.

The reason I have to optimize the star map code is that it features over 300,000 background stars. The stars are plotted as GL_POINTs. The magnitude, or brightness, of each star is also taken into account.
Another reason is that it features some planets. I like drawing planets with many polygons so they don't look blocky. Having many polygons makes it go slow. Any modern 3D game has a high polygon count, so how do they do that?

Optimizing the stars
Today's hardware can output a staggering amount of pixels per second, but dealing with 300K+ stars was just too much for my poor old 6800 GT. (One of the bad things of speedy hardware is, no one tries to write good code any more). Of course, it is amazingly stupid to draw 300,000 stars when you can only see a couple of hundred in the viewport at once. One of the most annoying things of OpenGL is that it has this cool viewport that doesn't seem to do anything for you; it doesn't cull, it doesn't scissor, this is all left to the programmer as an exercise. The reason for this is that OpenGL is not a magical tool and will not do anything right unless you get it right. There are many ways to determine what is visible and what is not, and it greatly depends on what kind of program you are making and how your data is laid out. In fact, the layout of the data greatly determines whether the program will be fast as lightning, or slow as a donkey.

For the star map code, I decided to chop the entire space up into sectors. A sector is like a square on a map with longitude and latitude lines. Only a few sectors will be visible at any given time, so only a limited number of stars have to be drawn. The chopping-up of the star data file was done by writing a small python script. Running the script on this large input took a while, but remember that when you are preprocessing, you have all the time in the world. When you are rendering frames in realtime, that's when you don't have that luxury. Data partitioning is a very fundamental way of making programs act fast on huge amounts of data. Google does it with their data, too.
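To give an idea of the partitioning, here is a minimal sketch of mapping a position to a sector index; the grid size and the coordinate convention are assumptions, not what the real code uses:

#define SECTORS_X 36                    /* e.g. 10 degrees of longitude per sector */
#define SECTORS_Y 18                    /* e.g. 10 degrees of latitude per sector */

/* x in [0, 360), y in [0, 180) */
int sector_index(float x, float y) {
    int sx = (int)(x / (360.0f / SECTORS_X));
    int sy = (int)(y / (180.0f / SECTORS_Y));

    return sy * SECTORS_X + sx;
}

Once every star is filed under its sector, drawing boils down to looping over only the handful of sectors that overlap the viewport.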

In full screen, it still draws a considerable amount of stars. This results in a large amount of calls to OpenGL functions, and the sheer number of function calls is what is making the program slow down at this point. Luckily, there is a way to make it better. You can point OpenGL to an array of vertex data and tell it to render the entire array in one go.
Something like this:
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_COLOR_ARRAY);

/* point to 2D coordinates */
glVertexPointer(2, GL_FLOAT, 0, sector->star_coordinates);

/* each star has RGB colors */
glColorPointer(3, GL_FLOAT, 0, sector->colors);

glDrawArrays(GL_POINTS, 0, count);
Do this for every visible sector, and it's blazing fast.
In my code, I still have to manipulate the vertices (star coordinates) because you can scroll the screen. However, I think it is also possible to keep the data completely static and have OpenGL look at it from a different position (moving the camera). This may seem like an odd approach for a 2D code, but it is "the OpenGL way".

Optimizing the planets
For drawing planets I take the easy way out and use a quadrics object, a gluSphere. A quadric is really a 3D object, so it's odd to use it in a 2D program, but on the other hand it looks kind of cool, too. I don't know exactly what OpenGL does with it, but what I learned is that it is common practice to compile the quadric into a display list.
From the top of my head:
display_sphere = glGenLists(1);
glNewList(display_sphere, GL_COMPILE);
gluSphere(sphere_quadric, radius, slices, slices);
glEndList();
Now draw the sphere:
glCallList(display_sphere);
The display list now has a compiled-in radius. Unfortunately, not all planets are the same size. Therefore, the sphere has to be scaled to match the size of the planet. This is not without consequences: when you scale objects in OpenGL, their normals are scaled as well. The normal vectors are used by OpenGL to compute the correct lighting on the object. Planets without shading don't look nice, so after scaling, the normals of every vertex (of the many-sided sphere) have to be recomputed.

Now, because this planet is not going to change shape, it makes no sense to compute this object over and again for every frame. You can compute it once at program startup and keep a separate copy for every planet. This takes up more memory, but it will be fast.
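One way to keep that separate copy per planet is to compile one display list per planet, with its radius passed directly to gluSphere, so no scaling (and no recomputing of normals) is needed at draw time. A minimal sketch of my own; planet_radius[] and NUM_PLANETS are made-up names:

#include <GL/gl.h>
#include <GL/glu.h>

#define NUM_PLANETS 9

extern double planet_radius[NUM_PLANETS];

GLuint planet_list[NUM_PLANETS];

void init_planet_lists(GLUquadric *quad, int slices) {
    int n;

    for(n = 0; n < NUM_PLANETS; n++) {
        planet_list[n] = glGenLists(1);
        glNewList(planet_list[n], GL_COMPILE);
        gluSphere(quad, planet_radius[n], slices, slices);
        glEndList();
    }
}

Drawing planet n later is then just glCallList(planet_list[n]).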

In this blog entry I've given a couple of examples of how to optimize OpenGL programs. One problem that you will encounter is that optimizing sometimes means turning the code upside-down and inside-out. Optimizing performance can involve making major design changes. It is wise to draw things out on paper, and keep a couple of good copies of your code when trying new approaches.

Sunday, May 25, 2008

Hard experiences with Ubuntu Hardy

I already wrote about the buggy installation procedure of Ubuntu Hardy Heron. After a few weeks of using Ubuntu Hardy, I am close to throwing it off my system (!) It is simply frustrating; I seem to hit a bug every time I try to use it. The whole release is just a hodge-podge of "latest greatest" releases of software. There is no stability factor involved; the whole thing just feels like a "testing" release. I used Ubuntu Feisty and Gutsy, and in fact, I was happy with Gutsy. Maybe I should reinstall Gutsy, argh.

As said, every day I find another bug in Hardy. I have a talent for finding bugs in bad software, as I was a software tester in another lifetime. So, I decided to lend the Ubuntu project a helping hand and submit my bug reports to https://bugs.launchpad.net/ubuntu.
I hit bugs dating back as far as 2005 that were closed as invalid because the support guy said he needed more information. I quickly found three other instances of "I need more information", while the bug report was perfectly well described and clear to me.

From the top of my head:
  • dvorak keyboard setting is not being remembered. This little bug is three years old and was closed due to insufficient information. Lots of googling led me to the astonishing solution: GNOME gets its keyboard settings from Xorg. It doesn't matter what you configure (or try to configure) in the System|Preferences|Keyboard menu. The setting is really tucked away in /etc/X11/xorg.conf. Use sudo vi to change it.
  • Firefox 3 is beta, and it shows. Random crashes all over the web. Reporting the bugs won't help you, they are closed as invalid because they can not be reproduced. Yet everybody is experiencing these random crashes.
  • can't add bookmark in nautilus (the GNOME file manager). It turned out that at the time, my disk was full. Nautilus didn't report the error. Ubuntu support calls this a "user error", because my disk shouldn't be full. Hmm, Linux for human beings...??
  • The hardware testing application shows text that is too long to fit in the dialog, so you can not read the text. This is absurd. Who tested this?
  • The IT8212 chip in JBOD mode is not supported. You must build your own kernel and include the right module. A quick check in launchpad shows that this is a long-standing issue with remarks in the ticket like "invalid, need more info, have you tried this with a raid box?" etc. No! I'm glad I solved this issue by myself. I feel sorry for the poor dude who didn't know to build his own kernel.
  • The "Open With" function in nautilus is weird. It shows applications in the list that are no longer installed on the system. How do you open a file with an application that is no longer installed? (Actually, I had a total of four problems with this dialog box).
  • The menu editor, alacarte, doesn't edit the System menu. Or in fact, it does, but it doesn't show this to the user, as it doesn't update the screen properly. Another "invalid" bug from the dark ages of Breezy (now in the dark ages of Hardy).
  • Rhythmbox can't handle my music collection. I have a lot of mp3's, but come on! At startup it grinds and grinds and grinds... until I get fed up with it and kill it with -9. Die, die, you bad imitation of iTunes. (Note: iTunes is not a very good music player, either. But at least it plays music).
The music player is really a story by itself. My fav music player for all time, XMMS, has been obsoleted and pulled from the distro. The player exhibited major problems (stu-stu-stutter) after a change in the Linux scheduler, and the developer went totally nuts over it, flaming around that the whole world had gone insane except for himself. Consequently, XMMS was pulled from just about any distro you can think of.
There is an XMMS2, that I tried, but ... it sucks! As a matter of fact, there are a lot of music players for Linux, and they are all kinda sucky in one way or another. Wait, wait a minute. For the moment, I'm happy with mpd and Sonata. It's kind of weird design to have a daemon and a frontend for playing music, but it's a fun little player and at least Sonata won't grind my disk to pieces when it starts. Too bad the volume control in Sonata doesn't work under GNOME, but my guess is it'll be "invalid" and "need more information" if I report this bug.

Time to take a deep sigh and conclude this blog entry. The Hardy experience has not been a fun one so far. While the system is usable if you spend your days in a Terminal window, it clearly shows deficiencies when you actually try to use it. The support guys are not being helpful. Granted, the OS is free and comes without warranty, but then don't pretend to be a good OS. Ubuntu is not for human beings. Period. The Hardy release was done just to get an Ubuntu release out the door, not to deliver a good product. It's like the Vista of open source. The more tired I get of Ubuntu, the more I love MacOS. On the other hand, Linux panicked on me a lot less often.
Ah well, maybe I should try ArchLinux. Or maybe that distro also includes crappy GNOME, Firefox beta and more untested software. I long back to the days of debian and a simple fvwm2 desktop, maybe I'll try that.

Wednesday, May 7, 2008

Dvorak keyboard in SDL

A while back I wrote a blog entry about the dvorak keyboard on my laptop. Funnily enough, it won't work for SDL programs. SDL is the Simple Directmedia Layer library that allows for portable game development. I'm a huge fan of the SDL because it is simple, and it is very powerful too. I've made some pretty cool stuff using SDL, that I could not have made otherwise. Unfortunately, the SDL does not behave in a portable way when it comes to alternative keyboard layouts.

My personal experience:
  • In Linux, it works. I have a GNOME2 desktop with dvorak keyboard layout configured, and SDL programs get the correct keypresses.
  • In Windows XP, it does not work. SDL programs behave as if the keyboard was a qwerty keyboard.
  • In MacOS X, it does not work. It behaves like a qwerty keyboard.
Google turns up this solution, which does not work — at least not on the Mac, where I tested it:

In your initialization:

SDL_EnableUNICODE(1);

and then when you get a keyboard event: (From 'man SDL_keysym')

if(!(event.key.keysym.unicode & 0xff80))
ascii = event.key.keysym.unicode;
else
<deal with international characters>;

This method actually *works*, even with weird keyboard layouts like my custom swedish Dvorak variant. :-)
Let me repeat that this does not work, no matter what he says.

I wrote a conversion routine that maps the SDLKey keysyms from qwerty to dvorak, but this is really no solution. SDL should be made to use the platform-specific keyboard routines so that it correctly honors the configured keyboard layout.
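For the curious, such a remapping routine is nothing more than a big lookup. A minimal sketch covering only the home row (a real table covers the entire layout):

#include <SDL.h>

SDLKey qwerty_to_dvorak(SDLKey key) {
    switch(key) {
        case SDLK_a:         return SDLK_a;
        case SDLK_s:         return SDLK_o;
        case SDLK_d:         return SDLK_e;
        case SDLK_f:         return SDLK_u;
        case SDLK_g:         return SDLK_i;
        case SDLK_h:         return SDLK_d;
        case SDLK_j:         return SDLK_h;
        case SDLK_k:         return SDLK_t;
        case SDLK_l:         return SDLK_n;
        case SDLK_SEMICOLON: return SDLK_s;
        default:             return key;    /* pass everything else through */
    }
}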

I submitted this as a bug in bugzilla.libsdl.org, now let's see whether I got it all wrong or if someone is going to fix it.

Monday, May 5, 2008

Apple Airport Express: It just does not work

As you might know, I too fell for Apple's sleek-looking hardware and its apparently easy-to-use software. The company from California, run by the no-nonsense genius Steve Jobs, has had a good reputation (well, at least for the last couple of years, ever since the iPods and Mac OS X) for delivering high quality products. Sadly, the Airport Express is a major letdown. I would go as far as to say that Apple is selling a broken product and is lying to their customers about it.

The Airport Express is a wireless networking device which you can hook up to your home stereo to play your favorite music using iTunes. On Apple's website and on the packaging of the Airport Express, it is promised that you can use it in your existing wireless network. This is a downright lie, unless your entire network consists of Apple products only. That's right: the Airport Express will not play nicely with wireless equipment from other manufacturers. Oddly, this thing is getting rave reviews all over the web. Yes, it would be a great product ... if it would only just work.

The manual says setup is easy: just run the admin utility and the light on the Airport Express will turn green. Not true. One caveat is that you have to allow the MAC address of the device in your wireless router. It did get its IP address over DHCP and I got the light happy and green, but iTunes would not play "airtunes".
Eventually I hooked up an ethernet cable (direct link to my macbook!) and got it to play music, but it would not work wireless. After hours and hours of fiddling and resetting the Airport Express one more time, I decided to search the web some more.

What I found were some very disturbing messages on mailing lists, even on the Apple forums:

  • One guy wrote: "The Airport Express is a useless white brick with an amber light".
  • Another guy wrote, apparently frustrated: "What the frak is going on?!".
  • On amazon.com someone wrote: "Do not buy this product".
The article I am linking to ("Nothing but Problems: It just does not work!") is dated, but it also shows that it did not work in 2004 and it still does not work in 2008.

Some weblog writes that you can write some unsupported firmware into your Linksys router and make a bridged setup with the Airport Express. I sure am glad I did not try doing this ...
I returned the Airport Express the next day and got my money back. Sadly, I still can not play music from my macbook on my home stereo without plugging in a long audio cable.

Anyway, do not waste your time, do not buy this product!

Saturday, April 26, 2008

Hardy Heron: Quite Hard

The latest buzz in Linux land is the new stable (or Long Term Support) release of Ubuntu, also known as "Hardy Heron" or plain "hardy". Upgrading should be a breeze, but it still cost me a few hours of sleep, as it was 0:30 AM before I got the stupid thing to work. I am generally content with Ubuntu and I have seen systems where it just works; just never with mine. My main frustration is that the same issues were present two years ago, and haven't been fixed.

OK, Let's do a network upgrade; open the software update manager and it shows a new distribution release is available. Click "Upgrade".

A dialog window opens and it shows it is going to upgrade the system. Click to continue. The window goes blank. Nothing happens. I wait. There are no progress bars, nothing. It just hangs there doing nothing. Eventually I xkill the blank window and kill its left-behind siblings from a Terminal window. Any non-power user would give up at this point. I remember this headache from the last upgrade, and decide to continue in console mode with "apt-get dist-upgrade".

The command-line method isn't all that bad, except that it runs into a dependency problem at one point. I use "dpkg -i --force-overwrite" for one package and "apt-get -f install" gets it going again.

The upgrade process retains most settings, but sadly, it resets my dvorak keyboard to qwerty and I can't type a single command anymore until I hook up an old qwerty keyboard that I have lying around. Now, how to set the console keymap to dvorak again? It was something with loadkeys or so.

To my surprise, X-Windows starts, but sadly only in 800x600. Again to my surprise, a nice monitor config app pops up and I select my monitor brand and model (well, almost the right model; you know how they list 12 different models but never the exact one you own) and the 1440x900 resolution. The screen switches resolution and I click to accept this rez -- at which point it switches back to 800x600 (!)
I zap Ctrl-Alt-Backspace out of X and restart it. GNOME does not come up. My custom wallpaper is there, that's nice. Nothing else works, not even Ctrl-Alt-Backspace. I hit the reset button on the PC and pray it will boot Linux at all.

It does boot and after wrestling my way out of gdm (I hate that thing, it keeps coming back when you try to exit it) I am back on the command-line to reconfigure X, but first I put an "exit 0" in /etc/init.d/gdm to shut it up.
I run "nvidia-xconfig" but afterwards, X does not start! No screens found. So, I move the backup xorg.conf.backup file back, and when I try it, X runs in a nice high resolution. Yippee!

My brief moment of joy is soon over. When I open a Terminal window, I can't type a single command. WTF?! I play with the keyboard typematic rate settings, but it doesn't help. Oddly, there is an option in some tab of the keyboard config utility that makes your keyclicks respond dead slow. For some strange reason, this option had been turned on. (I wonder what the use of this option is; why would anyone deliberately want to break the keyboard like that?!) Uncheck the option, and I can TYPE again, praise the Lord!!

I like using a VGA font in the Terminal window. Oddly, the font file is still present in the fonts directory, but the system isn't using it. I run "mkfontdir" and "xset fp rehash" and I got my favorite font back.

My MP3s are stored on a disk that is behind an IT8212 controller (not in raid mode) that Ubuntu does not include in the kernel. Consequently, I always have to build my own kernels just to include this one module that enables Linux to access my MP3s. It is annoying, especially because Ubuntu configures the kernel to include nearly everything (ever seen a desktop system with an InfiniBand adapter??) but not the module that I need.

Building the kernel takes a long time, but afterwards it does work. One thing that also works better now is sound; I suffered from stuttering sound every now and then and this appears to have been solved in newer kernels by the snd_rtctimer module.

So much for the upgrade process. What else is new in Hardy? I haven't seen much of it, but ... Firefox 3 is nice. It is beta, but they say it is fast and has a smaller memory footprint than Firefox 2. I was happy to see that Flash works right out of the box.
I was less happy to see that the bookmark sidebar has gotten more ugly.
Speaking of ugly, I have album art icons on my MP3 folders in nautilus. This used to look quite nice, but it doesn't look as good anymore. You can tune it a bit in the Preferences, but somehow it just looks worse.

Anyway, overall I'm very happy with Ubuntu. I just wanted to show that it is not always as easy to work with as they say. Remember, Ubuntu is Linux for human beings. And I am only human,  after all.

Wednesday, March 5, 2008

OpenPGP smartcard

Due to my recent work in the field of security, I am now the proud owner of a so-called OpenPGP smartcard. The card looks like a regular credit card and has a chip on it. The card can be ordered from a German company called g10 and they will send it to you by (snail) mail. The card does not come pre-programmed; you need to get a "supported" smartcard reader (and writerrrr!!) in order to be able to load your personal GnuPG key onto it. (Why the card is named "OpenPGP" while it uses GnuPG, I've yet to find out).
Information on what smartcard readers/writers are "supported" and how to (supposedly) do it is described here.

Now, the bad news: Linux totally sucks here, simply because kernel development is going way too rapidly and third party applications and devices can't keep up. On top of that, the Linux documentation is outdated in some places, making it even harder to get things working. The information on gnupg.org is confusing, because some brands and types of smartcard readers need to be configured differently than others.

Some hints on getting it working:
  • get the SCM Microsystems device; it is the only one that does not require pcscd and is therefore a little bit less complicated to set up.
  • Install the needed libraries. On Ubuntu, "apt-get install opensc" will get the packages. It will recommend the pcscd daemon, but you don't need it if you have the SCM Microsystems device. For other kinds of devices, you will probably need the daemon (read below for more info).
  • When GnuPG can not access the card reader, it will output a message about the pcscd not running. Ignore this misleading error message if you have the SCM Microsystems device, and instead check with strace what USB path it is trying to open.
  • use lsusb to get the device ID and put this in /etc/udev/rules.d/50-gnupg-ccid
  • the udev script gnupg-ccid provided by gnupg.org does not work. The script gets called with environment variable DEVICE set to something like /proc/usb/001/001, but this path mystically disappears after the script exits. The real device is /dev/usb/001/001, but this character device does not yet exist at the time that the script is being run by udev, so it cannot set the group permissions (No such file or directory).
  • By the way, it helps greatly to use "logger" to debug the udev script.
  • Don't bother with hotplug or usbdevfs unless you are running an ancient Linux kernel for some obscure reason. Reading the hotplug documentation will only confuse you, so skip these parts if you don't need it.

If you are lucky, you will get it working after some time. If you are unlucky enough to have an older device or something other than the SCM reader (no, they are not paying me to promote it) you will have to install the pcscd daemon. This daemon is actually an interface between gpg and the device driver. You will have to download (and often, compile from source) the driver from the website of the manufacturer of the smartcard reader. Don't be surprised if you run out of luck at this point; as said, Linux tends to change quickly and third-party drivers tend to lag behind, so don't be surprised if the driver for kernel 2.6.5 doesn't build against your 2.6.24 setup.

Another important point is "To use PC/SC make sure you disable CCID by passing the --disable-ccid option to GnuPG".

If you don't succeed, cry, and bang your head against the wall.
If you do succeed, there is of course the sweet taste of victory!

Honestly, I've only partly succeeded in getting the card to work yet. There is much work to be done and there isn't enough time in a day. To do:
  • try to get it working on my Mac
  • there must be a way around the permission problem with the udev script
  • use gpg-agent with this thing
  • try to get the OmniKey reader to work with PC/SC (drivers for 2.6.stone_age, but maybe there are more recent versions out there)

The card is way cool, but the main drawback is that you can't use it everywhere; the card always needs a reader to be present, and the reader always needs the software to be installed on the system. And especially that last part should not be underestimated.

Tuesday, February 19, 2008

Dvorak

Being a geeky programmer, I've grown accustomed to typing dvorak. After 15 years of qwerty, my wrists and fingers started to hurt badly (again), and after I saw a new colleague typing dvorak on his (strange, but definitely cool) TypeMatrix keyboard, I decided to give dvorak a try. Of course, I couldn't type a single e-mail message in less than 15 minutes, so I gave up before my boss got angry. During the Christmas holidays I practiced dvorak using the online ABCD typing course, and I've fully switched to dvorak since then.

Rearranging the keys on a Macbook
The reason I came to write this blog entry is, of course, that I rearranged the keys on my Macbook to dvorak. You should be very careful when attempting this, as the keyboard is quite fragile. The best way to go about it is to get a paperclip and put it under the top-left corner of the key. Be gentle! It will click; now slide down the left-hand side of the key and try popping the lower-left corner of the key. Trying to rotate the key off clockwise (keep it flat down, but spin it around carefully) also helps. I realise that this works for me because I'm a left-hander; if you are right-handed, keeping the Macbook upside down will help. The clips that hold the keycap in place are on the left side of the key.
If the keycap won't give, don't force it. Take your time and take a deep breath every now and then.
You will find that in most cases, the keycap will come off cleanly. In some cases, however, you might accidentally rip off the tiny inner plastic mechanism-thingy as well. Do not despair. Hold the keycap between thumb and forefinger, and use the paperclip again to gently pry the piece out of the keycap. Now reinsert the inner piece into the keyboard, and be sure to do it right (upside down won't work). Put the right side in first and shove it under the metal clip. Now use your fingers to put the tiny "hinges" into place. If your fingers are too big for this, try fiddling with the paperclip and/or a tiny screwdriver (the kind you use for your glasses).
Now reinsert the keycaps by gently pushing them into place. Make sure you get the layout right the first time! You don't want to have to do the operation again.
On my Macbook, the tilde and the backslash are in a different place than expected, but this is because I have a Dutch keyboard.
All in all, I'm very happy with this layout, at least now I know where to find the letters!

Learning Dvorak
Dvorak can be learned in a couple of weeks if you practice a lot, but do not expect to type up to speed after only a few weeks. In my experience, it is as if your brain remembers all the movements your fingers make for every word by itself. Especially in the beginning, your brain still knows where to find the words on the old qwerty layout, and you will automatically mistype them. What I'm trying to point out is that studying the layout of the keyboard won't help you; the brain seems to remember words, not characters. For a long time, I had trouble typing words that I had not yet used after switching to dvorak. As a consequence, it is hard to get your original typing speed back.
Count on as much as six months to get up to a reasonably fast speed, and don't be surprised if you're still not any faster than you used to be (often due to still making too many typos to be blazing fast). This is one of the main reasons why critics say that dvorak isn't any better at all. I say they should give it a try; you will find that when typing text on a qwerty layout, you are using one hand more than the other. With dvorak, your hands alternate in a surprisingly evenly balanced manner. This is because in natural language, words tend to be made up of series of alternating characters, and the dvorak layout uses this fact to its advantage. A slight disadvantage is that a programming language is not much like a natural language. If you are fluent in C, for example, it takes some getting used to programming in C on a dvorak keyboard as well.

More on rearranging keyboards
Aside from the occasional application that doesn't use the operating system's keyboard layout (it is insane that this is even possible; see the list below for my experiences with broken apps), you should also be mindful of what hardware you buy, unless you can type dvorak blindly on a qwerty keyboard (which is hard to learn). A lot of very standard qwerty keyboards cannot easily be converted to dvorak. Be sure to check the F and J keys; there are a lot of keyboards around that (for some unknown reason) have a different kind of socket underneath the F and J keys, and hence those keys cannot be rearranged.
The keys of the cool Microsoft Natural keyboard can not be rearranged.
Do not bother with keyboard overlays or writing the letters on the keys using a marker, the outcome is usually unsatisfying.
If you are totally hooked, consider getting an expensive special keyboard.

List of applications known not to work by my own personal experience:
  • A lot of games, e.g. DOOM 3 in Windows (although you can remap your keys again)
  • VNC in Linux (you can switch keymap but it seems to confuse the system even more)
  • Dell Remote Access Controller DRAC5 (odd, it did work in DRAC4)
  • Mac boot from CDROM by pressing Cmd-C (the OS hasn't been loaded yet, so...)

Saturday, February 16, 2008

A word on portability

When I first switched to Linux (it was 1993 or 1994 or so, and the music was way better back then), I wrote a vi-like editor program for MS-DOS. With the help of a bunch of ifdefs, the thing would also compile and run under Linux. Some argued that this was pretty useless, because Linux already came with editors that were much more powerful than my poor clone. But the main reason I was so thrilled with it was that the same code worked on both platforms (1). It was portable.

Linguists might say "of course", because the C programming language is portable. In practice, there are many tiny differences to be taken into account when programming cross-platform. On top of that, the differences between MS-DOS and Linux are huge (you can't argue with that).

The UNIX operating system is a wonderful piece of machinery and it runs on all kinds of hardware. All variants of the UNIX operating system look more or less the same, at least from a distance. When you start programming under UNIX you will learn the true meaning of the term "portability" (2).

Portability does not mean that your code will build and run everywhere by default. You will find out that UNIX A is not the same as UNIX B, and your Linux code may not run on BSD, AIX, Solaris, or whatever. It's the little differences that make a big difference. Your code may misbehave, dump core, or not build at all.
To counter these problems there is POSIX compliance for operating systems. POSIX is a set of rules that dictates what system calls are available in the operating system, and how they behave. POSIX is what makes cross-platform development possible today, although it is by no means a perfect world yet.

A great tool aiding portability is autoconf. You should clearly understand though, that autoconf is not a magic tool that makes your code portable; it is a tool that can help you rephrase your code so that it works cross-platform. As with many tools, you still have to do the majority of the work yourself (3).
Autoconf takes some time to learn, but it is worth the investment, if your project is large/important enough. It took me a week or two to make a good configure.in for my bbs100 project, but afterwards it built and ran correctly on every machine I could get my hands on -- that includes PC, Sun, IBM, SGI, and CRAY hardware. With little more effort, it was also ported to Apple Mac OS X.

Autoconf revolves around checking whether a function is available, and if it is, it #defines a HAVE_FUNCTION for you that you can use. A good configure.in script makes very specific checks on functionality that you actually need for your program to work. A lot of software packages come with some kind of default configure script that checks everything, which is totally useless if the code doesn't make use of it.
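As an example of the pattern, here is a minimal sketch that assumes configure checked for strlcpy() and put HAVE_STRLCPY in config.h:

#include "config.h"

#include <string.h>
#include <stdio.h>

#ifndef HAVE_STRLCPY
/* fallback for platforms that lack strlcpy(): copy with truncation,
   always NUL-terminate, and return the length of the source string */
static size_t strlcpy(char *dst, const char *src, size_t size) {
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = (len >= size) ? size - 1 : len;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
#endif

int main(void) {
    char buf[8];

    strlcpy(buf, "portability", sizeof(buf));
    printf("%s\n", buf);                /* prints "portabi" */
    return 0;
}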

In general, a check for ifdef HAVE_FUNCTION is much better than operating system-specific checks like ifdef __linux__, or ifdef __IRIX__.
An ifdef BIG_ENDIAN works much better than checking every existing architecture with ifdef __ppc__, ifdef __mips__, etcetera. I happen to know a Linux code that completely broke because of this. Linux is probably the most ported operating system there is, but lots of people seem to believe it is a PC-only, RedHat-only thing (4).

Actually, the best-practice trick is to stay away from autoconf's ifdefs as much as possible, and to stick with what works everywhere. Once you learn what works and what is funky, you can often get away with not using autoconf at all. A well-written program is not held together with the duct tape that ifdef is. This is somewhat of a bold statement, especially since so many software packages run configure before building. But a truly valid question is: do they really depend on autoconf that badly, and is autoconf's functionality actually being used? It is a joy to see (some of) my Linux software build everywhere with a simple make.

The funny thing is, it is still hard to write truly portable code today. Last week I wrote some 2D SDL/OpenGL code on my Linux machine. When I moved it over to Mac OS X, I got a blank screen. I found three problems with the code:
  1. Apparently there is a slight difference in the SDL library when it comes to blitting bitmaps that have an alpha channel. The man page mentions that the outcome may be unexpected (when you blit a pixel surface over an empty surface with an alpha channel the outcome is zero; hence the blank screen) but then why does it work alright under Linux? I resorted to writing a TGA image loader and staying away from SDL's blitting functions.
  2. Resetting the polygon mode in conjunction with enabling/disabling texturing multiple times in one frame seems to confuse OpenGL on Mac OS. It messes up badly.
  3. After resizing the screen, OpenGL has lost its state and texture data and must be reinitialized. This is actually in the OpenGL standard and a bug on my side. But it does raise the question why this never surfaced on my Linux box. Apparently the (NVidia) video card has enough memory and does not get into an undefined state after a screen resize.


Lessons learned: test your code across multiple platforms. Test, test, test..!



  1. I have yet to see my favorite DOS editor(s) run under Linux natively. Switching platforms usually means leaving your familiar apps and tools, and replacing them with a substitute.
  2. In fact, I have a feeling that the ifdef preprocessor was invented for the sake of portability. It has other uses, but it kinda smells of a "fix" for the problem of supporting different architectures.
  3. Having a hammer does not make you a great carpenter.
  4. Supporting all kinds of distributions is not easy either.

A brief history of time

I've been programming for a fairly long time now. I didn't start out as young as some did, though. My first program was a school assignment, a birthday calendar written in Commodore BASIC. I used the computer in class, I didn't have one at home.
It must have been something like four years until I went to college, got a computer, and started programming. I got hooked on assembly code and spent whole days and nights writing code.

... and that was the beginning of a whole lot of bits and bytes ...

That was a really long time ago. At some point you think you know everything, but then you realize the world has changed since then, and people are blogging and stuff. Having never written any 3D code before, I found a new challenge in OpenGL. Now, this is certainly not the most important thing that's happened to me, but it is where I am at now. So, this is where the story ends, for tonight.