Sunday, October 23, 2011

Exceptional conditions: adding exceptions in standard C

An important aspect of programming is error handling. A program is running, all is well, and then ... oh no! All of a sudden a condition occurs that was not supposed to happen, or at least the programmer wished it would not happen. A program usually checks for the most common errors and may retry, work around the problem, or simply abort the operation, depending on how hard the error is to resolve.

Errors can originate in software or in hardware; hardware errors reach the program as software error codes. An error condition can be reported to an application in several ways:
  • error return value from a function
  • library reports error: in standard C, check errno
  • UNIX signal received
  • an exception was thrown or raised
The first one is the easiest to understand and use: if a function returns a certain value (an error code), it signifies an error and the application should act accordingly. The application programmer must choose which error codes to use for which error conditions.

The second case describes how the standard C library reports errors. Many library functions, and the system call wrappers in particular, return zero (or another valid result) on success and -1 on error; others signal failure with NULL or EOF. To get more information about what error occurred, examine the error code in errno. Note that errno is not necessarily a plain variable: it is a macro that expands to a modifiable lvalue, and it is typically implemented through a function call that yields a thread-safe, per-thread errno value. All errno values are predefined by the operating system. Most of them are return codes from the operating system's system calls, either translated or directly propagated.
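
As a minimal sketch of this convention, the function below calls the POSIX open() wrapper and inspects errno when it returns -1. The helper name try_open() is made up for this illustration, and a POSIX system is assumed.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to open a file read-only.
 * Returns 0 on success, or the errno value describing the failure. */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        /* Save errno right away: later library calls may overwrite it. */
        int saved = errno;
        fprintf(stderr, "open(%s) failed: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

Calling try_open() on a path that does not exist should return ENOENT on a POSIX system. Copying errno into a local variable first matters, because fprintf() itself is allowed to change it.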

UNIX signals may be sent by the (UNIX) operating system to signify certain conditions. These conditions vary from SIGALRM (a timer expired) to hard errors like SIGSEGV (segmentation violation, which is an illegal memory access) and SIGBUS (bus error). Signals are handled by the process itself, but the signal handler runs in a different context. This means that the normal program flow is interrupted and continues after the signal handler has finished executing. A signal can even interrupt a system call, in which case the system call fails with errno set to EINTR (system call was interrupted). An application may install its own signal handlers, but it cannot define its own signal numbers. The signal numbers are predefined by the operating system.
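
The interruption of normal program flow can be sketched with nothing but standard C's signal() and raise(). The function and variable names below are invented for this example, and SIGINT is used only because standard C defines it; a real program on UNIX would more likely install its handlers with sigaction().

```c
#include <signal.h>

/* Flag set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t got_signal = 0;

/* The handler runs outside the normal program flow, so it does as little
 * as possible: it only records that the signal arrived. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical demo: install the handler, send ourselves SIGINT, and report
 * whether the handler ran. Returns 1 if it did, 0 if not, -1 on error. */
int deliver_and_check(void)
{
    void (*old_handler)(int) = signal(SIGINT, on_signal);
    if (old_handler == SIG_ERR)
        return -1;

    got_signal = 0;
    raise(SIGINT);                 /* returns only after the handler has finished */
    signal(SIGINT, old_handler);   /* restore the previous disposition */
    return got_signal;
}
```

The handler only sets a flag and the main flow inspects it afterwards, which is the usual way to keep signal handlers small and safe.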

Exceptions are exceptional conditions that may be thrown or raised. An application may catch an exception, in which case execution does not continue at the point where the exception was thrown, but in the handler attached to the block of code that was being tried. Because this involves stack unwinding and jumping through code, exceptions generally require special language support. Moreover, exceptions generally have a specific syntax in the form of try { ... } catch(Exception) { ... } or try: ... except Exception: .... Exceptions are a key aspect of the Java and Python programming languages, in which it is common practice to throw exceptions for error conditions. In C++ exceptions exist, but it is left more to the programmer whether to use them extensively or to stick with error codes. In Objective-C exceptions exist, but they are not commonly used.

Exceptions do not exist in standard C, but I thought it would be fun to add them. How do you add a language feature that is completely absent from C? Well, you can emulate it. In fact, I stole this neat idea from Objective-C, where NSException was traditionally implemented using setjmp()/longjmp(). Without using any macros to define try and catch, I came up with the following:

    if (!catchException(exception_number)) {
        /* ... try some code ... */

        throwException(exception_number, "descriptive message");

    } else {
        printf("caught exception! %s\n", caughtExceptionMsg());
    }
    endException();

In this code, catchException() really sets up a try block. What it does behind the scenes is create an Exception object containing a jmp_buf and push it onto an exception watching stack. The code that actually catches the Exception is really in throwException(), which examines the stack to see if we want to catch the thrown exception at all. When it finds the corresponding Exception object, it cleans up the stack and jumps back to the initial starting point of the try block. If the exception is not being caught, it prints a message about the uncaught exception and aborts the program.
If the exception does not occur, you are left with an object on the exception watching stack, which is why you need to explicitly call endException() to clean up the stack. Other languages do this implicitly.
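
The mechanism described above could be sketched roughly as follows. This is not the original implementation, only one way to realize the description. It assumes that catchException() is a thin macro around setjmp(), because setjmp() has to execute in a stack frame that is still alive when longjmp() is called. The ExceptionFrame type, the pushException() helper, the fixed-size stack, and the two demo functions are all invented for this sketch.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_EXCEPTION_DEPTH 16   /* arbitrary limit for this sketch */

typedef struct {
    jmp_buf env;     /* saved context of the try block */
    int number;      /* exception number this entry watches for */
    int armed;       /* 1 while the try block runs, 0 once it has caught */
} ExceptionFrame;

static ExceptionFrame exceptionStack[MAX_EXCEPTION_DEPTH];
static int exceptionDepth = 0;
static const char *caughtMsg = "";

/* Hypothetical helper: push a new entry onto the watching stack. */
static ExceptionFrame *pushException(int number)
{
    if (exceptionDepth >= MAX_EXCEPTION_DEPTH) {
        fprintf(stderr, "exception stack overflow\n");
        abort();
    }
    ExceptionFrame *frame = &exceptionStack[exceptionDepth++];
    frame->number = number;
    frame->armed = 1;
    return frame;
}

/* A macro, so that setjmp() executes in the caller's own stack frame. */
#define catchException(n) setjmp(pushException(n)->env)

void throwException(int number, const char *msg)
{
    for (int i = exceptionDepth - 1; i >= 0; i--) {
        ExceptionFrame *frame = &exceptionStack[i];
        if (frame->armed && frame->number == number) {
            exceptionDepth = i + 1;  /* discard entries pushed after this one */
            frame->armed = 0;        /* a throw from the handler must not loop back */
            caughtMsg = msg;
            longjmp(frame->env, 1);
        }
    }
    fprintf(stderr, "uncaught exception %d: %s\n", number, msg);
    abort();
}

const char *caughtExceptionMsg(void) { return caughtMsg; }

/* Pop the entry pushed by the matching catchException(). */
void endException(void)
{
    if (exceptionDepth > 0)
        exceptionDepth--;
}

/* Throws from a nested call to show that the jump crosses function boundaries. */
static void riskyOperation(void)
{
    throwException(42, "something went wrong");
}

/* Returns 1 if the exception was caught, 0 if the try block ran to completion. */
int demoCatch(void)
{
    int caught = 0;
    if (!catchException(42)) {
        riskyOperation();
    } else {
        printf("caught exception! %s\n", caughtExceptionMsg());
        caught = 1;
    }
    endException();
    return caught;
}
```

In this sketch the matching entry stays on the stack after the jump, so endException() pops exactly one entry on both the normal path and the exception path. The armed flag is a design choice of the sketch: it keeps an exception thrown from inside the handler from jumping back into the same try block.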

This way we have added exceptions to standard C. There is a huge caveat to using exceptions correctly: as your program is able to jump across large pieces of code, skipping whatever cleanup code lies in between, you have to be extra careful not to leak memory. You can't have everything, but if you have something like an AutoReleasePool then this may help a great deal.
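
To make the caveat concrete, here is a small sketch using setjmp()/longjmp() directly, with made-up function names. The buffer is allocated inside the "try" part and the jump skips the rest of that block, so the only reason it does not leak is that the free() call sits after the if/else, where both paths pass through it.

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf env;

/* Hypothetical step that fails by jumping back to the setjmp() point. */
static void failing_step(void)
{
    longjmp(env, 1);
}

/* Returns 0 on success and -1 if the step "threw". */
int process_with_cleanup(void)
{
    /* volatile: the pointer is modified between setjmp() and longjmp(),
     * so its value would otherwise be indeterminate after the jump. */
    char *volatile buf = NULL;
    int status = 0;

    if (setjmp(env) == 0) {
        buf = malloc(64);
        failing_step();     /* never returns normally */
    } else {
        status = -1;
    }

    free(buf);              /* reached on both paths, so nothing leaks */
    return status;
}
```

Had the allocation happened in a function deeper in the call chain and been stored only in a local variable there, the jump would have discarded that stack frame and the memory would be lost. That is the kind of bookkeeping a pool-based scheme is meant to take over.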