Sunday, May 17, 2015

Loading PNG images with libpng

For textures in games I'm often inclined to quickly grab some old TGA (targa) loader. The code works, and thus I've been dragging it along for years. Another often used format is Windows BMP. Writing an image loader can be quite an undertaking, which is because of the huge variety of features a single image format may support. Lucky for us, we don't have to write a PNG loader from scratch; let's use the libpng library.

About the PNG image format
Nearly 20 years ago the PNG (Portable Network Graphics) image format was invented for the Web. Up till then, the most used image formats were GIF and JPEG. GIF supports only 256 colors, and uses a patented compression method. JPEG does not support transparency, and uses lossy compression, resulting in visual artefacts.
PNG improved greatly on this by supporting 32-bit color, transparency, and lossless compression. Moreover, it's absolutely patent-free and freely available. Since it was invented for the Web and not photography or printing press, PNG lacks alternate color schemes like CMYK, and you won't find the image format being used for those applications. It's a good fit for our purposes, however.

The PNG file format consists of a header followed by a number of chunks. The most basic chunks are PLTE (palette) and IDAT (data), but there are many other kinds of chunks, for example for storing color information and metadata, like last modified time. The image data is compressed with the deflate algorithm, which is the algorithm used by zip/unzip. Deflate does lossless compression, and more importantly, the algorithm is not patented. It is present in zlib, which is also freely available.
However, the image lines are filtered by one of five possible filter types to improve the compression rate. Many kinds of pixel formats are supported. Finally, PNG supports interlacing, so pixels aren't necessarily stored neatly in sequence.
This is just to illustrate that writing a PNG loader from scratch is not easily done. We don't have so much time on our hands, and will therefore be using the library.

A walkthrough of the code
Let's load up some pixels — should be easy, right? So you open up the manual for libpng, only to find that it starts with a whopping 20 pages listing just function headers. That's right, there are about 260 different functions in libpng. We won't need to use them all, but unfortunately, loading a PNG image is not quite straightforward. But let's get down to it.
Our PNG_Image class will have height, width, a bytes per pixel member, a color format (for OpenGL), and a pixel buffer. Following is the annotated code needed to load a PNG image. The method will take a filename and return true on success.

In the top the file, include the appropriate headers:
#include <png.h>
#include <gl.h>

First off, we will want to open a .png file and check the header for the “PNG” signature.
    FILE *f = fopen(filename, "rb");
    if (f == nullptr) {
        return false;
    }

    // check that the PNG signature is in the file header

    unsigned char sig[8];
    if (fread(sig, 1, sizeof(sig), f) < 8) {
        fclose(f);
        return false;
    }
    if (png_sig_cmp(sig, 0, 8)) {
        fclose(f);
        return false;
    }

Next, we need to create two data structures: png_struct and png_info. Many of libpng's functions will require us to pass in these structs. You can think of png_struct as a “PNG object” that holds state for various functions. It won't hold any image data, however. The png_info struct will hold metadata like image height and width. Creating these structs allocates memory from the heap; if anything goes wrong we must take care to deallocate them, or else we'd be leaking memory.
[Also mind that we still have an open file].
    // create two data structures: 'png_struct' and 'png_info'

    png_structp png = png_create_read_struct(PNG_LIBPNG_VER_STRING, nullptr, nullptr, nullptr);
    if (png == nullptr) {
        fclose(f);
        return false;
    }

    png_infop info = png_create_info_struct(png);
    if (info == nullptr) {
        png_destroy_read_struct(&png, nullptr, nullptr);
        fclose(f);
        return false;
    }

Mind that plain C has no exceptions. To work around this, libpng has an error handling mechanism that works with setjmp(). Imagine decompressing a corrupted image, and you don't want the program to crash. What libpng does is, it jumps back to the point where you issued the setjmp() call. So this here is just some cleanup code.
    // set libpng error handling mechanism
    if (setjmp(png_jmpbuf(png))) {
        png_destroy_read_struct(&png, &info, nullptr);
        fclose(f);

        if (pixels != nullptr) {
            delete [] pixels;
            pixels = nullptr;
        }
        return false;
    }

Q: Where did that pixels array come from?
A: That's our pixel buffer member variable. It won't be allocated until later. The construction with setjmp() is turning the code inside out.

We are just getting started here. Let's gather some basic information: image height, width, and such.
    // pass open file to png struct
    png_init_io(png, f);

    // skip signature bytes (we already read those)
    png_set_sig_bytes(png, sizeof(sig));

    // get image information
    png_read_info(png, info);

    w = png_get_image_width(png, info);
    h = png_get_image_height(png, info);

We're only interested in RGB and RGBA data, so other bit depths and color types are converted. So even if the image is in grayscale or whatever, we will get RGB data when reading the image. It's nice that libpng can do these conversions on the fly.
    // set least one byte per channel
    if (png_get_bit_depth(png, info) < 8) {
        png_set_packing(png);
    }

PNG may have a separate transparency chunk. Convert it to alpha channel.
    // if transparency, convert it to alpha
    if (png_get_valid(png, info, PNG_INFO_tRNS)) {
        png_set_tRNS_to_alpha(png);
    }

Next, get the color type. The format variable is meant as a parameter for OpenGL texturing. If the format can not be determined, give up and return an error.
    switch(png_get_color_type(png, info)) {
        case PNG_COLOR_TYPE_GRAY:
            format = GL_RGB;
            png_set_gray_to_rgb(png);
            break;

        case PNG_COLOR_TYPE_GRAY_ALPHA:
            format = GL_RGBA;
            png_set_gray_to_rgb(png);
            break;

        case PNG_COLOR_TYPE_PALETTE:
            format = GL_RGB;
            png_set_expand(png);
            break;

        case PNG_COLOR_TYPE_RGB:
            format = GL_RGB;
            break;

        case PNG_COLOR_TYPE_RGBA:
            format = GL_RGBA;
            break;

        default:
            format = -1;
    }
    if (format == -1) {
        png_destroy_read_struct(&png, &info, nullptr);
        fclose(f);
        return false;
    }

This is how to get the number of bytes per pixel, regardless of channels and bit depth:
    bpp = (uint)png_get_rowbytes(png, info) / w;

[That bpp variable is bytes per pixel, not bits per pixel].

We don't want to receive any interlaced image data. Even if the image is interlaced, we want to have the ‘final’ image data. There's a func for that:
    png_set_interlace_handling(png);

We are done reading the header (info) chunk. You need to tell libpng you want to get ahead and skip the rest of the chunk.
    png_read_update_info(png, info);

Now we are ready to go read the image data. We deferred allocating the pixel buffer up till now (this is how the manual says you should do it).
Annoyingly enough, png_read_image() expects an array of pointers to store rows of pixel data. So you need to construct both a pixel buffer and a temporary array of pointers to the rows in that pixel buffer. And then you can read the image data.
    // allocate pixel buffer
    pixels = new unsigned char[w * h * bpp];

    // setup array with row pointers into pixel buffer
    png_bytep rows[h];
    unsigned char *p = pixels;
    for(int i = 0; i < h; i++) {
        rows[i] = p;
        p += w * bpp;
    }

    // read all rows (data goes into 'pixels' buffer)
    // Note that any decoding errors will jump to the
    // setjmp point and eventually return false
    png_read_image(png, rows);

You actually need to read the end of the PNG data. This will read the final chunk in the file, and check that the file is not corrupted.
    png_read_end(png, nullptr);

Finally, clean up and return true, indicating success.
    png_destroy_read_struct(&png, &info, nullptr);
    fclose(f);
    return true;

Well ... that was a little odd, but we managed to load some pixels. Not included in this tutorial: we can pass the pixel buffer on to OpenGL to create textures.

Opinion or plain justified criticism
libpng provides tons of functionality for working with PNG images. Be that as it may, there is a lot not to like here. For one thing, loading an image is a ridiculous, unwieldy process. Conspicuously absent is a simple and stupid png_load() function that does all that listed above, now left as an exercise by the developers of libpng.
All I wanted to do was load an image, but libpng required me to learn everything there is to know about the file format, the library, and then some.
At any rate, do the dance and it will work. If it's too much for you, there's always SDL_Image.