Monday, February 24, 2014

MD5 and libs and licenses

I keep running into code that looks like this:
#define S21 5
#define S22 9
#define S23 14
#define S24 20
    GG ( a, b, c, d, in[ 1], S21, UL(4129170786)); /* 17 */
    GG ( d, a, b, c, in[ 6], S22, UL(3225465664)); /* 18 */
    GG ( c, d, a, b, in[11], S23, UL( 643717713)); /* 19 */
    GG ( b, c, d, a, in[ 0], S24, UL(3921069994)); /* 20 */
    GG ( a, b, c, d, in[ 5], S21, UL(3593408605)); /* 21 */
    GG ( d, a, b, c, in[10], S22, UL(  38016083)); /* 22 */
    GG ( c, d, a, b, in[15], S23, UL(3634488961)); /* 23 */
    /* ... etcetera ... */
    HH ( ... )
    II ( ... )
This is an excerpt of C code for the MD5 algorithm as implemented by Ron Rivest, published in RFC 1321 in 1992. What's wrong with this picture? Well, not so much, except that there is this nice OpenSSL library that is present practically everywhere. OpenSSL already provides MD5, and its code has been reviewed by many people and is used by millions, so there is rarely a reason to paste your own copy into a project.

Using OpenSSL's libcrypto for MD5 digests is easy:
#include <openssl/md5.h>

    unsigned char digest[16];
    MD5_CTX ctx;

    MD5_Init(&ctx);
    MD5_Update(&ctx, buf, buf_len);
    /* multiple calls to MD5_Update() may be made */
    MD5_Final(digest, &ctx);
Link with -lcrypto (the MD5 functions live in libcrypto, not libssl) and you're done.
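
The digest that MD5_Final() fills in is 16 raw bytes, while most tools print MD5 sums as 32 hex characters. A minimal sketch of that conversion follows; md5_digest_to_hex is a hypothetical helper name of mine, not something OpenSSL provides:

```c
#include <stddef.h>

/* Hypothetical helper (not part of OpenSSL): write a 16-byte MD5 digest
 * as 32 lowercase hex characters plus a terminating NUL into out[33]. */
static void md5_digest_to_hex(const unsigned char digest[16], char out[33])
{
    static const char hexdigits[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < 16; i++) {
        out[2 * i]     = hexdigits[digest[i] >> 4];   /* high nibble */
        out[2 * i + 1] = hexdigits[digest[i] & 0x0f]; /* low nibble */
    }
    out[32] = '\0';
}
```

Call it right after MD5_Final(digest, &ctx) with a char hex[33] buffer. For empty input, the result should be the well-known d41d8cd98f00b204e9800998ecf8427e.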

On OS X, something is up. Apple has deprecated the bundled OpenSSL (since OS X 10.7), so you can no longer count on it there. Instead, Apple now provides their “CommonCrypto API”, which is, ironically, not so common. An ugly trick to get OpenSSL MD5 code to work with CommonCrypto:
#include <CommonCrypto/CommonDigest.h>

#define MD5_CTX      CC_MD5_CTX
#define MD5_Init     CC_MD5_Init
#define MD5_Update   CC_MD5_Update
#define MD5_Final    CC_MD5_Final

/* drop any earlier definition, then always map to the CommonCrypto value */
#ifdef MD5_DIGEST_LENGTH
#undef MD5_DIGEST_LENGTH
#endif
#define MD5_DIGEST_LENGTH    CC_MD5_DIGEST_LENGTH
Why did Apple do this, you may ask? The reason is probably software licensing; OpenSSL's dual license (the OpenSSL license plus the original SSLeay license) is problematic, in particular for apps on iOS devices. Apple could probably have kept OpenSSL on the Mac, but they didn't.
(* It's either this or the NSA forced them into putting some weakened PRNGs into CommonCrypto.)

Coming full circle, you would be justified in not using a common library in cases where you run into issues with the software license. There are many free and open software licenses, and many of them have quirks. This is a major annoyance in software development, and something to consider when using libraries.