The Developer's Cry: April 2013

Last time I wrote about the strcpy() function and said that it's unsafe. But why exactly is it unsafe? Let us see the details of what is going on under the hood when strcpy() is called. To do so, we will dive down to the machine level and have a look at what is happening in the stack memory.

In the C language, strings have no explicit length. Instead, the length is determined by a terminating NUL character. Therefore, strcpy() copies bytes until it sees a zero byte:

char *strcpy(char *dst, const char *src) {
    char *start = dst;

    /* copy bytes from src to dst, up to and including the terminating NUL */
    while ((*dst++ = *src++) != '\0')
        ;
    return start;
}

Consider the following function, which includes the common programming mistake of assuming that the input will fit into the buffer:

void func(char *input) {
    char buf[256];

    strcpy(buf, input);
}

Now let's examine what happens at the machine level when this baby executes. When a subroutine is called, the CPU pushes the address of the next instruction onto the stack. This address is the return address. To return from a subroutine, the return address is popped off the stack and loaded as the current instruction pointer. Thus the program jumps back and resumes execution at the point right after the subroutine was called. Using a stack allows for nested subroutine calls.

Local variables and function parameters are placed on the stack as well. This is great because now when a subroutine ends, the local variables go out of scope as the stack frame is ‘cleaned up’ (in fact, the data is still there, but the stack pointer is moved).

This amounts to the following picture of the stack while we are executing func():

stack pointer ->
+------------------------------+
| local var: char buf[256]     |
|------------------------------|
|        return address        |
|------------------------------|
| parameter: char *input       |
|------------------------------|
| local vars       of caller   |
|------------------------------|
| return address   of caller   |
|------------------------------|
| parameters       of caller   |
|------------------------------|
| ...                          |
end of stack

You can see that an attacker can overwrite the return address by supplying an input string that is longer than buf. Not only can he overwrite the return address, he can also insert a specially crafted mini-program into buf. What many exploits do is set the return address to the address of buf, so that when the function returns, the program jumps to the payload in buf and executes it.
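To see the overwrite without actually smashing a real stack, here is a small sketch that models the frame as a single byte array. Everything in it is made up for illustration: the names unbounded_copy and return_address_clobbered, the 16-byte buf, and the 8-byte "return address" slot that sits right behind it. It only shows the mechanics; a real frame layout depends on the compiler and the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a stack frame: NOT a real stack, just one byte array.
 * Bytes [0, BUF_SIZE) play the role of buf, and the next RET_SIZE
 * bytes play the role of the saved return address. */
enum { BUF_SIZE = 16, RET_SIZE = 8, FRAME_SIZE = 64 };

/* Copy the way strcpy() does: no idea how big the destination is. */
void unbounded_copy(unsigned char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    while ((*dst++ = (unsigned char)*src++) != '\0')
        ;
}

/* Returns 1 if copying input into the toy frame changed the simulated
 * return address, 0 if it survived intact. */
int return_address_clobbered(const char *input)
{
    unsigned char frame[FRAME_SIZE];
    const unsigned char saved[RET_SIZE] =
        { 0xde, 0xad, 0xbe, 0xef, 0xde, 0xad, 0xbe, 0xef };

    assert(input != NULL);
    assert(strlen(input) < sizeof frame);  /* keep the demo itself in bounds */

    memset(frame, 0, sizeof frame);
    memcpy(frame + BUF_SIZE, saved, RET_SIZE);

    unbounded_copy(frame, input);          /* stands in for strcpy(buf, input) */

    return memcmp(frame + BUF_SIZE, saved, RET_SIZE) != 0;
}
```

A 15-character input still fits (15 bytes plus the NUL), but a 16-character input already damages the slot: the terminating NUL lands on its first byte. An attacker would of course write a carefully chosen address there rather than garbage.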

To prove that this actually works, we can do a little experiment. Write a small program that fills a buffer with garbage and overwrites the return address as described above. Guess what, it doesn't work! The program gets killed by a run-time check, cleverly inserted by the compiler. Here is a postmortem stack trace:

(gdb) bt
#0 0x00007fff9a45cd46 in __kill ()
#1 0x00007fff9ad98ec0 in __abort ()
#2 0x00007fff9ad5a77d in __chk_fail ()
#3 0x00007fff9ad5aa4f in __strcpy_chk ()
#4 0x0000000100000de4 in func (input=0x7fff5fbff7e0 'x' , "\030??_") at hijack.c:19
#5 0x0000000100000e5e in main () at hijack.c:29

This is in OS X using clang. Googling turns up __memcpy_chk for gcc:

“GCC implements a limited buffer overflow protection mechanism that can prevent some buffer overflow attacks.”

void *
__memcpy_chk (void *__restrict__ dest,
              const void *__restrict__ src,
              size_t len, size_t slen)
{
    if (len > slen)
        __chk_fail ();
    return memcpy (dest, src, len);
}

As you can see, the compiler inserts a run-time check for the size of the buffer. On top of this, a second run-time check verifies the integrity of the stack. This technique is known as inserting a ‘stack canary’ and can be observed by studying the disassembly of our func below. Here you can see that 288 bytes (or in hexadecimal notation, 0x120) of space is taken from the stack for local variables. This is more than the 256 we actually requested. Of the remaining 32 bytes, 8 hold the stack canary at -0x8(%rbp). The rest hold a saved copy of the input pointer at -0x10(%rbp), the return value of the copy at -0x118(%rbp), and padding.

Then it calls __strcpy_chk() rather than strcpy(). Finally, the stack canary is checked, which may result in __stack_chk_fail() being called. Otherwise, the stack frame is cleaned up and the function returns normally.

0x0000000100000d90 push   %rbp
0x0000000100000d91 mov    %rsp,%rbp
0x0000000100000d94 sub    $0x120,%rsp
0x0000000100000d9b mov    0x26e(%rip),%rax
0x0000000100000da2 mov    (%rax),%rax
0x0000000100000da5 mov    %rax,-0x8(%rbp)
0x0000000100000da9 mov    $0x100,%rdx
0x0000000100000db3 lea    -0x110(%rbp),%rax
0x0000000100000dba mov    %rdi,-0x10(%rbp)
0x0000000100000dbe mov    -0x10(%rbp),%rsi
0x0000000100000dc2 mov    %rax,%rdi
0x0000000100000dc5 callq 0x100000e8c <__strcpy_chk>
0x0000000100000dca mov    0x23f(%rip),%rdx
0x0000000100000dd1 mov    (%rdx),%rdx
0x0000000100000dd4 mov    -0x8(%rbp),%rsi
0x0000000100000dd8 cmp    %rsi,%rdx
0x0000000100000ddb mov    %rax,-0x118(%rbp)
0x0000000100000de2 jne    0x100000df1 <func+97>
0x0000000100000de8 add    $0x120,%rsp
0x0000000100000def pop    %rbp
0x0000000100000df0 retq
0x0000000100000df1 callq 0x100000e86 <__stack_chk_fail>

So, the compiler does a lot of work for us in order to prevent simple buffer overflows. The canary is initialized with a random value before main() runs, so an attacker cannot simply guess it. But beware, an attacker may still influence the program behavior in different ways and deliberately not touch the canary.
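The canary idea can be sketched in plain C, again with a toy frame rather than a real stack. The function name canary_survives, the layout (a 16-byte buf followed directly by an 8-byte canary) and the fixed test value are all assumptions for illustration. The real thing is emitted by the compiler, uses a random value, and calls __stack_chk_fail() instead of returning a flag.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BUF_SIZE = 16, FRAME_SIZE = 64 };

/* Toy frame layout: buf at [0, 16), canary at [16, 24), and whatever
 * would come next (the return address) behind that.
 * Returns 1 when the canary check passes, 0 when it fails. */
int canary_survives(const char *input, uint64_t canary)
{
    unsigned char frame[FRAME_SIZE] = { 0 };
    uint64_t after;
    size_t n;

    assert(input != NULL);
    n = strlen(input) + 1;            /* bytes an unbounded copy writes */
    assert(n <= sizeof frame);        /* keep the demo itself in bounds */

    /* "prologue": place the canary between buf and the return address */
    memcpy(frame + BUF_SIZE, &canary, sizeof canary);

    /* the unchecked copy into buf */
    memcpy(frame, input, n);

    /* "epilogue": compare the stored canary with the reference value */
    memcpy(&after, frame + BUF_SIZE, sizeof after);
    return after == canary;
}
```

Any linear overflow out of buf has to run across the canary before it reaches the return address, which is why the check catches it. A write that skips over the canary is not detected by this scheme.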

Other techniques that help prevent buffer overflow attacks are ASLR (address space layout randomization) and DEP (data execution prevention) or NX pages (non-executable memory pages). Although they make it more difficult, these too can be circumvented by trickery, such as return-oriented programming, which reuses code that is already executable.

Mind ye that all of this is plain impossible if only you (or the high-level language itself!) always properly check the size of the buffer and array bounds. It is something that C normally does not do for you, so be mindful that the compiler will not always be able to save you from writing insecure code.
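For what it's worth, here is one way the func from earlier could check the length itself. This is only a sketch: the name func_checked and the convention of returning -1 for input that is too long are my own choices, not something the original func prescribes.

```c
#include <assert.h>
#include <string.h>

/* Like func(), but refuses input that does not fit into buf.
 * Returns 0 on success, -1 if the input is too long. */
int func_checked(const char *input)
{
    char buf[256];
    size_t len;

    assert(input != NULL);
    len = strlen(input);
    if (len >= sizeof buf)
        return -1;                /* too long: reject instead of overflowing */

    memcpy(buf, input, len + 1);  /* +1 copies the terminating NUL as well */

    /* ... work with buf here ... */
    return 0;
}
```

The comparison is >= rather than > because the terminating NUL needs a byte too: a 256-byte buffer holds at most 255 characters.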

Every C programmer should know that the strcpy() function is insecure; if the destination buffer is too small to hold the string, strcpy() will happily overflow the buffer and copy over whatever was there in memory. Other than corrupting variables, it may overwrite return addresses in stack memory, which spells doom for system security because it allows an attacker to inject code into the running program, commonly known as ‘exploiting a vulnerability.’

Sure enough, the strcpy() function is a very dangerous thing. OpenBSD thought up the strlcpy() and strlcat() safe string functions to counter the problem. With these, you must always supply the size of the destination buffer. Strings that are too long will not overflow the buffer; any attempt at buffer overflow is simply stopped. strlcpy() terminates the copied string at the end of the buffer, thereby plugging the hole.
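For readers on a system without strlcpy(), the behavior can be sketched in a few lines. This is not OpenBSD's implementation; sketch_strlcpy is a hypothetical name, and the code only mirrors the documented contract: copy at most size - 1 bytes, always NUL-terminate when size is nonzero, and return the length of the source string.

```c
#include <assert.h>
#include <string.h>

/* A sketch of strlcpy()-like behavior (hypothetical name, not the
 * OpenBSD source). Returns strlen(src); a return value >= size means
 * the copy was truncated. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen;

    assert(src != NULL && (dst != NULL || size == 0));
    srclen = strlen(src);

    if (size > 0) {
        /* leave room for the terminating NUL */
        size_t n = (srclen < size - 1) ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

Copying "abcdef" into a 4-byte buffer leaves "abc" in the buffer and returns 6, so the caller can tell that three characters were lost, provided the caller bothers to look.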

A remarkably simple but effective solution. The strlcpy() and strlcat() functions are widely used in OpenBSD and were adopted in other operating systems as well, like FreeBSD and Mac OS X. But I wouldn't have known about these functions in the first place if I hadn't run into code that would not compile on Linux, where strlcpy() is missing from the GNU C library, and maybe rightly so.

However brilliant strlcpy() may seem on the surface, it is all too easy. The function may truncate the copy of the string, so ... It copies the string, except when it doesn't; then it only partially copies it. Many people consider this incorrect. It isn't logical to have a copy function that truncates the copy. You can get really weird things from this. For example, when a UTF-8 string gets truncated in the middle of a multi-byte character, the result is a corrupted string. And what if the string is a file path or a URL that gets truncated?

Of course, you should check the return value of the function. It's standard programming practice that also applies to the traditional string handling functions. But in a strange way, these safe string functions are actually unsafe. A false sense of security creeps in. Actually, not copying the string at all would have been better than truncating.
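The UTF-8 problem is easy to reproduce. The sketch below uses snprintf() as a stand-in for strlcpy(), since it is standard C and truncates in the same way; the helper name copy_truncating is made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Copies src into dst, truncating if needed (snprintf stands in for
 * strlcpy here). Returns 1 if the copy was truncated, 0 otherwise. */
int copy_truncating(char *dst, size_t dstsize, const char *src)
{
    int n;

    assert(dst != NULL && src != NULL && dstsize > 0);
    n = snprintf(dst, dstsize, "%s", src);

    /* snprintf returns the length it wanted to write; if that does
     * not fit, the result in dst was cut short */
    return n < 0 || (size_t)n >= dstsize;
}
```

Take the word "café" in UTF-8, written in C as "caf\xC3\xA9": five bytes, because the é takes two. Copy it into a 5-byte buffer and there is room for only four bytes plus the NUL, so the buffer ends in a lone 0xC3 byte, which is half a character and not valid UTF-8. The return value does report the truncation, but only to a caller who checks it.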

Nevertheless, strlcpy() remains in use in various code bases (like OpenSSH and rsync). Wouldn't it be nice if these functions were available just for the sake of portability? In the Linux world, that argument just isn't good enough.

So, what are your options? For portability, you might want to use autoconf and #ifdef HAVE_STRLCPY, but note that it doesn't really help you. It's sugar coating that looks advanced, but it doesn't make it any better. My advice is to steer clear of strlcpy(); just don't use it. Stick with the traditional string functions and keep checking those buffer lengths.

You might use an external string library that supports growable strings. Personally, I'm good with my own my_strcpy() function, which is basically strlcpy() with a twist: it calls abort() when the destination buffer is too small. It's not user-friendly, but it gets you out of a bad situation quickly.

Other than that, accept that C is maybe not the best choice for implementing userland code written by mere mortals. Try a different language, like Go. It has very robust string handling.
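To give an idea of the abort-on-overflow approach mentioned above, here is a rough sketch. It is not the actual my_strcpy() code; the name copy_or_abort, the argument order and the error message are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copies src into dst, or kills the program if it does not fit.
 * Hypothetical helper; it never truncates and never overflows. */
void copy_or_abort(char *dst, size_t dstsize, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);

    if (len >= dstsize) {
        /* the string plus its NUL needs len + 1 bytes */
        fprintf(stderr, "copy_or_abort: need %zu bytes, have %zu\n",
                len + 1, dstsize);
        abort();
    }
    memcpy(dst, src, len + 1);
}
```

The explicit size check stays in place even when assertions are compiled out with NDEBUG, which is the point: a string that is too long is treated as a fatal error rather than being quietly cut short.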

Next time we'll have a look under the hood and examine how buffer overflows work.

Sunday, April 7, 2013

strcpy(): The safety of an unsafe string copy function

Monday, April 1, 2013

strlcpy(): The unsafety of a safe string copy function