While other programming languages are also susceptible to buffer overflows, C and C++ are the languages most commonly associated with this class of attacks, because they give programs direct control over memory and do not check array bounds automatically. In this article, we'll explore some of the reasons for buffer overflows and how someone can abuse them to take control of the vulnerable program.

What is buffer overflow?

A buffer overflow is a class of vulnerability that occurs when a program writes more data into a fixed-length buffer than the buffer can hold, typically because it uses functions that do not perform bounds checking. It's best explained with an example, so let's take the following program.

    #include <string.h>   /* strcpy */

    void vuln_func(char *input);

    int main(int argc, char *argv[]) {
        if (argc > 1)
            vuln_func(argv[1]);   /* first command-line argument */
    }

    void vuln_func(char *input) {
        char buffer[256];
        strcpy(buffer, input);    /* no bounds check: copies until the NUL */
    }

This is a simple C program that is vulnerable to a buffer overflow. The function vuln_func receives the first command-line argument through its parameter input and copies it into buffer, a character array with a length of 256. The copy is performed with the strcpy function, which does no bounds checking: it keeps copying until it reaches the terminating null byte of the source string. An argument of 256 characters or more is therefore written past the end of buffer, and a buffer overflow occurs. Because buffer is a local variable, it is stored on the stack. If the overflow reaches the saved return address of vuln_func, we can decide where execution continues when the function returns, and so control the flow of the entire program. That's why this is called a stack-based buffer overflow.
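The bounds check that strcpy leaves out can be written explicitly. The following is a minimal sketch rather than part of the original program: copy_checked is a hypothetical helper name, and rejecting over-long input (instead of truncating it) is an assumption about what the caller wants.

```c
#include <string.h>

/* Hypothetical helper: copies input into dst only if it fits, including the
 * terminating NUL. Returns 0 on success, -1 if the input is too long. */
int copy_checked(char *dst, size_t dst_size, const char *input)
{
    size_t len = strlen(input);
    if (len >= dst_size)
        return -1;                /* reject instead of overflowing dst */
    memcpy(dst, input, len + 1);  /* +1 copies the terminating NUL */
    return 0;
}
```

In the example program, vuln_func could call copy_checked(buffer, sizeof buffer, input) and return early when it gets -1, so that an over-long argument never reaches the stack frame.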

Types of buffer overflow

We have just discussed an example of stack-based buffer overflow. However, a buffer overflow is not limited to the stack. The following are some of the common buffer overflow types.

Stack-based buffer overflow

When the buffer being overflowed is stored on the stack, the vulnerability is referred to as a stack-based buffer overflow. As mentioned earlier, a stack-based buffer overflow can be exploited by overwriting the saved return address of a function on the stack.

Heap-based buffer overflow

When the buffer being overflowed is stored in the heap data area, the vulnerability is referred to as a heap-based buffer overflow. Heap overflows are generally harder to exploit than stack overflows. Successful exploitation depends on various factors, because there is no return address to overwrite as there is with the stack-based technique. Instead, the overflowing data typically overwrites neighboring data on the heap to manipulate program data in an unexpected manner.
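To illustrate how overwriting neighboring heap data can change program behavior, here is a small sketch under stated assumptions. The record structure, its is_admin field and the load_record function are all hypothetical. To keep the example well defined, the copy is capped at the size of the whole allocation, so the "overflow" only spills from the name field into the field next to it; a real heap overflow has no such cap and runs into whatever the allocator placed after the buffer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: a fixed-size name followed by a privilege flag. */
struct record {
    char name[8];
    int  is_admin;
};

/* Allocates a record on the heap and copies len bytes of input over it,
 * starting at the name field. The length is capped at the size of the whole
 * record, so the write stays inside the allocation.
 * Returns NULL if the allocation fails. */
struct record *load_record(const void *input, size_t len)
{
    struct record *r = calloc(1, sizeof *r);  /* is_admin starts as 0 */
    if (r == NULL)
        return NULL;
    if (len > sizeof *r)
        len = sizeof *r;
    /* The check that is missing here: len must not exceed sizeof r->name. */
    memcpy(r, input, len);
    return r;
}
```

With a 4-byte input such as "bob" the flag stays 0. With 12 bytes of "A" the last four bytes land in is_admin, which becomes non-zero even though no code ever assigned to it.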

Understanding debuggers

Understanding how to use debuggers is a crucial part of exploiting buffer overflows. When writing buffer overflow exploits, we often need to understand the stack layout, memory maps, instruction mnemonics, CPU registers and so on. A debugger lets us inspect these details while the program is running or after it has crashed. In the Windows environment, OllyDbg and Immunity Debugger are freely available debuggers. GNU Debugger (GDB) is the most commonly used debugger in the Linux environment.

Exploit mitigation techniques

To be able to exploit a buffer overflow vulnerability on a modern operating system, we often need to deal with various exploit mitigation techniques such as stack canaries, data execution prevention, address space layout randomization and more. To keep it simple, let’s proceed with disabling all these protections. For the purposes of understanding buffer overflow basics, let’s look at a stack-based buffer overflow.

Crashing and analyzing core dumps

In this section, let's explore how one can crash the vulnerable program in order to write an exploit later. The following makefile can be used to compile this program with all the exploit mitigation techniques disabled in the binary. We are simply using gcc, passing the program vulnerable.c as input and producing the binary vulnerable as output.

    clean:
        rm vulnerable

Let's disable ASLR by writing the value 0 into the file /proc/sys/kernel/randomize_va_space (for example, as root: echo 0 > /proc/sys/kernel/randomize_va_space).

Now we are fully ready to exploit this vulnerable program. Let's compile it and produce the executable binary. To do this, run the command make, which should create a new binary named vulnerable in the current directory. Running the file command against the binary shows that it is a 64-bit ELF binary.

Let's run the binary with an argument. Nothing happens. This is intentional: the program doesn't do anything apart from taking its input and copying it into another variable using the strcpy function.
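Before going further it can be useful to confirm that the ASLR setting actually took effect. The sketch below reads the setting back; read_aslr_setting is a hypothetical helper, and the path is passed in as a parameter so the function can also be pointed at a test file. On Linux, a value of 0 means randomization is disabled, while 1 and 2 enable increasing amounts of it.

```c
#include <stdio.h>

/* Hypothetical helper: reads the ASLR setting from path, which is normally
 * /proc/sys/kernel/randomize_va_space on Linux.
 * Returns the value found (0 = disabled, 1 or 2 = enabled),
 * or -1 if the file cannot be opened or does not start with a number. */
int read_aslr_setting(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

Calling read_aslr_setting("/proc/sys/kernel/randomize_va_space") should return 0 after the step above. Note that the setting does not survive a reboot unless it is made persistent separately.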

Crashing the program

Now let's see how we can crash this application. We're going to create a simple Perl program that we can use as a template for the rest of the exploit. Let's create a file called exploit1.pl that defines a variable holding three hundred "A"s and prints it, so we can use these 300 characters in our attempt to crash the application. Let us also ensure that the file has executable permissions.

exploit1.pl

    #!/usr/bin/perl
    $junk = "A" x 300;
    print $junk;

Now, let's write the output of this script into a file called payload1. Then let's run the vulnerable program and pass the contents of payload1 as its argument. As you can see, there is a segmentation fault and the application crashes.

Now let's type ls and check if there are any core dumps available in the current directory. There is nothing like a crash dump: no new files were created due to the segmentation fault. Let's enable core dumps so we can understand what caused the segmentation fault.

    $ ls
    exploit1.pl  Makefile  payload1  vulnerable  vulnerable.c
    $

This should enable core dumps. Now, let's crash the application again using the same command that we used earlier. Type ls once again and you should see a new file called core. This file is a core dump, which captures the state of the program at the time of the crash. We can use this core file to analyze the crash.

    $ ls
    core  exploit1.pl  Makefile  payload1  vulnerable*  vulnerable.c
    $

Let's see how we can analyze the core file using gdb.

    75 commands loaded for GDB 9.1 using Python engine 3.8
    [*] 5 commands could not be loaded, run gef missing to know why.
    [New LWP 34966]
    [!] './vulnerable AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' not found/readable
    [!] Failed to get file debug information, most of gef features will not work
    Core was generated by `./vulnerable AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0x00005555555551ad in ?? ()
    gef➤

If you look at this gdb output, it shows that the program was terminated by a segmentation fault while executing the instruction at the address 0x00005555555551ad. (RIP is the register that holds the address of the next instruction to be executed.) As the disassembly further down shows, this address is the ret instruction at the end of vuln_func. The address itself is valid. The crash happens because our long input has overwritten the saved return address on the stack with As, and 0x4141414141414141 is not a valid address on x86-64, so the ret instruction faults before RIP is ever loaded with it.

We can also type info registers to see what value each register was holding at the time of the crash. RIP still holds 0x00005555555551ad, and we can spot some characters from our junk: 8 As (0x41) in the RBP register. This is how core dumps can be used.

    rbx            0x5555555551b0      0x5555555551b0
    rcx            0x80008             0x80008
    rdx            0x414141            0x414141
    rsi            0x7fffffffe3e0      0x7fffffffe3e0
    rdi            0x7fffffffde89      0x7fffffffde89
    rbp            0x4141414141414141  0x4141414141414141
    rsp            0x7fffffffde68      0x7fffffffde68
    r8             0x0                 0x0
    r9             0x7ffff7fe0d50      0x7ffff7fe0d50
    r10            0x0                 0x0
    r11            0x0                 0x0
    r12            0x555555555060      0x555555555060
    r13            0x7fffffffdf70      0x7fffffffdf70
    r14            0x0                 0x0
    r15            0x0                 0x0
    rip            0x5555555551ad      0x5555555551ad
    eflags         0x10246             [ PF ZF IF RF ]
    cs             0x33                0x33
    ss             0x2b                0x2b
    ds             0x0                 0x0
    es             0x0                 0x0
    fs             0x0                 0x0
    gs             0x0                 0x0
    gef➤

Let's run the program itself in gdb by typing gdb ./vulnerable and disassemble main using disass main.

       0x0000000000001149 <+0>:  endbr64
       0x000000000000114d <+4>:  push   rbp
       0x000000000000114e <+5>:  mov    rbp,rsp
       0x0000000000001151 <+8>:  sub    rsp,0x10
       0x0000000000001155 <+12>: mov    DWORD PTR [rbp-0x4],edi
       0x0000000000001158 <+15>: mov    QWORD PTR [rbp-0x10],rsi
       0x000000000000115c <+19>: cmp    DWORD PTR [rbp-0x4],0x1
       0x0000000000001160 <+23>: jle    0x1175 <main+44>
       0x0000000000001162 <+25>: mov    rax,QWORD PTR [rbp-0x10]
       0x0000000000001166 <+29>: add    rax,0x8
       0x000000000000116a <+33>: mov    rax,QWORD PTR [rax]
       0x000000000000116d <+36>: mov    rdi,rax
       0x0000000000001170 <+39>: call   0x117c <vuln_func>
       0x0000000000001175 <+44>: mov    eax,0x0
       0x000000000000117a <+49>: leave
       0x000000000000117b <+50>: ret
    End of assembler dump.
    gef➤

This is the disassembly of our main function. Within main, there is a call to a function called vuln_func. Let us disassemble that using disass vuln_func.

       0x000000000000117c <+0>:  endbr64
       0x0000000000001180 <+4>:  push   rbp
       0x0000000000001181 <+5>:  mov    rbp,rsp
       0x0000000000001184 <+8>:  sub    rsp,0x110
       0x000000000000118b <+15>: mov    QWORD PTR [rbp-0x108],rdi
       0x0000000000001192 <+22>: mov    rdx,QWORD PTR [rbp-0x108]
       0x0000000000001199 <+29>: lea    rax,[rbp-0x100]
       0x00000000000011a0 <+36>: mov    rsi,rdx
       0x00000000000011a3 <+39>: mov    rdi,rax
       0x00000000000011a6 <+42>: call   0x1050 <strcpy@plt>
       0x00000000000011ab <+47>: nop
       0x00000000000011ac <+48>: leave
       0x00000000000011ad <+49>: ret
    End of assembler dump.
    gef➤

In the disassembly of vuln_func, there is a call to strcpy@plt, followed by the ret instruction at offset +49 where the crash occurred.

Now run the program by passing the contents of payload1 as input. In the current environment, a GDB extension called GEF is installed. It shows many interesting details, much like a debugger with a GUI.

    Program received signal SIGSEGV, Segmentation fault.
    0x00005555555551ad in vuln_func ()
    [ Legend: Modified register | Code | Heap | Stack | String ]
    ──────────────────────────────────────────────── registers ────
    $rax   : 0x00007fffffffdd00  →  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
    $rbx   : 0x00005555555551b0  →  <__libc_csu_init+0> endbr64
    $rcx   : 0x20000
    $rdx   : 0x11
    $rsp   : 0x00007fffffffde08  →  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
    $rbp   : 0x4141414141414141 ("AAAAAAAA"?)
    $rsi   : 0x00007fffffffe3a0  →  "AAAAAAAAAAAAAAAAA"
    $rdi   : 0x00007fffffffde1b  →  "AAAAAAAAAAAAAAAAA"
    $rip   : 0x00005555555551ad  →  <vuln_func+49> ret
    $r8    : 0x0
    $r9    : 0x00007ffff7fe0d50  →   endbr64
    $r10   : 0x0
    $r11   : 0x0
    $r12   : 0x0000555555555060  →  <_start+0> endbr64
    $r13   : 0x00007fffffffdf10  →  0x0000000000000002
    $r14   : 0x0
    $r15   : 0x0
    $eflags: [zero carry parity adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
    $cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000
    ──────────────────────────────────────────────────── stack ────
    0x00007fffffffde08│+0x0000: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"    ← $rsp
    0x00007fffffffde10│+0x0008: "AAAAAAAAAAAAAAAAAAAAAAAAAAAA"
    0x00007fffffffde18│+0x0010: "AAAAAAAAAAAAAAAAAAAA"
    0x00007fffffffde20│+0x0018: "AAAAAAAAAAAA"
    0x00007fffffffde28│+0x0020: 0x00007f0041414141 ("AAAA"?)
    0x00007fffffffde30│+0x0028: 0x00007ffff7ffc620  →  0x0005042c00000000
    0x00007fffffffde38│+0x0030: 0x00007fffffffdf18  →  0x00007fffffffe25a  →  "/home/dev/x86_64/simple_bof/vulnerable"
    0x00007fffffffde40│+0x0038: 0x0000000200000000
    ────────────────────────────────────────────── code:x86:64 ────
       0x5555555551a6 <vuln_func+42>   call   0x555555555050 <strcpy@plt>
       0x5555555551ab <vuln_func+47>   nop
       0x5555555551ac <vuln_func+48>   leave
     → 0x5555555551ad <vuln_func+49>   ret
    [!] Cannot disassemble from $PC
    ────────────────────────────────────────────────── threads ────
    [#0] Id 1, Name: "vulnerable", stopped 0x5555555551ad in vuln_func (), reason: SIGSEGV
    ──────────────────────────────────────────────────── trace ────
    [#0] 0x5555555551ad → vuln_func()
    ───────────────────────────────────────────────────────────────
    gef➤

Now if you look at the output, this is the same as what we have already seen with the core dump: 8 As have overwritten RBP, and RSP points at more of our As, which is where the ret instruction expects to find the return address. But we passed 300 As, and we don't yet know which of those three hundred As end up in the RBP register and in the saved return address.

When exploiting buffer overflows, being able to crash the application is the first step in the process. Using this knowledge, an attacker can begin to work out the exact offsets required to overwrite the saved return address, and with it the RIP register, to control the flow of the program.
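One way to start narrowing down those offsets is to read them off the listings above. The sketch below is a worked piece of arithmetic, not an exploit, and it assumes the frame layout shown in the vuln_func disassembly (buffer at rbp-0x100, saved RBP pushed by push rbp, return address directly above it). A different compiler or different flags would give different numbers. The function names are hypothetical.

```c
#include <stddef.h>

/* Assumed frame layout, read from the vuln_func disassembly:
 *   lea rax,[rbp-0x100]  -> buffer starts 0x100 bytes below RBP
 *   push rbp             -> the saved RBP occupies the 8 bytes at RBP
 *   call vuln_func       -> the return address sits right above the saved RBP
 */
#define BUFFER_BELOW_RBP 0x100u
#define SAVED_RBP_SIZE   8u

/* Offset from the start of buffer to the saved RBP. */
size_t offset_to_saved_rbp(void)
{
    return BUFFER_BELOW_RBP;
}

/* Offset from the start of buffer to the saved return address. */
size_t offset_to_return_address(void)
{
    return BUFFER_BELOW_RBP + SAVED_RBP_SIZE;
}

/* How many bytes of a payload reach the return address slot or beyond it. */
size_t bytes_from_return_address(size_t payload_len)
{
    size_t off = offset_to_return_address();
    return payload_len > off ? payload_len - off : 0;
}

/* Distance between two addresses reported by the debugger. */
size_t offset_between(unsigned long long buffer_addr,
                      unsigned long long other_addr)
{
    return (size_t)(other_addr - buffer_addr);
}
```

Under these assumptions the saved RBP starts 256 bytes into the buffer and the return address 264 bytes in. The GEF output agrees: RAX (the start of buffer) is 0x00007fffffffdd00 and RSP is 0x00007fffffffde08, a difference of 264, and a 300-byte payload leaves 36 bytes from that point on, matching the 36 As shown at RSP.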

Conclusion

In this article, we discussed what buffer overflow vulnerabilities are, their types and how they can be exploited. We also analyzed a vulnerable application to understand how crashing an application generates core dumps, which will in turn be helpful in developing a working exploit. In the next article, we will discuss how we can use this knowledge to exploit a buffer overflow vulnerability.  
