A buffer overflow writes data past the end of an allocated buffer, overwriting adjacent memory, which on the stack can include the saved return address. Historically this let attackers inject shellcode directly. Modern mitigations (NX/DEP, ASLR, stack canaries) make direct code injection far harder. Return-oriented programming (ROP) sidesteps NX outright and, when paired with an information leak, can get past ASLR and stack canaries as well, by chaining together small snippets of existing legitimate code rather than injecting new code.
Analysis Briefing
- Topic: Buffer overflow exploitation and return-oriented programming
- Analyst: Mike D (@MrComputerScience)
- Context: A structured investigation kicked off by Claude Sonnet 4.6
- Source: Pithy Cyborg | AI News Made Simple
- Key Question: If you can’t inject new code, how does an attacker execute arbitrary logic using only code that’s already there?
The Classic Stack Buffer Overflow Mechanics
The call stack stores local variables, saved registers, and the return address for the current function. When function_a calls function_b, the CPU pushes the return address (where to jump when function_b returns) onto the stack just before function_b sets up its own local variables, so those locals sit at lower addresses than the return address.
#include <string.h>

void vulnerable(char *input) {
    char buffer[64];
    strcpy(buffer, input); // No bounds check: overflows when strlen(input) >= 64.
    // ...
}
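If input holds 64 or more characters (strcpy also copies the terminating null byte), strcpy writes past the end of buffer. A long enough input runs through whatever lies between the buffer and the saved return address and overwrites the return address itself. When vulnerable executes its ret instruction, the CPU loads the overwritten value as the return address and jumps to whatever address the attacker wrote.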
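The overwrite described above can be sketched without corrupting a real stack. The following is a minimal simulation, assuming a hypothetical 80-byte frame laid out as a 64-byte buffer, an 8-byte saved frame pointer, and an 8-byte return address; the layout, the address value, and all function names are illustrative assumptions, and real layouts depend on the compiler and ABI.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame layout, modelled as plain bytes (not a real stack):
 *   [0, 64)  buffer
 *   [64, 72) saved frame pointer
 *   [72, 80) saved return address */
enum { BUF_SIZE = 64, RET_OFFSET = 72, FRAME_SIZE = 80 };

/* Mimics an unchecked copy: writes len bytes starting at the buffer and
 * ignores BUF_SIZE. The clamp to FRAME_SIZE exists only so the simulation
 * itself never writes out of bounds. */
static void unchecked_copy(unsigned char *frame,
                           const unsigned char *input, size_t len) {
    if (len > FRAME_SIZE) len = FRAME_SIZE;
    memcpy(frame, input, len);
}

/* Reads the 8-byte return-address slot out of the modelled frame. */
static uint64_t saved_return(const unsigned char *frame) {
    uint64_t ret;
    memcpy(&ret, frame + RET_OFFSET, sizeof ret);
    return ret;
}

/* Returns 1 if copying len bytes of 'A' changes the saved return address. */
int overflow_changes_return(size_t len) {
    unsigned char frame[FRAME_SIZE] = {0};
    unsigned char input[FRAME_SIZE];
    const uint64_t original = UINT64_C(0x401136); /* made-up address */

    memcpy(frame + RET_OFFSET, &original, sizeof original);
    memset(input, 'A', sizeof input);
    unchecked_copy(frame, input, len);
    return saved_return(frame) != original;
}
```

In this model, overflow_changes_return(64) and overflow_changes_return(72) report no change, while any length of 73 or more clobbers at least one byte of the return-address slot.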
NX (No-Execute) / DEP (Data Execution Prevention) marks the stack as non-executable. The CPU raises a fault if execution ever reaches the stack. Injected shellcode on the stack cannot run.
ASLR (Address Space Layout Randomization) randomizes where the stack, heap, and libraries are loaded. The attacker doesn’t know the address to jump to.
Stack canaries place a random value between local variables and the return address. The function checks the canary before returning. If it’s been modified, the program aborts.
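The canary check can be illustrated with the same kind of byte-level model. This is a rough sketch, assuming a hypothetical frame with the canary placed directly between a 64-byte buffer and the return address; the canary value is fixed here for repeatability, whereas a real one is chosen randomly at process start, and the names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: [0, 64) buffer, [64, 72) canary, [72, 80) return. */
enum { CANARY_OFFSET = 64, RET_OFFSET = 72, FRAME_SIZE = 80 };

/* What one simulated call observed at function exit. */
struct outcome {
    int return_overwritten; /* 1 if the return-address slot changed */
    int canary_tripped;     /* 1 if the epilogue check would abort */
};

struct outcome simulate_call(size_t len) {
    unsigned char frame[FRAME_SIZE] = {0};
    unsigned char input[FRAME_SIZE];
    const uint64_t canary = UINT64_C(0x5be1f0c7a93d2e00); /* made-up value */
    const uint64_t ret = UINT64_C(0x401136);              /* made-up address */
    uint64_t seen_canary, seen_ret;
    struct outcome out;

    /* Prologue: store the canary below the return address. */
    memcpy(frame + CANARY_OFFSET, &canary, sizeof canary);
    memcpy(frame + RET_OFFSET, &ret, sizeof ret);

    /* Function body: an unchecked copy of len bytes of 'A'. */
    memset(input, 'A', sizeof input);
    if (len > FRAME_SIZE) len = FRAME_SIZE; /* keep the model in bounds */
    memcpy(frame, input, len);

    /* Epilogue: compare the stored canary before "returning". */
    memcpy(&seen_canary, frame + CANARY_OFFSET, sizeof seen_canary);
    memcpy(&seen_ret, frame + RET_OFFSET, sizeof seen_ret);
    out.canary_tripped = (seen_canary != canary);
    out.return_overwritten = (seen_ret != ret);
    return out;
}
```

The point of the model is the ordering: a linear overflow has to pass through the canary bytes before it reaches the return address, so in this sketch every run that overwrites the return address also trips the check.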
How ROP Bypasses All Three Mitigations
ROP does not inject new code. It chains together existing code snippets ending in ret instructions, called gadgets, that are already present in the program’s binary and loaded libraries (libc, etc.).
A gadget looks like this:
pop rdi ; load value into register
ret ; jump to next gadget address
The attacker constructs a ROP chain: a sequence of gadget addresses, interleaved with any data the gadgets pop, laid out on the stack. When the corrupted return address triggers, the CPU jumps to the first gadget. The gadget executes its few instructions and hits ret, which pops the next gadget address from the stack and jumps there. Each gadget does one small operation (set a register, load a system call argument). With enough gadgets to choose from, the chain can express arbitrary computation.
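The way ret drives a chain forward can be shown with a toy interpreter. This is a simulation only, assuming three invented gadget "addresses" and a two-register CPU model; nothing here touches real code addresses, and the names run_chain and struct cpu are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up values standing in for gadget locations in executable memory. */
enum {
    G_POP_RDI = 0x1000, /* pop rdi ; ret */
    G_POP_RSI = 0x2000, /* pop rsi ; ret */
    G_HALT    = 0x3000  /* stands in for the final call target */
};

struct cpu {
    uint64_t rdi, rsi;
    size_t sp;   /* index of the next stack slot to pop */
    int halted;
};

/* Walks the chain the way ret would: pop an address, run that gadget,
 * repeat. Returns 0 on reaching G_HALT, -1 on an unknown address or if
 * the chain runs out of stack. */
int run_chain(const uint64_t *stack, size_t n, struct cpu *c) {
    c->rdi = 0;
    c->rsi = 0;
    c->sp = 0;
    c->halted = 0;
    while (c->sp < n) {
        uint64_t target = stack[c->sp++];      /* ret: pop next address */
        switch (target) {
        case G_POP_RDI:
            if (c->sp >= n) return -1;
            c->rdi = stack[c->sp++];           /* pop rdi */
            break;
        case G_POP_RSI:
            if (c->sp >= n) return -1;
            c->rsi = stack[c->sp++];           /* pop rsi */
            break;
        case G_HALT:
            c->halted = 1;
            return 0;
        default:
            return -1;                         /* not a known gadget */
        }
    }
    return -1; /* ran off the end without reaching G_HALT */
}
```

A chain such as { G_POP_RDI, 0xdeadbeef, G_POP_RSI, 0, G_HALT } leaves rdi holding 0xdeadbeef and rsi holding 0 before halting, which mirrors how a real chain alternates gadget addresses with the values those gadgets pop.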
NX is defeated because the gadgets live in legitimate executable code segments. ASLR is partially defeated by information leaks: a memory disclosure vulnerability (a format string bug, or an over-read that returns adjacent memory) reveals the address of something inside a loaded library, and because offsets within a library are fixed, the attacker can compute the library base and every gadget address at runtime. Stack canaries can be bypassed if the attacker can read the canary value first and then write the same value back while overflowing.
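The arithmetic behind an ASLR leak is small enough to write out. The sketch below assumes two offsets that are invented for illustration; in practice they would be read from the specific library build in use.

```c
#include <stdint.h>

/* Offsets inside a library file are fixed at build time.
 * Both values below are made up for this example. */
#define LEAKED_SYMBOL_OFFSET UINT64_C(0x84420)
#define GADGET_OFFSET        UINT64_C(0x2a3e5)

/* Base address = runtime address of a known symbol minus its file offset. */
uint64_t library_base(uint64_t leaked_symbol_address) {
    return leaked_symbol_address - LEAKED_SYMBOL_OFFSET;
}

/* Any other location in the same library is the base plus its offset. */
uint64_t gadget_address(uint64_t base) {
    return base + GADGET_OFFSET;
}

/* Sanity check: library bases are page aligned (4 KiB pages assumed). */
int looks_page_aligned(uint64_t base) {
    return (base & UINT64_C(0xfff)) == 0;
}
```

With a leaked address of 0x7f3a1c284420, the base works out to 0x7f3a1c200000 and the gadget to 0x7f3a1c22a3e5. One leaked pointer is enough to recover every address in that library, which is why a single disclosure undoes the randomization for the whole module.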
The attacker typically aims to call execve("/bin/sh", ...) through ROP gadgets, spawning a shell. Finding gadgets is automated by tools like ROPgadget and pwntools.
Modern Defenses Against ROP
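Control Flow Integrity (CFI) instruments the binary, or uses hardware support, so that indirect branches can only reach valid targets. Coarse-grained schemes such as Microsoft’s Control Flow Guard (CFG) and the indirect branch tracking part of Intel CET check only that an indirect call or jump lands on a known valid entry point; they constrain forward edges and do not validate returns. Fine-grained schemes such as LLVM’s CFI additionally check that the target of an indirect call matches the function type expected at the call site.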
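Shadow stacks (the other part of Intel CET) maintain a separate protected copy of return addresses. Before executing ret, the CPU compares the stack return address against the shadow stack copy. If they differ, the CPU raises a fault. This directly prevents ROP chain execution because the attacker’s chain overwrites the regular stack but ordinary stores cannot write to the shadow stack. Arm pointer authentication (PAC) protects the same return edge by a different route: it signs the return address on function entry and verifies the signature before returning, so an overwritten address fails the check.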
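The compare-on-return idea can be modelled in a few lines. This is a simplified sketch, assuming a fixed-depth machine in which both stacks share one index; real hardware keeps a separate shadow stack pointer and protects the shadow pages, and the names do_call and do_ret are invented here.

```c
#include <stddef.h>
#include <stdint.h>

enum { DEPTH = 16 };

struct machine {
    uint64_t stack[DEPTH];  /* ordinary stack: attacker-writable in this model */
    uint64_t shadow[DEPTH]; /* shadow copy: only do_call and do_ret touch it */
    size_t sp;              /* number of live return addresses */
};

/* call: record the return address in both places. Returns -1 if full. */
int do_call(struct machine *m, uint64_t return_address) {
    if (m->sp >= DEPTH) return -1;
    m->stack[m->sp] = return_address;
    m->shadow[m->sp] = return_address;
    m->sp++;
    return 0;
}

/* ret: pop both copies and compare. On a match, stores the target and
 * returns 0. A mismatch (or an empty stack) returns -1, modelling the
 * fault the CPU would raise. */
int do_ret(struct machine *m, uint64_t *target) {
    if (m->sp == 0) return -1;
    m->sp--;
    if (m->stack[m->sp] != m->shadow[m->sp]) return -1;
    *target = m->stack[m->sp];
    return 0;
}
```

Overwriting m.stack[0] between do_call and do_ret makes do_ret fail, because the tampered value no longer matches the shadow copy; an untouched call and return pair succeeds.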
SafeStack (LLVM) separates the stack into a safe stack (return addresses, non-addressable locals) and an unsafe stack (buffers that could overflow). Buffer overflows cannot reach return addresses because they’re on a separate stack.
What This Means For You
- Use memory-safe languages for new code wherever the performance requirements permit it, because Rust, Go, and Java eliminate entire classes of memory corruption vulnerabilities at the language level rather than relying on mitigations.
- Enable all available mitigations when compiling C and C++: -fstack-protector-strong for canaries, -D_FORTIFY_SOURCE=2 for bounds-checked libc wrappers (this takes effect only when optimization is enabled), and -fcf-protection for Intel CET on supported hardware.
- Treat memory disclosure vulnerabilities as critical, not just moderate, because an information leak that reveals a library base address converts ASLR from a strong mitigation to a minor inconvenience and enables ROP chains that would otherwise require brute force.
- Fuzz test all code that processes untrusted input with tools like AFL++ or libFuzzer before shipping, because buffer overflows in parsing code are a common entry point for ROP exploits and fuzzers often find them faster than manual review.
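As a starting point for the fuzzing advice, here is a minimal libFuzzer harness sketch. LLVMFuzzerTestOneInput is the entry point libFuzzer looks for; parse_name is a hypothetical parser invented for this example, written with the bounds check that the strcpy version of vulnerable lacks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parser under test: copies a name field into a 64-byte
 * buffer and rejects input that does not fit.
 * Returns 0 on success, -1 on rejection. */
int parse_name(const uint8_t *data, size_t size, char *out) {
    if (size >= 64) return -1;            /* leave room for the terminator */
    if (size > 0) memcpy(out, data, size);
    out[size] = '\0';
    return 0;
}

/* libFuzzer calls this once per generated input. Crashes and sanitizer
 * reports inside it are recorded as findings. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    char name[64];
    (void)parse_name(data, size, name);
    return 0;
}
```

With clang, a harness like this is typically built with -fsanitize=fuzzer,address so that AddressSanitizer reports an out-of-bounds write the moment it happens. Removing the size check in parse_name is a quick way to confirm the setup catches an overflow.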
