Calling free() usually does not return memory to the operating system. It returns memory to your allocator’s internal free list, where it sits available for future malloc() calls within the same process. The OS only reclaims pages when the allocator decides the process holds too much idle heap, which may never happen during a long-running process’s lifetime.
Pithy Cyborg | AI FAQs – The Details
Question: What actually happens to memory when you call free() in C — and why doesn’t the OS get it back immediately?
Asked by: Claude Sonnet 4.6
Answered by: Mike D (MrComputerScience)
From Pithy Cyborg | AI News Made Simple
And Pithy Security | Cybersecurity News
What Your Allocator Actually Does With a Freed Pointer
When you call free(ptr), three things happen in sequence, and none of them involve the OS.
First, the allocator reads metadata stored adjacent to your allocation, typically in a header just before the pointer address, to determine the size of the freed region. This metadata is why writing past the end of a malloc’d buffer is so dangerous: the overflow lands on the header of the next chunk and corrupts the allocator’s bookkeeping, and the crash appears somewhere completely unrelated when that metadata is later read.
Second, the allocator marks that region as free and inserts it into a free list, a linked list of available memory chunks organized by size class. glibc’s ptmalloc, jemalloc, and tcmalloc all use variations of this approach with per-thread caches to reduce lock contention. The freed memory is now available for the next malloc() call that requests a similarly-sized chunk.
Third, the allocator may attempt to coalesce adjacent free blocks into a single larger block. This combats heap fragmentation, where the heap contains plenty of total free bytes but no single contiguous region large enough to satisfy a large allocation request. Fragmentation is one of the primary causes of memory growth in long-running C and C++ services, and leak detectors do not flag it, because every block involved was properly freed.
Why the OS Doesn’t See Your free() Call at All
The operating system manages memory in pages, typically 4KB on x86 and 16KB on Apple Silicon. It has no concept of individual malloc() allocations. It only knows which pages are mapped to your process’s virtual address space.
When your process first calls malloc(), the allocator requests pages from the OS using brk() or mmap() system calls. These calls are expensive relative to a typical allocation, so the allocator requests far more pages than it immediately needs, building a heap pool it manages internally. Subsequent malloc() and free() calls operate entirely within this pool, with zero OS involvement.
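The OS only gets memory back when the allocator explicitly releases it: by calling munmap(), by calling sbrk() with a negative argument to shrink the heap, or by marking pages reclaimable with madvise(). Most allocators do this reluctantly and only under specific conditions. In glibc, for brk-based allocation the free region must sit at the top of the heap and exceed the trim threshold (M_TRIM_THRESHOLD, 128KB by default). Requests at or above the mmap threshold (M_MMAP_THRESHOLD, also 128KB by default) are the exception: glibc gives each one its own mmap() region and unmaps it as soon as it is freed. glibc adjusts both thresholds dynamically unless they are set explicitly. Allocators such as jemalloc add a time-based condition, returning pages only after they have sat unused for a decay interval, so that the syscall overhead is worth paying.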
This is why a C process that builds up 2GB out of many small allocations, frees them, and sits idle can still show close to 2GB of memory usage in top or htop: a single live block near the top of the heap is enough to prevent trimming. Memory forensics practitioners, for example users of Volatility 3, know this well: a process’s memory map long after a deallocation still contains the ghost of what was allocated, which is exactly how memory forensics recovers data from “freed” regions.
How Modern Allocators Handle This Differently in 2026
Not all allocators behave identically, and the differences matter for server software.
glibc’s ptmalloc is the default on most Linux systems. It is conservative about returning memory to the OS and prone to fragmentation under certain allocation patterns, particularly alternating large and small allocations. Long-running services using ptmalloc commonly show monotonically increasing RSS (resident set size) over days or weeks even with no actual memory leak.
jemalloc, used by Firefox and many high-throughput servers, uses size-segregated arenas and is more aggressive about returning pages to the OS via madvise(MADV_FREE). It handles fragmentation better under workloads with mixed allocation sizes. Meta and others have run it in production for years specifically to control heap growth.
tcmalloc, developed at Google and used across their infrastructure, uses per-thread caches aggressively to eliminate lock contention on multi-core systems. It also supports explicit heap profiling via its built-in heap profiler, making memory leak diagnosis substantially easier than with ptmalloc.
For Rust, the ownership model rules out use-after-free and double-free bugs at compile time in safe code, but the allocator behavior underneath is identical: freed memory returns to the allocator’s pool first, not the OS.
What This Means For You
- Never assume free() returns memory to the OS. Monitor RSS and virtual memory separately and understand what each number means for your process.
- Switch allocators before blaming your code. Replacing ptmalloc with jemalloc or tcmalloc has resolved apparent memory leaks in production C++ services without changing a single line of application code.
- Use Valgrind or AddressSanitizer to catch use-after-free bugs, not just memory leaks. Writing to freed memory corrupts allocator metadata and produces crashes far removed from the actual bug site.
- Profile heap fragmentation explicitly with tools like heaptrack or jemalloc’s built-in stats. Fragmentation is invisible to leak detectors but causes real RSS growth in production.
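- For security-sensitive data, explicitly zero memory before freeing, using a call the compiler cannot optimize away, such as explicit_bzero() or memset_s() where available. A plain memset() immediately before free() may be removed as a dead store. Freed memory sits in the allocator’s pool readable by future allocations in the same process, which is a real attack surface in multi-tenant or compromised environments.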
