Every process on a modern operating system runs in its own virtual address space. The OS and CPU hardware cooperate to translate virtual addresses to physical RAM transparently, isolating processes from each other and enabling each to behave as though it has exclusive access to the full address space. The mechanism is page tables, maintained by the OS and walked by hardware on every memory access.
Analysis Briefing
- Topic: Virtual memory, page tables, and memory protection in operating systems
- Analyst: Mike D (@MrComputerScience)
- Context: A technical briefing developed with Claude Sonnet 4.6
- Source: Pithy Cyborg | AI News Made Simple
- Key Question: How does the CPU know which physical memory address corresponds to the virtual address in your program?
The Address Translation Mechanism
Virtual addresses in a 64-bit process are typically 48 bits wide; on x86-64 the upper 16 bits are not used for translation (they must be copies of bit 47, the so-called canonical form). The hardware divides these 48 bits into four 9-bit page table indices plus a 12-bit page offset.
The CPU holds a control register, CR3, that points to the top-level page table of the currently running process; the Memory Management Unit (MMU) starts every translation there. On a context switch, the OS loads CR3 with the new process's page-table root. This is the hardware mechanism that enforces process isolation: a process can only reach physical pages that its own page tables map.
A page table walk on x86-64 proceeds through four levels:
- PML4 (Page Map Level 4): indexed by bits 47-39
- PDPT (Page Directory Pointer Table): indexed by bits 38-30
- PD (Page Directory): indexed by bits 29-21
- PT (Page Table): indexed by bits 20-12
Each level contains 512 entries of 8 bytes each, fitting precisely in a 4KB page. The final page table entry contains the physical page frame number. The 12-bit offset from the virtual address is appended to get the final physical address.
The TLB: Why Page Table Walks Are Not Slow in Practice
A full four-level page table walk requires four memory accesses before the actual data access. That would make every memory operation five times more expensive. The Translation Lookaside Buffer (TLB) caches recent virtual-to-physical translations in dedicated hardware.
The L1 TLB on a modern CPU holds 64 to 128 entries for 4KB pages. A TLB hit adds essentially zero cycles. A TLB miss triggers a hardware page table walk (on x86, a dedicated page-walker unit does this without OS involvement) and costs roughly 40 to 200 cycles depending on whether the page table entries themselves are resident in the CPU caches.
TLB pressure is a real performance concern for workloads with large, randomly accessed memory. A program accessing 1GB of data with random access patterns will generate frequent TLB misses. Huge pages (2MB or 1GB page sizes) reduce TLB pressure dramatically by covering more memory per TLB entry.
```shell
# "madvise" mode: the kernel uses huge pages only for regions
# that explicitly request them (system-wide setting, needs root)
echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
```

```c
/* Request huge pages for a specific mapping from application code */
madvise(ptr, size, MADV_HUGEPAGE);
```
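Whether TLB misses actually dominate is worth measuring before reaching for huge pages; Linux perf exposes TLB counters (event names vary by microarchitecture, and ./your_program is a placeholder for the workload under test):

```shell
# Count data-TLB loads and misses for a workload; a high miss ratio
# suggests the working set exceeds TLB reach
perf stat -e dTLB-loads,dTLB-load-misses ./your_program
```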
Page Faults, Demand Paging, and Memory Protection
When the CPU walks the page table and finds an entry marked not-present, it raises a page fault exception. The OS page fault handler runs in kernel mode, determines why the page is missing, and either services the fault or sends SIGSEGV.
Demand paging uses this mechanism intentionally. When a process calls malloc or maps a file, the OS does not immediately allocate physical pages. It creates virtual memory mappings with not-present page table entries. Pages are only allocated when first accessed, at which point the page fault handler allocates a physical page and updates the page table entry.
This is why a process that allocates 4GB but only touches 100MB uses only 100MB of physical RAM. It is also why the first access to any page is slower than subsequent accesses.
Memory protection bits in each page table entry enforce read, write, and execute permissions. A stack page is readable and writable but not executable. A code page is readable and executable but not writable (enforcing W^X). Attempting to write to a read-only page raises a protection fault. This is the hardware mechanism behind NX/XD bits that prevent code injection attacks by marking the stack and heap non-executable.
Copy-on-write (COW) uses page protection for efficient process forking. When a process calls fork(), the OS marks all pages in both parent and child as read-only. Reads proceed normally. The first write from either process triggers a protection fault. The fault handler copies the page, marks the copy writable, and updates the page table. The copy only happens when needed.
What This Means For You
- Use huge pages for workloads with large, randomly accessed memory, because reducing TLB miss rate on a database or numerical computing workload can improve throughput by 10 to 30% without changing any algorithmic complexity.
- Understand that virtual memory size and RSS (resident set size) are different numbers, and that your monitoring should track RSS rather than virtual size because only resident pages consume physical RAM.
- Use mmap with MAP_POPULATE when you need predictable first-access latency, because demand paging defers physical allocation and the first access to each page otherwise incurs a fault, adding unpredictable latency spikes in latency-sensitive code.
- Enable Address Space Layout Randomization (ASLR) and W^X on any system running untrusted code, because W^X blocks direct code injection while ASLR randomizes where code and data land in the virtual address space, together making injection attacks dramatically harder.
Enjoyed this deep dive? Join my inner circle:
- Pithy Cyborg | AI News Made Simple → AI news made simple without hype.
