Notifications

No notifications

Memory Basics

Understanding how data is stored in memory is crucial for understanding why different data structures perform differently.

Types of Memory Allocation

Contiguous Memory (Arrays)

  • Elements stored side by side in memory
  • Fast access via index: address = base + (index × size)
  • Problem: resizing requires copying entire array
Non-Contiguous Memory (Linked Lists)
  • Elements stored at scattered locations
  • Each element has a pointer to the next
  • Easy to insert/delete, but slow random access

Stack vs Heap Memory

  • Stack: Fast, automatic, limited size (local variables, function calls)
  • Heap: Slower, manual/GC managed, large (dynamic allocation)

Pointers & References

  • A pointer stores a memory address
  • A reference is an alias to another variable
  • In JavaScript, objects are passed by reference, primitives by value

Memory Layout

` High Address → Stack (grows down ↓) ... Heap (grows up ↑) Data Segment Low Address → Code Segment `

On this page

Detailed Theory

Before any data structure makes sense, you need a basic mental model of *where* values live in a running program. This page builds that model from the silicon up — starting with "memory is a row of numbered lockers" and ending with cache locality, garbage collection, and the reasons real-world arrays beat real-world linked lists despite identical Big O.

What "Memory" Actually Is

A computer has a long line of bytes called RAM. Every byte has a numbered address (0, 1, 2, ...). When you write let x = 42, the runtime picks an address and writes the bytes for 42 there. x is just a label that remembers the address.

Locker analogy: RAM is a row of numbered lockers. A 32-bit integer needs 4 lockers in a row. The "address" of the integer is just the number of the *first* locker.

That's it. Everything else (stacks, heaps, pointers, GC, cache) is built on top of that one idea.

Stack vs Heap — The Most Important Split

Every running program has two main memory areas:

StackHeap
What lives herelocal variables, function args, return addressesobjects, arrays, anything from new / malloc
Lifetimeuntil the function returnsuntil you (or GC) free it
Speedvery fast (just bump a pointer)slower (allocator searches for a free block)
Sizesmall (~1–8 MB per thread)large (most of RAM)
Layoutstrictly LIFO, contiguous framesscattered, can fragment
Managed bycompiler / runtime, automaticallyyou (C/C++) or a garbage collector (JS, Java, Python, Go)

Stack overflow = recursing too deep, blowing past the stack's small limit. Out of memory = filling up the heap.

Primitives vs References

In most managed languages:

`js let x = 42; let y = x; // y gets a COPY of 42 — independent y = 100; console.log(x); // 42

let a = [1, 2, 3]; // a is really a *pointer* to a heap array let b = a; // b copies the pointer, NOT the array b.push(4); console.log(a); // [1, 2, 3, 4] ← same array! `

The "two variables magically updating together" trap is just: both names hold the same heap address. Once you see this, "pass by reference vs pass by value" stops being mysterious.

Beginner Mistakes to Skip

1. Thinking variables "contain" objects. They don't — they contain *addresses* of objects. 2. Confusing "shallow" and "deep" copy. b = a copies the pointer; b = [...a] copies the top-level cells; nested objects are *still shared*. 3. Setting a variable to null to "free memory". In a GC'd language, that only helps if it removes the *last* reference. Otherwise the GC ignores you. 4. Treating arrays and linked lists as identical. Both store sequences. Memory layout is wildly different and so is performance. 5. Forgetting the call stack uses memory too. Recursion of depth 10⁶ will crash even if you allocate nothing. 6. "Pointers don't exist in JavaScript / Python." They do — every object reference *is* a pointer. The language just hides the syntax.

Intermediate: Contiguous vs Non-Contiguous

Arrays — contiguous. All cells are stored back-to-back. Element i lives at address base + i × sizeof(element). That formula is why arr[i] is O(1) — no traversal, just arithmetic.

Linked lists — non-contiguous. Each node is allocated separately and points to the next. Nodes can be anywhere in the heap. Reaching the i-th element means following i pointers — O(n).

ArrayLinked list
Random access a[i]O(1)O(n)
Insert/delete at endO(1) amortisedO(1)
Insert/delete in middleO(n) (shift elements)O(1) (if you have the node)
Memory overhead per elementtinyone pointer per node (often 8 bytes)

Intermediate: Cache & Locality (Why Big O Lies Sometimes)

CPUs are way faster than RAM. To hide that, they keep small fast caches (L1, L2, L3) and load memory in cache lines of ~64 bytes at a time. So when you read one byte, the next 63 come along *for free*.

  • Spatial locality — accessing X means X+1, X+2, ... are likely cheap.
  • Temporal locality — recently accessed memory stays in cache.
Real-world consequence: an array (contiguous) plays beautifully with the cache. A linked list (scattered) causes a cache miss almost every .next — a miss can cost ~100× more than a hit. That's why a "slow" O(n) array often beats a "fast" O(1) linked list for small-to-medium n.

> Big O ignores constants. Caches are basically a giant constant. Both matter.

Intermediate: The Memory Map of a Process

` high addr ┌────────────────────┐ │ Stack (grows ↓) │ local vars, return addrs │ ↓ ↑ │ free space │ Heap (grows ↑) │ malloc / new / objects ├────────────────────┤ │ BSS │ uninitialised globals │ Data │ initialised globals/statics │ Code (Text) │ the compiled instructions low addr └────────────────────┘ `

Knowing this helps you read crash logs, segfaults, and profilers without panic.

Advanced: Garbage Collection in 90 Seconds

GC's job: find heap memory nothing references anymore, free it.

  • Reference counting (Python, Swift) — every object holds a "how many refs point at me" count. When it hits 0, free. Simple, but circular references leak (A↔B keeps each other's count > 0). Python adds a cycle detector to save you.
  • Mark-and-sweep (JS V8, Java, Go) — start from "roots" (globals, the stack), follow every pointer reachable, mark them alive; sweep up the rest. Handles cycles correctly but introduces brief pauses.
  • Generational (V8, JVM) — most objects die young. Keep a "young" generation collected often and an "old" generation collected rarely. Massive speed win in practice.
You almost never tune the GC by hand. You just need to know it exists so weird "stop the world" pauses don't surprise you.

Advanced: Memory Leaks (Even With GC)

GC frees the *unreachable*. If you keep a reference to something you don't need, the GC can't help.

Classic leaks in JS/web apps:

  • Event listeners never removed (addEventListener without removeEventListener).
  • Closures capturing huge objects: a 5-line callback keeps a 50 MB array alive.
  • Detached DOM nodes still referenced from JS variables.
  • setInterval callbacks never cleared.
  • Module-level caches that grow forever.
Tools: Chrome DevTools → Memory → take a heap snapshot, look for unexpected retainers. WeakMap / WeakSet let you cache without preventing GC.

Advanced: Alignment, Padding, Word Size

CPUs read memory most efficiently at addresses that are multiples of the data size. Compilers therefore insert padding to align fields:

`c struct Example { char a; // 1 byte + 3 padding int b; // 4 bytes (aligned to 4-byte boundary) char c; // 1 byte + 3 padding }; // total = 12, not 6 `

Reorder largest fields first to minimise padding. Doesn't matter day-to-day in JS, matters a lot in C/C++/Rust and any GPU/embedded work.

Advanced: Virtual Memory & Pages

Each process sees its own clean address space, courtesy of the OS:

  • The OS maintains page tables mapping virtual → physical addresses.
  • Memory is split into pages (typically 4 KB).
  • When RAM fills up, idle pages get swapped to disk (incredibly slow — never want to depend on this).
  • Two processes can't accidentally read each other's memory: protection.
  • A page fault = "first time touching this page" — sometimes cheap, sometimes a full disk read.
This is why a malloc may seem instant: actual physical memory is often only allocated when you *write* to it.

Advanced: Memory-Efficient Code (Practical Wins)

1. Prefer arrays / contiguous structures for hot loops — cache-friendly. 2. Pick the smallest dtype that holds your data — int32 over int64, float32 over float64 for ML weights. 3. Reuse buffers in tight loops instead of allocating fresh ones. 4. Beware string immutability — s = s + "x" in a loop is O(n²). Use a list/builder + join once. 5. Use typed arrays in JS (Int32Array, Float64Array) for numeric data — orders of magnitude leaner than number[]. 6. Drop big references early. Set huge intermediate caches to null once you're done. 7. Profile before micro-optimising. Memory bugs come from the *one* line you didn't suspect; intuition is wrong more often than not.

Practice Path

1. Write let a = [1,2,3]; let b = a; b.push(4); — predict a, then run it. Now do the same with let b = [...a]. Explain the difference using addresses. 2. Implement a 10⁶-element array sum and a 10⁶-node linked-list sum. Time both. Explain the gap using cache lines. 3. Cause and then fix a tiny memory leak: a setInterval that captures a huge array; clear it correctly. 4. Read 5 lines of any C struct, count its real size including padding, and verify with sizeof.