Last 30 Days
No notifications
These structures solve specific problems much more efficiently than basic ones.
Once the basics (arrays, lists, hash tables, BSTs) hit their limits, specialised structures take over: segment trees for range queries, Union-Find for dynamic connectivity, LRU caches for hot-data eviction, Bloom filters for "probably-yes/definitely-no" membership, and skip lists for ordered storage with simple code. These are the data structures that show up in databases, operating systems, browsers, Redis, and almost every "design X" interview question.
These structures fix specific weaknesses of the basic ones:
| Basic structure | Limitation | Advanced fix |
| array | O(n) range sum | Fenwick / segment tree |
| array | O(n) range min/max | segment tree |
| BST | degenerates to O(n) | AVL / Red-Black |
| hash table | no order, no range | Skip list / sorted map |
| set | huge memory for "is x in?" | Bloom filter |
| map | unbounded growth | LRU cache |
| string | O(n) insert in middle | Rope |
Think of each as the right tool for one specific job — not general-purpose like the basics.
You don't carry a segment tree around for fun. You reach for it when you have range queries + updates on millions of elements. You reach for Union-Find when "are these two things in the same group, dynamically?" comes up. The skill is pattern recognition — reading a problem and noticing "oh, that's an LRU pattern" or "that's a range-sum pattern".
1. Building a segment tree for read-only queries. A simple prefix-sum array does range-sum queries in O(1) if there are no updates. Don't reach for the heavy tool. 2. Using Union-Find without path compression *and* union by rank. Either alone gives O(log n); both together give O(α(n)) ≈ O(1). Skipping one is the most common Union-Find mistake. 3. Sizing the segment tree array as 2n. It must be 4n to be safe for any tree shape. Using 2n works only when n is a power of 2. 4. Treating a Bloom filter as an exact set. It can return *false positives* — "yes, x is in" sometimes lies. It never returns false negatives. Use only when you can tolerate the false-positive rate (or back it with the real source). 5. Building an LRU cache without a doubly linked list. A singly linked list makes "remove this specific node" O(n), defeating the whole point. Hash map → DLL node is the canonical layout. 6. Forgetting lazy propagation on segment trees with range updates. Without it, range update is O(n log n). With lazy propagation, it's O(log n).
`js
class UnionFind {
constructor(n) {
this.parent = Array.from({length: n}, (_, i) => i);
this.rank = new Array(n).fill(0);
}
find(x) { // path compression
if (this.parent[x] !== x) {
this.parent[x] = this.find(this.parent[x]);
}
return this.parent[x];
}
union(x, y) { // union by rank
const rx = this.find(x), ry = this.find(y);
if (rx === ry) return false;
if (this.rank[rx] < this.rank[ry]) this.parent[rx] = ry;
else if (this.rank[rx] > this.rank[ry]) this.parent[ry] = rx;
else { this.parent[ry] = rx; this.rank[rx]++; }
return true;
}
connected(x, y) { return this.find(x) === this.find(y); }
}
`
α(n) ≤ 4 for any n ≤ 10⁸⁰. Effectively constant time. Powers Kruskal's MST, dynamic connectivity, "Number of Islands II", "Accounts Merge", "Friend Circles", percolation simulations.
The classic O(1) get/put cache. Each node has \prev\ and \next\ pointers; the hash map maps keys to nodes.
`
get(key) → look up node in map → unlink → relink at head → return value.
put(key, value)
if exists: update + move to head.
else: create new node at head; if size > capacity, evict tail.
`
Why a doubly linked list? You need O(1) removal of an arbitrary middle node (given its pointer from the map) AND O(1) insertion at the head. A singly linked list makes the removal O(n) because you'd have to scan to find the node before it.
Used in: Linux page cache, Redis (\maxmemory-policy allkeys-lru\), browser cache, Memcached.
`js
class SegTree {
constructor(arr) {
this.n = arr.length;
this.tree = new Array(4 * this.n).fill(0);
this.build(arr, 1, 0, this.n - 1);
}
build(arr, node, l, r) {
if (l === r) { this.tree[node] = arr[l]; return; }
const m = (l + r) >> 1;
this.build(arr, 2*node, l, m);
this.build(arr, 2*node + 1, m + 1, r);
this.tree[node] = this.tree[2*node] + this.tree[2*node + 1];
}
query(ql, qr, node = 1, l = 0, r = this.n - 1) {
if (qr < l || r < ql) return 0;
if (ql <= l && r <= qr) return this.tree[node];
const m = (l + r) >> 1;
return this.query(ql, qr, 2*node, l, m)
+ this.query(ql, qr, 2*node + 1, m + 1, r);
}
update(i, val, node = 1, l = 0, r = this.n - 1) {
if (l === r) { this.tree[node] = val; return; }
const m = (l + r) >> 1;
if (i <= m) this.update(i, val, 2*node, l, m);
else this.update(i, val, 2*node + 1, m + 1, r);
this.tree[node] = this.tree[2*node] + this.tree[2*node + 1];
}
}
`
| Operation | Complexity |
| Build | O(n) |
| Point update | O(log n) |
| Range query | O(log n) |
| Range update + lazy | O(log n) |
Replace \+\ with \min\, \max\, \gcd\, XOR, etc. — anything associative works.
Naive range update touches every leaf — O(n). Lazy propagation stores pending operations on internal nodes and pushes them down only when needed.
`
update(l, r, val):
if completely inside: apply to node, mark lazy, return.
push(node) // propagate any pending updates to children
recurse on children
combine(node) = combine(left, right)
`
Now range updates are O(log n) too. This is what makes segment trees competitive-programming gold.
Same prefix-sum + point-update problem, much simpler code, smaller constant factor.
`js
class BIT {
constructor(n) { this.n = n; this.t = new Array(n + 1).fill(0); }
update(i, delta) { // i is 1-indexed
for (; i <= this.n; i += i & -i) this.t[i] += delta;
}
prefix(i) {
let s = 0;
for (; i > 0; i -= i & -i) s += this.t[i];
return s;
}
range(l, r) { return this.prefix(r) - this.prefix(l - 1); }
}
`
The magic is \i & -i\ — extracts the lowest set bit, telling you the parent (going up) or ancestor (going down).
| Feature | Fenwick (BIT) | Segment Tree |
| Code length | ~10 lines | ~40 lines |
| Space | n + 1 | 4n |
| Range update | tricky (needs two BITs) | natural with lazy |
| Operations | invertible only (sum, XOR) | any associative (min, max, gcd) |
| Constant factor | smaller | larger |
Rule of thumb: prefix sums + point updates → BIT. Range min/max or range updates → segment tree.
`
init: bit array of size m, k hash functions.
insert(x): set bits h_1(x), …, h_k(x) to 1.
contains(x):
if any of h_1(x)..h_k(x) is 0 → DEFINITELY not in.
if all are 1 → PROBABLY in (false-positive possible).
`
False-positive rate ≈ \(1 − e^(−kn/m))^k\. Optimal k for given m, n: \k ≈ (m/n) ln 2\. With ~10 bits per element you get ~1% false positives.
Real uses:
Multiple linked lists at different levels of granularity. Each element is randomly promoted to higher levels with p = 1/2.
`
Level 3: ──→ A ───────────→ E ───→ end
Level 2: ──→ A → B ───────→ E ───→ end
Level 1: ──→ A → B → C → D → E ───→ end
Level 0: ──→ A → B → C → D → E → F (all elements)
`
Search starts at the top level and drops down — average O(log n) per operation, worst case O(n) (rare). Simpler than balanced BSTs and famously easy to make lock-free for concurrent access. Used in Redis sorted sets (ZADD/ZRANGE) and LevelDB / RocksDB memtables.
| Property | AVL | Red-Black | |||||
| Balance | strict (height differ ≤ 1) | relaxed (no path > 2× any other) | |||||
| Lookup | slightly faster | slightly slower | |||||
| Insert/delete | more rotations | fewer rotations | |||||
| Used in | DBs, file systems | C++ \std::map\, Java \TreeMap\, Linux scheduler (CFS uses RB tree) | Rule of thumb: read-heavy → AVL. Write-heavy → Red-Black. Both are O(log n) for everything. Advanced: Rope — A Tree of String FragmentsStandard strings are O(n) to insert/delete in the middle. A rope stores text as a balanced binary tree of small fragments. Each internal node caches the total length of its left subtree. | Operation | String | Rope | |
| concat | O(n) | O(log n) | |||||
| insert at i | O(n) | O(log n) | |||||
| delete range | O(n) | O(log n) | |||||
| index access | O(1) | O(log n) | Used by: VS Code's text buffer (a piece-tree variant), Emacs gap buffer evolved variants, IntelliJ IDEA, collaborative editors (Etherpad, Google Docs CRDTs build on similar ideas). Advanced: Comparison + When-to-Use Cheatsheet | Structure | Operation | Time | Reach for it when… |
| Union-Find | union/find | O(α(n)) | dynamic connectivity, MST, "are these in the same group?" | ||||
| Segment tree | range query + update | O(log n) | range min/max/sum WITH updates | ||||
| Fenwick (BIT) | prefix sum + point update | O(log n) | simpler range sums, no range updates | ||||
| LRU cache | get/put | O(1) | bounded cache with recency eviction | ||||
| Bloom filter | contains | O(k) | huge set + false-positives ok | ||||
| Skip list | search/insert/delete | O(log n) avg | sorted set with simple, concurrent code | ||||
| AVL / Red-Black | search/insert/delete | O(log n) | sorted map (use language standard library) | ||||
| Rope | concat/insert | O(log n) | giant editable text | Advanced: Interview Frequency | Structure | Frequency | Typical question |
| LRU Cache | very high | LeetCode 146 — implement | |||||
| Union-Find | very high | Number of Islands II, Accounts Merge, Redundant Connection | |||||
| Segment tree | medium | Range Sum Query - Mutable, Count of Smaller Numbers After Self | |||||
| Fenwick tree | medium | Count of Smaller Numbers After Self (alternative) | |||||
| Bloom filter | system design | "Design URL shortener / spam filter" | |||||
| Skip list | system design / rare coding | "Design Redis sorted set" | |||||
| Rope | rare | "Design a text editor" | |||||
| AVL / RB | conceptual | usually use language built-in |
1. Implement Union-Find with path compression + union by rank. Use it to count islands after a stream of "add land at (r,c)" operations (Number of Islands II). 2. Implement LRU Cache with O(1) get/put using a hash map + doubly linked list. Verify the eviction order on a sequence of operations. 3. Implement a segment tree for range sum with build, point-update, and range-query. Compare correctness against a naive prefix-sum baseline. 4. Implement a small Bloom filter with 3 hash functions on a 1024-bit array. Insert 100 strings, check membership for 200 (100 inserted + 100 random), and measure the false-positive rate.