Last 30 Days
No notifications
A BST is a binary tree where for every node: left < node < right.
| Operation | Average | Worst (Skewed) |
| Search | O(log n) | O(n) |
| Insert | O(log n) | O(n) |
| Delete | O(log n) | O(n) |
| Min/Max | O(log n) | O(n) |
A Binary Search Tree is what you get when you combine "binary tree" with "binary search" — a data structure that stays *searchable in O(log n)* but, unlike a sorted array, also lets you insert and delete in O(log n). Every dynamic ordered collection in your standard library — Java's \TreeMap\, C++'s \std::map\, Python's \SortedDict\, the Linux kernel scheduler — is a BST variant under the hood.
For every node \n\:
n.value\.n.value\.`
8 ✓ Valid: every {1,3,4,6,7} < 8 and every {10,14} > 8
/ \
3 10
/ \ \
1 6 14
/ \
4 7
`
A sorted array gives you O(log n) lookup but O(n) insert/delete. A linked list gives you O(1) insert at known position but O(n) lookup. A BST gives you O(log n) for all three — the best of both — by spending O(log n) walks down the tree to maintain order.
Inorder traversal of any BST always gives sorted output. That's the single most useful invariant: when you're stuck, ask "what does inorder give me here?"
1. Validating with parent comparisons only. \node.left.val < node.val && node.right.val > node.val\ is wrong — the property must hold *for the entire subtree*. Use min/max bounds.
2. Forgetting balancing exists. Insert sorted data into a plain BST → degenerate skewed "linked list" → all operations become O(n). A 100k-node skewed tree blows the recursion stack and runs in seconds instead of microseconds.
3. Botching delete. Three cases — leaf, one child, two children — and the two-child case needs the inorder successor. People mix up "swap value, delete successor" with "rebuild whole subtree".
4. Using equality with a tolerance. BSTs assume keys have a strict comparator. Float keys with NaN or epsilon comparisons silently break the property.
5. Confusing "balanced" with "perfect". Balanced means heights differ by ≤ 1 at every node. Perfect means every level full. AVL keeps balanced, not perfect.
Search — O(h)
`js
function search(node, key) {
while (node) {
if (key === node.val) return node;
node = key < node.val ? node.left : node.right;
}
return null;
}
`
Each comparison eliminates ~half the remaining tree (in a balanced BST).
Insert — O(h): search until you hit a \null\ slot, hang the new node there. The structure is preserved automatically.
Delete — O(h), the case that interviewers love:
| Case | Condition | Action |
| 1 | Leaf | Just unlink it |
| 2 | One child | Replace the node with that child |
| 3 | Two children | Find inorder successor (smallest in right subtree), copy its value into this node, then delete the successor (which has at most one child — case 1 or 2) |
Why the inorder successor? It's the smallest value strictly larger than the deleted node — replacing with it preserves "left < node < right" by definition.
This is the single most useful BST identity:
`
inorder(8) → 1, 3, 4, 6, 7, 8, 10, 14
`
Consequences:
next\" = explicit stack simulating inorder.These show up *constantly* in real systems (e.g. "what's the next available time slot?").
Floor(x) — largest key ≤ x:
`js
function floor(node, x, best = null) {
while (node) {
if (node.val === x) return node.val;
if (x < node.val) node = node.left;
else { best = node.val; node = node.right; }
}
return best;
}
`
Ceiling(x) — symmetric.
Inorder successor of a given node:
"Give me all keys in [lo, hi]":
`js
function rangeQuery(node, lo, hi, out = []) {
if (!node) return out;
if (lo < node.val) rangeQuery(node.left, lo, hi, out);
if (lo <= node.val && node.val <= hi) out.push(node.val);
if (node.val < hi) rangeQuery(node.right, lo, hi, out);
return out;
}
`
O(h + k) where k is the number of results. A hash table can't do this efficiently — that's the single biggest reason to pick a BST over a hash map.
Insert 1,2,3,4,5 into a plain BST → straight line. Height = n. All operations O(n). Catastrophic.
AVL trees (Adelson-Velsky and Landis, 1962) — the first self-balancing BST.
height(left) − height(right)\.BF
≤ 1\ at every node.| BF |
| Pattern | Imbalance | Rotation |
| LL | inserted into left-of-left | single right rotation |
| RR | inserted into right-of-right | single left rotation |
| LR | inserted into right-of-left | left-then-right (double) |
| RL | inserted into left-of-right | right-then-left (double) |
Single right rotation (the LL fix):
`
z y
/ \ / \
y T4 → x z
/ \ / \ / \
x T3 T1 T2 T3 T4
/ \
T1 T2
`
AVL gives strict O(log n) and the height is at most ~1.44·log₂ n.
Used by Java \TreeMap\, C++ \std::map\ / \std::set\, the Linux kernel's CFS scheduler.
The five rules (must all hold):
1. Every node is red or black. 2. The root is black. 3. Every NIL leaf is black. 4. A red node's children are both black (no two consecutive reds). 5. Every root-to-NIL path has the same number of black nodes (equal black-height).
Together they cap the height at \2·log₂(n+1)\. Slightly looser balance than AVL means fewer rotations on insert/delete — that's why standard libraries pick RB over AVL.
| AVL | Red-Black | |
| Height bound | 1.44 log₂ n | 2 log₂(n+1) |
| Rotations on insert | up to 2 | up to 2 |
| Rotations on delete | up to log n | up to 3 |
| Search | slightly faster | slightly slower |
| Best for | read-heavy | write-heavy |
In-memory pointer chasing is cheap. Disk seeks are not. A binary tree of 1B keys has height ~30 → 30 disk seeks → milliseconds *each*. Database indexes can't afford that.
B-tree of order m: every internal node has up to m children and m−1 keys. Height becomes \log_m n\ instead of \log₂ n\ — a B-tree of order 100 over 1B keys has height ~5.
B+ tree (the variant databases actually use):
SELECT … WHERE id BETWEEN 100 AND 200\, you've walked a B+ tree.You can attach extra metadata at each node, recomputed on insert/delete, to unlock new queries:
size = 1 + size(left) + size(right)\. Now \select(k)\ (k-th smallest) and \rank(x)\ are both O(log n) — instead of O(k) for inorder.next\, O(h) memory.1. Implement \insert\ and \search\ recursively. Then add \delete\ and handle all three cases — verify on a tree with the two-child case at the root.
2. Write \isValidBST\ using min/max bounds. Stress-test it on a tree where every immediate parent-child pair is fine but a deeper descendant violates the property.
3. Implement \kthSmallest\ two ways: full inorder vs early-stop iterative inorder. Confirm the early-stop is O(h + k).
4. Implement \rangeQuery(lo, hi)\ with the prune-the-recursion trick from the intermediate section. Compare it to scanning a flat array.