Notifications

No notifications

/Phase 4

Tries (Prefix Trees)

Tries — Prefix Trees

A Trie (pronounced "try") is a tree-like data structure used for efficient string searching and prefix matching.

Structure

  • Each node represents a character
  • Root is empty
  • Path from root to a node spells a prefix
  • Nodes marked as "end" represent complete words

Operations

OperationTimeDescription
Insert wordO(m)m = word length
Search wordO(m)Exact match
Prefix searchO(m)All words with prefix
Delete wordO(m)Mark as non-end

Applications

  • Autocomplete: Search suggestions
  • Spell Checker: Find similar words
  • IP Routing: Longest prefix match
  • Word Games: Scrabble, Boggle solvers
  • Phone Directory: Contact search

Space

Worst case: O(alphabet_size × max_length × num_words). Can be optimized with compressed tries.

On this page

Detailed Theory

A trie (pronounced "try", from retrieval) is the data structure for prefix-based string queries — autocomplete in the search bar, the spell-check underline in your editor, the routing table inside every internet router, the dictionary check inside word games. It's a hash table's underwhelming cousin for exact lookup, but its superpower is "give me everything that starts with…", which a hash table simply can't do efficiently.

What a Trie Actually Is

A trie is a tree where each edge represents a single character and the path from the root to a node spells out a string. A boolean flag at each node says "a word ends here". The root is the empty string.

` (root) / | \ a b c / \ p n a a / \ p* t*d*t r l s* s* | e*

* = isEndOfWord Words inserted: app, apt, and, apple, bats, cars `

Every node stores:

  • A map of children (character → child node).
  • An isEndOfWord flag.
Note carefully: \isEndOfWord\ is independent of "has children". "app" can be a word AND have descendants like "apple" — the node for "app" has both \isEndOfWord = true\ AND a child 'l'.

Daily Mental Model: The Phone Book Index Tabs

Picture a paper dictionary with thumb-tabs: A, B, C, …. Inside the A section there are sub-tabs Aa, Ab, …, Ap. Inside Ap you have App, Apt, … At each level you skip to the right tab in O(1) and the depth equals the length of the prefix you've typed. That's exactly what a trie does — except the "tabs" are pointers in a hash map.

Beginner Mistakes to Skip

1. Conflating "has children" with "isEndOfWord". "app" being a word doesn't preclude "apple". Always store the flag separately. 2. Allocating a 26-slot array for Unicode. That works for the LeetCode "lowercase English only" problems and breaks the moment a user types "café" or an emoji. For real apps, use a hash map. 3. Forgetting to clean up empty branches on delete. If you only flip the flag back to false, the trie grows forever. Recursive bottom-up cleanup: remove a child if it has no \isEndOfWord\ and no descendants. 4. Confusing trie with suffix tree. A trie of *words* answers "does this prefix exist?". A suffix tree of *one string* answers "does this substring exist?". Different problem, different structure. 5. Reaching for a trie when a HashSet would do. If you only need exact-match dictionary lookup, a hash set is faster and uses less memory. The trie is for *prefix operations*. 6. Storing the full word in every node. Wastes memory. Either reconstruct it from the path during DFS, or store it only at end-of-word nodes if you need it for collection.

Intermediate: Trie vs Hash Table

FeatureTrieHash Table
Exact lookupO(m)O(m) average, O(n) worst
Prefix search ("startsWith")O(m)O(n · m) — must scan every key
Autocomplete (all words with prefix)natural — DFS from prefix nodenot designed for it
Sorted iterationfree (DFS yields lexicographic order)requires separate sort
Worst-case boundO(m) guaranteedO(n) under adversarial inputs
Spacemany pointers per nodecompact

The key sentence: trie performance depends on word length, not dictionary size. With 100k words averaging 8 chars, lookup is O(8). The hash table is also O(8) — but the *prefix query* on the hash table is O(800k).

Intermediate: The Four Operations

All four operations are O(m) where m is the word length. Independent of n (the number of stored words).

`js class Trie { constructor() { this.root = { children: {}, end: false }; }

insert(word) { let node = this.root; for (const ch of word) { if (!node.children[ch]) node.children[ch] = { children: {}, end: false }; node = node.children[ch]; } node.end = true; }

search(word) { const node = this.#walk(word); return !!node && node.end; }

startsWith(prefix) { return !!this.#walk(prefix); }

#walk(s) { let node = this.root; for (const ch of s) { if (!node.children[ch]) return null; node = node.children[ch]; } return node; } } `

Delete is the only one that takes thought: walk down to the end node, set \end = false\, then on the way back up, drop a child whenever it has no end flag and no children of its own.

Intermediate: Children — Array vs Hash Map

Array of 26 (or 256)Hash map
LookupO(1), cache-friendlyO(1) average, hash overhead
Memory26 pointers per node alwaysonly allocated children
Alphabetsmall, fixed (a-z)any (Unicode, mixed case)
Best fordense LeetCode triesreal-world text

The array version is dramatically faster for tight loops over English (e.g., Word Search II on LeetCode) because of cache locality. The hash map version wins everywhere else.

Advanced: Compressed Tries (Patricia / Radix Trie)

A standard trie wastes nodes on long single-child chains — e.g. inserting only "banana" creates 6 nodes in a row, each with one child. A radix trie collapses any chain of single-child nodes into one edge labelled with the whole substring.

` Plain trie: Radix trie: t t e est | / \ s ing * t * `

Used in:

  • Linux kernel routing table (lpc-trie / poptrie)
  • Redis for sorted set indices in some versions
  • Ethereum state tree (Merkle Patricia Trie)
  • Many full-text search indexes
A second variant, the ternary search trie, gives each node three children (less / equal-and-continue / greater) — a memory-friendly hybrid of trie and BST.

Advanced: The Bitwise Trie for Max XOR

For numeric problems involving XOR, treat each integer as its 32-bit binary string and insert it into a binary trie (each node has children 0 and 1).

To find the maximum \a XOR b\ for some \a\ against everything in the trie:

` for each bit of a, from MSB down: prefer the OPPOSITE bit child if it exists (1 XOR 0 = 1, that bit contributes 2^k) otherwise take the same-bit child `

Insert n numbers in O(32 n) and answer each "max XOR with set" query in O(32). This collapses an O(n²) brute-force into O(n).

Advanced: Suffix Tries and Suffix Trees

If you insert every suffix of one string into a trie, you can answer "is P a substring of S?" in O(

P
) — substring search becomes prefix search of the suffix trie.

  • Suffix trie — naive: O(n²) nodes, O(n²) space.
  • Suffix tree — compressed version, O(n) nodes and O(n) build time with Ukkonen's algorithm. Powers longest-repeated-substring, longest-common-substring, and tools like \grep -P\.
For most modern text-search systems, suffix arrays + LCP arrays have replaced suffix trees because they use less memory with comparable performance.

Advanced: Real-World Applications

  • Autocomplete with frequency ranking — at each end-node store usage count; on a prefix query, DFS the subtree, collect end nodes, return the top-K by frequency. Google's search box, IDE intellisense.
  • Spell checker — DFS the trie with edit distance budget (insert/delete/substitute); the trie prunes entire branches the moment the budget would be exceeded.
  • IP routing — longest prefix match — bitwise radix tries store CIDR blocks; routers walk the trie bit-by-bit and take the deepest match. Cisco/Juniper hardware uses a TCAM variant of this idea.
  • Word games (Boggle, Scrabble, Wordle solvers) — DFS from each board cell, prune the recursion the instant the prefix isn't in the trie. Cuts the search space by orders of magnitude.
  • T9 / swype keyboards — trie keyed on digit sequences, branches expanded by frequency.
  • Browser URL autocomplete — prefix trie of visited domains.

Advanced: The Interview Pattern Cheat Sheet

  • Implement Trie (insert/search/startsWith) — must-know.
  • Word Search II — build a trie of the target words, DFS the board with the trie as the pruner. Without it, you'd run a separate DFS per word; with it, you do one DFS for all of them.
  • Add and Search Word with \.\ wildcard — DFS that branches into all children when the current char is \.\.
  • Word Break — DP \dp[i]\ = "can s[0..i) be segmented?", and use the trie to enumerate matching prefixes from each position.
  • Replace Words — for each word, walk the trie until you hit an end-of-word; that's the shortest root prefix.
  • Maximum XOR of Two Numbers in an Array — bitwise trie + greedy opposite-bit traversal.
  • Palindrome Pairs — for each word, store its reverse in a trie; walk the trie and check the unconsumed remainder for palindromicity.

Practice Path

1. Implement \Trie\ with \insert\ / \search\ / \startsWith\. Verify "app" can coexist with "apple" — both \search("app")\ and \search("apple")\ return true. 2. Add an \autocomplete(prefix, k)\ method that returns up to k words starting with \prefix\. Use DFS from the prefix node. 3. Solve Word Search II: build a trie from the word list, then DFS the 2D board pruning whenever the current path leaves the trie. 4. Solve Maximum XOR of Two Numbers in an Array using a bitwise trie. Confirm the O(32 n) bound by comparing against the O(n²) brute force on a 10k-element input.