Notifications

No notifications

/Phase 5

Graph Representation

Graph Representation

How you store a graph impacts performance. Two main approaches:

Adjacency Matrix

A 2D array where matrix[i][j] = 1 if edge exists between i and j.

ProsCons
O(1) edge lookupO(V²) space
Simple implementationWasteful for sparse graphs
Good for dense graphsAdding vertex is expensive

Adjacency List

Each vertex stores a list of its neighbors.

ProsCons
O(V + E) spaceO(V) edge lookup
Efficient for sparseSlightly complex
Easy to iterate neighbors

When to Use What?

  • Dense graph (many edges): Matrix
  • Sparse graph (few edges): List
  • Most real-world graphs are sparse → use adjacency list

On this page

Detailed Theory

Once you've decided "this is a graph problem", the very next decision is how to store it in memory. This choice silently determines whether your algorithm runs in microseconds or melts the JVM. Get the representation right, and BFS / DFS / Dijkstra all become 10-line functions. Get it wrong, and you'll be debugging memory blowups instead of solving the problem.

What "Representation" Actually Means

A graph is an abstract object — a set of vertices and a set of edges. To compute with it, you need a concrete data layout: which array, which hash map, which pointers. The three you'll see 99% of the time:

` Adjacency Matrix Adjacency List Edge List A B C D A → [B, C] (A,B) A [ 0 1 1 0 ] B → [A, D] (A,C) B [ 1 0 0 1 ] C → [A] (B,D) C [ 1 0 0 0 ] D → [B] D [ 0 1 0 0 ] `

The same graph, three layouts, three completely different performance profiles.

Daily Mental Model: Phonebook vs Spreadsheet

  • Adjacency matrix = a giant N×N spreadsheet where cell (i,j) tells you "is i connected to j?". Instant lookup, but mostly empty cells if friendships are rare.
  • Adjacency list = a phonebook: under each person's name, the list of people they know. Compact, but checking "do A and B know each other?" means scanning A's list.
  • Edge list = a flat log of every friendship: a CSV file. Useless for "who knows A?", perfect for "sort all friendships by closeness".
Pick the layout that makes your most frequent operation cheap.

Beginner Mistakes to Skip

1. Defaulting to the adjacency matrix. It feels simpler — \adj[u][v]\ looks clean. But for V = 100,000 you've just asked for a 10-billion-cell array. Use the list. 2. Forgetting that an undirected edge needs two list entries. \addEdge(u,v)\ must push v into adj[u] AND push u into adj[v]. Otherwise BFS from one side won't see the other. 3. Using objects with non-numeric keys when V is dense and 0..N-1. A plain array of arrays beats a hash map of arrays — better cache behaviour and simpler code. 4. Storing weights in the wrong place. With a list, each entry should be an object/tuple \{to, weight}\ or \[to, weight]\. Don't sneak weights into a parallel array indexed by edge order — they'll desync the moment you add a vertex. 5. Building an explicit graph for grid problems. A 1000×1000 grid is a graph of 1M cells with 4M edges. Don't materialise it — generate the four neighbours on demand inside BFS. 6. Mixing 0-indexed and 1-indexed vertices. Most LeetCode-style inputs are 0-indexed. Many textbook descriptions are 1-indexed. Decide once and stick to it; off-by-ones here are *brutal* to debug.

Intermediate: Adjacency Matrix in Detail

A 2D array \adj[V][V]\ where \adj[i][j]\ is 1 (or weight) if edge exists, 0 (or ∞) otherwise.

`js function buildMatrix(V, edges) { const adj = Array.from({length: V}, () => Array(V).fill(0)); for (const [u, v, w = 1] of edges) { adj[u][v] = w; adj[v][u] = w; // undirected — drop this line for directed graphs } return adj; } `

OperationCost
Edge exists?O(1)
All neighbours of vO(V)
Add vertexO(V²) (resize)
Add edgeO(1)
SpaceO(V²)

When to use it: dense graphs (E close to V²), V ≤ ~1000, or algorithms that genuinely need O(1) edge lookups (Floyd-Warshall is the textbook example).

Intermediate: Adjacency List in Detail

Each vertex stores a list of its neighbours.

`js function buildList(V, edges) { const adj = Array.from({length: V}, () => []); for (const [u, v, w = 1] of edges) { adj[u].push([v, w]); adj[v].push([u, w]); // drop for directed } return adj; } `

For string-labelled vertices, swap the array of arrays for a \Map\:

`js const adj = new Map(); function addEdge(u, v) { if (!adj.has(u)) adj.set(u, []); if (!adj.has(v)) adj.set(v, []); adj.get(u).push(v); adj.get(v).push(u); } `

OperationCost
Edge exists?O(deg(u))
All neighbours of vO(deg(v))
Add vertexO(1)
Add edgeO(1)
SpaceO(V + E)

When to use it: the answer to "which representation should I default to?" 90% of the time.

Intermediate: Edge List in Detail

A flat array of triples \[u, v, w]\.

`js const edges = [ [0, 1, 5], [1, 2, 3], [0, 2, 8], ]; `

OperationCost
Edge exists?O(E)
All neighbours of vO(E)
Add edgeO(1)
SpaceO(E)
Sort by weightO(E log E) — natural

When to use it: Kruskal's MST (you sort by weight and process edges in order) and Bellman-Ford (you iterate every edge V−1 times). Both consume edges sequentially and never query "neighbours of v".

Intermediate: How to Store Weights in Each

RepresentationLayout
Matrix\adj[u][v] = weight\, use \Infinity\ (or \Number.MAX_SAFE_INTEGER\) for "no edge"
Listeach entry is \[neighbour, weight]\ or \{to, weight}\
Edge listeach entry is \[u, v, weight]\

A common subtle bug: using \0\ to mean "no edge" in the matrix. Then a real zero-weight edge becomes invisible. Always reserve a sentinel like \Infinity\.

Advanced: Implicit Graphs — Don't Build What You Can Generate

Many "graph" problems never construct an actual graph object. The graph is implicit in the problem.

  • Grid BFS: vertices are cells \(r,c)\; neighbours are \(r±1, c)\ and \(r, c±1)\ generated on the fly. A standard 4-direction array \[[1,0],[-1,0],[0,1],[0,-1]]\ does the job.
  • Word ladder: vertices are words; two words are neighbours iff they differ by one character.
  • State-space search: each game state is a vertex; legal moves generate neighbours.
  • Number transformation puzzles: each integer is a vertex; allowed operations generate the next states.
The win: zero memory for graph storage. The trade-off: you must compute neighbours efficiently — for word ladder, pre-build a "pattern → words" map (\h*t → hat, hot, hit\) so neighbour generation is O(L · 26) instead of O(N · L) per vertex.

Advanced: Compressed Sparse Row (CSR) — How Real Graph Libraries Store Edges

For massive sparse graphs (10⁹ edges), even an adjacency list with per-vertex JS arrays wastes memory on object headers and pointer indirection. CSR packs everything into two flat arrays:

` neighbours = [1, 2, 0, 3, 0, 3, 1, 2] // all neighbour ids concatenated offsets = [0, 2, 4, 6, 8] // where each vertex's slice starts `

To get neighbours of vertex \v\: slice \neighbours[offsets[v] .. offsets[v+1])\. Constant cache misses, no pointer chasing, dense in memory. This is how NetworkX, graph-tool, CUDA graph kernels, Pregel, and Apache Giraph store their data.

Trade-off: it's read-only-friendly. Inserting a new edge means rebuilding the offset array. Not for streaming graphs; ideal for analytics on a frozen snapshot.

Advanced: Object-Oriented Graphs (Node + Edge Classes)

When edges and vertices carry rich metadata (cost, capacity, reverse-edge pointer, residual capacity, …), the OOP model is cleanest:

`js class Node { constructor(id) { this.id = id; this.edges = []; } } class Edge { constructor(from, to, weight) { this.from = from; this.to = to; this.weight = weight; } } `

Used in: max-flow implementations (where each edge needs a reference to its reverse for residual updates), road-network simulators, knowledge-graph engines.

Advanced: Memory Reality Check

Quick math for V = 10,000:

  • Matrix (32-bit ints): 10⁴ × 10⁴ × 4 bytes = 400 MB.
  • Adjacency list with 50,000 edges: ~50,000 entries × ~24 B (V8 small array overhead) ≈ 1.2 MB.
  • CSR for the same: 50,000 × 4 + 10,001 × 4 ≈ 240 KB.
For V = 1,000,000 (think Twitter follow graph), the matrix isn't an option — you'd need 4 TB. List or CSR is the only path.

Advanced: Choosing the Right Representation per Algorithm

AlgorithmBest representationWhy
BFS / DFSadjacency listiterates neighbours O(deg)
Dijkstra (heap)adjacency listiterates outgoing edges per relaxation
Bellman-Fordedge listscans all edges V−1 times
Floyd-Warshalladjacency matrixconstant-time \dist[i][j]\ access
Kruskal's MSTedge listsort by weight, then iterate
Prim's MSTadjacency listgrow the tree neighbour-by-neighbour
Max-flow (Edmonds-Karp / Dinic)OOP edges with reverse pointerresidual graph mutation
Grid BFS / DFSimplicit (no graph)direction arrays

Practice Path

1. Write three constructors — \buildMatrix\, \buildList\, \buildEdgeList\ — and confirm they all describe the same 6-vertex graph by printing neighbours of every vertex. 2. Convert between representations: write \listFromMatrix\, \matrixFromList\, \edgeListFromList\. Verify round-trips preserve the graph. 3. Solve "number of islands" on a grid using the implicit-graph BFS pattern with a direction array — never construct an explicit graph. 4. Build a CSR representation from an adjacency list and write \neighbours(v)\ against it. Compare memory footprint via \process.memoryUsage()\ for a 100k-vertex random graph against the plain adjacency-list version.