Last 30 Days
No notifications
A graph G = (V, E) is a set of vertices (nodes) connected by edges (links). Graphs model relationships like social networks, maps, and dependencies.
| Type | Description |
| Directed | Edges have direction (A → B) |
| Undirected | Edges go both ways (A — B) |
| Weighted | Edges have costs/weights |
| Unweighted | All edges equal |
| Cyclic | Contains cycles |
| Acyclic | No cycles (DAG) |
Graphs are the most expressive data structure in computer science. Trees are graphs. Linked lists are graphs. The internet is a graph. Your friends list, every road map, every dependency in your build system, every lookup in a CDN — all graphs. Once you can comfortably model a problem as "nodes and edges", a huge chunk of "I have no idea where to start" disappears.
A graph G = (V, E) is a set of vertices (nodes) and a set of edges that connect pairs of vertices. That's the entire definition. Everything else — directed/undirected, weighted/unweighted, cyclic/acyclic — is a *property* of the graph, not a different kind of object.
`
A ───── B
│ /│
│ / │
│ / │
D ──C E
`
The vocabulary you'll use over and over:
| Term | Meaning |
| Vertex / node | a single point |
| Edge / arc | connection between two vertices |
| Degree | how many edges touch a vertex |
| Adjacent / neighbour | two vertices joined by an edge |
| Path | sequence of vertices joined by edges |
| Cycle | path that returns to its start |
| Connected component | maximal set of mutually reachable vertices |
| DAG | directed graph with no directed cycle |
The mental picture: a city's subway map. Stations are vertices. Lines between stations are edges. Travel-time is edge weight. "Can I get from A to B?" → connectivity. "What's the fastest route?" → shortest path. "What stations do I have to keep open so the network stays connected?" → articulation points. Almost every graph problem has a subway-map analogue.
1. Confusing directed and undirected. In an undirected graph the edge \{A,B}\ lets you go both ways. In a directed graph, \(A,B)\ does NOT imply \(B,A)\. Half of "my BFS is wrong" bugs come from this.
2. Forgetting the visited set. Without it, BFS/DFS on any graph with a cycle loops forever. *Trees* don't need a visited set. General graphs always do.
3. Treating a tree algorithm as a graph algorithm. A tree of n nodes has exactly n−1 edges and no cycles. A general graph can have up to n(n−1)/2 edges and many cycles. Tree-DFS habits that omit visited tracking break instantly on graphs.
4. Confusing in-degree and out-degree on directed graphs. A "source" has in-degree 0, a "sink" has out-degree 0. Topological sort cares about in-degree.
5. Picking the wrong representation. A 10⁶-node graph with adjacency *matrix* needs 10¹² cells — terabytes of RAM. Adjacency *list* uses O(V+E).
6. Believing a graph is one connected blob. Most real graphs have multiple connected components. Run BFS/DFS in a loop over all unvisited nodes if you need to count components.
Directed (digraph) — edges have a direction. Each edge \(u,v)\ is an ordered pair. Web links, Twitter follows, build dependencies.
Undirected — edges are unordered \{u,v}\ pairs. Friendships on Facebook, road segments, electrical wires.
Weighted — edges carry a number (cost, distance, capacity, latency). Used by Dijkstra, MST, max-flow.
Unweighted — equivalent to weight 1 everywhere. BFS finds shortest paths.
Two identities worth memorising:
| E |
| E |
| Family | Defining property | Where it shows up | ||||
Complete \Kₙ\ | every pair of vertices is connected | small adversarial inputs, theoretical bounds | ||||
| Bipartite | vertices split into two sets U, V; every edge crosses sides | scheduling, matching, "two-coloring" problems | ||||
| Tree | connected, n−1 edges, no cycle | parent-child hierarchy | ||||
| Forest | disjoint union of trees | multiple components | ||||
| DAG | directed, no directed cycle | dependencies, version history, topological sort | ||||
| Planar | drawable without edge crossings | maps, circuit boards | Bipartite checking (very common interview pattern): try to 2-colour the graph with BFS — assign color 0 to start, color 1 to all neighbours, color 0 to their neighbours, etc. If any edge connects two same-colored vertices, it's not bipartite. Theorem: a graph is bipartite iff it has no odd-length cycle. DAG fact: a directed graph has a topological ordering if and only if it is a DAG. Intermediate: Dense vs Sparse — the Representation DecisionSparse: \ | E | ≈ | V |
E ≈ V ²\ — small theoretical examples and adjacency-matrix problems. Operation Adjacency matrix Adjacency list
Space O(V²) O(V+E)
Edge exists? O(1) O(deg(u))
All neighbours of v O(V) O(deg(v))
Add edge O(1) O(1)
Remove edge O(1) O(deg)
Default to adjacency list. Use a matrix only when V is small (< ~1000) or you genuinely need O(1) edge lookups (Floyd-Warshall).
Advanced: Connected Components and Strongly Connected Components
Connected component (undirected): maximal set of mutually reachable vertices. Find them by running BFS or DFS from each unvisited node — the number of times you launch a BFS is the number of components.
Strongly Connected Component (SCC) (directed): maximal set where *every vertex can reach every other vertex along directed edges*. Two famous algorithms:
- Kosaraju's — two DFS passes: one on G to record finish times, one on Gᵀ (edge-reversed graph) in reverse-finish order. Each DFS tree in the second pass is one SCC. O(V+E).
- Tarjan's — single DFS using discovery time and "low-link" values; SCCs pop off a stack. O(V+E), one pass, slightly trickier code.
SCC condensation: collapse each SCC into one node — the result is always a DAG. This trick reduces many "directed graph" problems to "DAG" problems.Advanced: Eulerian and Hamiltonian Walks
Eulerian path — visits every *edge* exactly once. Exists iff the graph is connected and has 0 or 2 vertices of odd degree.
Eulerian circuit — Eulerian path that returns to its start. Exists iff connected and all vertices have even degree.
Algorithm: Hierholzer's, O(V+E). Useful for: solving the Bridges of Königsberg, route inspection, DNA sequencing fragment assembly.
Hamiltonian path — visits every *vertex* exactly once.
Hamiltonian circuit — Hamiltonian path that returns to its start.
NP-complete. No polynomial algorithm known. The decision version of TSP. For n ≤ ~20, bitmask DP solves it in O(2ⁿ · n²); above that, you're in heuristic / approximation territory.
The single-letter difference (edges vs vertices) flips the problem from "easy with a degree check" to "computationally intractable". Worth pausing on.
Advanced: Cycle Detection — Two Different Algorithms
Undirected graph: DFS while tracking the parent — if you encounter a visited neighbour that isn't the parent, you've found a back-edge → cycle. Or: Union-Find on each edge — if the endpoints are already in the same set, the edge closes a cycle.
Directed graph: DFS with three colors — WHITE (unvisited), GRAY (in current DFS path), BLACK (finished). Encountering a GRAY node during DFS = back edge = cycle. Alternative: Kahn's algorithm (BFS topological sort) — if it can't process all V vertices, the graph has a cycle.
The undirected trick (parent check) does NOT work on directed graphs. Always pick the right one.
Advanced: Real-World Graph Applications
Domain Vertices Edges What you compute
Social networks users friendships "people you may know", influencer detection (PageRank)
GPS / maps intersections road segments shortest path (Dijkstra, A*)
Web search pages hyperlinks PageRank, crawl scheduling
Build systems targets dependencies topological order
Compilers basic blocks control flow dataflow analysis, register allocation
Biology proteins interactions community detection
Recommender systems users + items interactions bipartite matching, link prediction
Network routing routers links OSPF (Dijkstra), BGP (path-vector)
If you can phrase the problem as "what nodes, what edges?", you're 80% of the way to a solution.
Advanced: Graph Coloring
Assign colors to vertices such that no two adjacent vertices share a color.
- Chromatic number χ(G) — the minimum colors needed.
- Bipartite ⇔ 2-colorable. That's the same theorem twice.
- Greedy coloring uses at most Δ+1 colors (Δ = max degree). Trivial to implement.
- General optimal coloring is NP-hard.
- Real-world uses: register allocation in compilers, exam-timetable scheduling, frequency assignment for cell towers, Sudoku.
Practice Path
1. Build an adjacency list for an undirected graph and write \addEdge\, \neighbours(v)\, \hasEdge(u,v)\. Verify the handshaking lemma on it.
2. Count the connected components of an undirected graph with an outer loop + DFS / BFS.
3. Implement bipartite checking with BFS 2-coloring. Test on a 4-cycle (bipartite) and a 5-cycle (not bipartite).
4. Detect a cycle in a directed graph using DFS with WHITE / GRAY / BLACK colors. Then redo it with Kahn's BFS.