Last 30 Days
No notifications
A Common Table Expression (WITH β¦) names a temporary result set you can reference later in the query β like a one-shot view. Add the RECURSIVE keyword and the same construct can walk trees, generate sequences, and traverse graphs.
WITH active_users AS (
SELECT id, email FROM users WHERE is_active
)
SELECT count(*) FROM active_users;Read it as: "define active_users for the duration of this query, then use it like a table." Replaces deeply nested subqueries with a top-down narrative.
WITH
recent_orders AS (
SELECT * FROM orders WHERE created_at >= CURRENT_DATE - INTERVAL '30 days'
),
per_user AS (
SELECT user_id, COUNT(*) AS orders_30d, SUM(amount) AS spend_30d
FROM recent_orders
GROUP BY user_id
)
SELECT u.name, p.orders_30d, p.spend_30d
FROM per_user p
JOIN users u ON u.id = p.user_id
ORDER BY p.spend_30d DESC
LIMIT 20;Each CTE can reference earlier CTEs. The whole thing reads like a paragraph: define A, define B from A, use B at the end.
| CTE | Inline subquery | |
| Reusable in the same query? | β multiple times | β would need to repeat |
| Reads top-down? | β | β inside-out |
| Self-reference (recursion)? | β | β |
| Optimizer can inline? | usually yes (PG 12+, others) | always |
Rule of thumb: reach for a CTE when the subquery is reused, recursive, or makes the query easier to read. For a single one-line subquery, leave it inline.
CTEs aren't just for SELECT β Postgres lets you put DML in them and chain results:
WITH archived AS (
DELETE FROM sessions
WHERE expires_at < now()
RETURNING id, user_id
)
INSERT INTO sessions_audit (id, user_id, archived_at)
SELECT id, user_id, now() FROM archived;One statement, one transaction, no temp table. Killer feature for ETL.
The shape is always:
WITH RECURSIVE name (cols...) AS (
-- 1. anchor: the seed rows
SELECT ... UNION ALL -- (or UNION)
-- 2. recursive step: references "name" itself
SELECT ...
FROM name
JOIN ...
WHERE ... -- termination predicate!
)
SELECT * FROM name;
The engine repeatedly executes the recursive step, feeding the previous iteration's output back in, until it produces zero new rows. Always include a termination condition.
WITH RECURSIVE nums AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM nums WHERE n < 10
)
SELECT * FROM nums;
-- β 1, 2, 3, 4, 5, 6, 7, 8, 9, 10(In Postgres you'd more idiomatically use generate_series(1, 10), but the recursive form works in every engine that supports CTEs.)
Schema: employees(id, name, manager_id).
WITH RECURSIVE org AS (
-- anchor: the CEO (no manager)
SELECT id, name, manager_id, 1 AS level, name::TEXT AS path
FROM employees
WHERE manager_id IS NULL UNION ALL
-- step: each employee whose manager is already in 'org'
SELECT e.id, e.name, e.manager_id, o.level + 1, o.path
' > '
e.name
FROM employees e
JOIN org o ON o.id = e.manager_id
)
SELECT level, path FROM org ORDER BY path;You get every employee plus their depth and reporting chain. Same shape works for category trees, comment threads, file systems, β¦
WITH RECURSIVE ancestors AS (
SELECT * FROM categories WHERE id = 42
UNION ALL
SELECT c.*
FROM categories c
JOIN ancestors a ON c.id = a.parent_id
)
SELECT * FROM ancestors;Walks upward by following parent_id until the root.
WITH RECURSIVE reachable AS (
SELECT src, dst, ARRAY[src] AS visited
FROM edges
WHERE src = 1 UNION ALL
SELECT r.src, e.dst, r.visited || e.dst
FROM edges e
JOIN reachable r ON r.dst = e.src
WHERE e.dst <> ALL(r.visited) -- skip already-visited nodes
)
SELECT DISTINCT dst FROM reachable;
Without the visited array, a cycle would loop forever. Engines also offer a built-in CYCLE clause:
WITH RECURSIVE r AS (...)
CYCLE id SET is_cycle USING path -- Postgres 14+, SQL Server, OracleUNION ALL β fast, the standard choice; you handle dedup yourself if needed.UNION β deduplicates each step, useful when you don't want repeated rows.UNION ALL plus an explicit visited-list β UNION alone won't stop infinite loops because new visited arrays differ each iteration.1. Forgotten termination in a recursive CTE β infinite loop. Most engines have a safety limit (max_recursion) but not all.
2. Old PG (β€ 11) didn't inline CTEs β they were optimization barriers. From PG 12 the planner inlines CTEs unless you write WITH β¦ AS MATERIALIZED.
3. Reusing a CTE many times in a non-inlining engine recomputes it each time β materialize manually with AS MATERIALIZED or a temp table.
4. No indexes on a CTE result β it's a virtual table; if you join it many times to a big table, performance can suffer.
| You want⦠| Pattern |
| Name a subquery for clarity | non-recursive CTE |
| Chain ETL steps in one statement (Postgres) | writable CTE with RETURNING |
| Walk a tree downward | recursive CTE, anchor = roots |
| Walk a tree upward | recursive CTE, anchor = leaf, follow parent_id |
| Generate numbers / dates | recursive CTE or generate_series |
| Graph traversal | recursive CTE + visited array / CYCLE clause |