Notifications

No notifications

/Phase 4

Window Functions

Window functions compute an aggregate per row without collapsing rows the way GROUP BY does. They unlock running totals, rankings, top-N-per-group, lag/lead comparisons and moving averages — all in pure SQL, no self-joins required.

On this page

Detailed Theory

The mental model

GROUP BY shrinks N rows to one per group. A window function keeps all N rows and attaches a per-row computation that "looks at" a window of related rows.

SELECT
  name,
  salary,
  department,
  AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees;

Each row keeps its own name and salary, plus the average for its department in a new column. No collapsing.

OVER (PARTITION BY … ORDER BY …)

function(args) OVER (
  PARTITION BY ...   -- split rows into windows
  ORDER BY ...       -- order within each window
  ROWS / RANGE ...   -- which rows in the window count (frame)
)

  • PARTITION BY is the windowed cousin of GROUP BY — splits rows into groups.
  • ORDER BY orders rows inside the window (mandatory for ranking / running totals).
  • Frame narrows further (e.g. "the 6 rows up to this one").

Ranking functions — top-N per group

FunctionWhat it returns
ROW_NUMBER()1, 2, 3, 4… (unique per row)
RANK()ties share rank; gaps after (1, 1, 3)
DENSE_RANK()ties share rank; no gaps (1, 1, 2)
NTILE(n)bucket each row into one of n equal-sized buckets
PERCENT_RANK()(rank-1)/(rows-1), 0..1

-- Top 3 highest-paid employees per department
SELECT *
FROM (
  SELECT name, department, salary,
         ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
  FROM   employees
) t
WHERE rn <= 3;

This is the single most-asked SQL interview pattern. Memorize it.

LAG / LEAD — peek at neighbouring rows

SELECT day, sales,
       LAG(sales)  OVER (ORDER BY day) AS prev_day,
       LEAD(sales) OVER (ORDER BY day) AS next_day,
       sales - LAG(sales) OVER (ORDER BY day) AS day_over_day
FROM daily_sales;

LAG(col, n, default) returns the value n rows earlier (1 by default), with a fallback if there isn't one. Perfect for diff/delta calculations.

Running totals & moving averages — frames

A frame restricts which rows of the window the aggregate looks at.

-- Running total of sales by day
SELECT day, sales,
       SUM(sales) OVER (ORDER BY day
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM daily_sales;

-- 7-day moving average
SELECT day, sales,
       AVG(sales) OVER (ORDER BY day
                        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS ma_7
FROM daily_sales;

Frame keywordMeans
UNBOUNDED PRECEDINGfrom the first row of the window
N PRECEDINGN rows before current
CURRENT ROWthis row
N FOLLOWINGN rows after current
UNBOUNDED FOLLOWINGthrough the last row

ROWS counts physical rows. RANGE counts logical value ranges (useful with INTERVAL for time series).

⚠️ Default frame: when you write ORDER BY but no frame, SQL uses RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW — which groups ties together. For a strict per-row running total, write the frame explicitly with ROWS.

FIRST_VALUE / LAST_VALUE / NTH_VALUE

SELECT name, salary, department,
       FIRST_VALUE(name) OVER (PARTITION BY department ORDER BY salary DESC) AS top_paid
FROM employees;

Useful for "show every row plus the best/worst in its group" without a self-join.

Window aggregates without ORDER BY

You can use any aggregate (SUM, COUNT, AVG, …) as a window function:

SELECT name, salary, department,
       AVG(salary)   OVER (PARTITION BY department) AS dept_avg,
       COUNT(*)      OVER (PARTITION BY department) AS dept_size,
       salary - AVG(salary) OVER (PARTITION BY department) AS gap_to_avg
FROM employees;

Without ORDER BY, the frame defaults to the entire partition, so each row gets the partition-level aggregate.

Named windows — DRY

When several functions share the same window:

SELECT name, salary,
       RANK()        OVER w AS rk,
       DENSE_RANK()  OVER w AS drk,
       PERCENT_RANK() OVER w AS pct
FROM   employees
WINDOW w AS (PARTITION BY department ORDER BY salary DESC);

Common patterns to remember

NeedWindow pattern
Top-N per groupROW_NUMBER() OVER (PARTITION BY g ORDER BY x DESC) then WHERE rn <= N
Running totalSUM(x) OVER (ORDER BY t ROWS UNBOUNDED PRECEDING)
Day-over-day diffx - LAG(x) OVER (ORDER BY t)
7-day moving avgAVG(x) OVER (ORDER BY t ROWS 6 PRECEDING)
Each row vs group avgx - AVG(x) OVER (PARTITION BY g)
Cumulative percentSUM(x) OVER (...) / SUM(x) OVER ()
Find duplicates by keyCOUNT(*) OVER (PARTITION BY key) then filter > 1
Median per groupPERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY x) OVER (PARTITION BY g)

Engine support

Window functions are standard SQL and supported by every major engine: PostgreSQL, MySQL 8+, MariaDB 10.2+, SQLite 3.25+, SQL Server, Oracle, BigQuery, Snowflake. MySQL versions < 8.0 don't have them — that alone is a reason to upgrade.

Performance note

Window functions sort once per window definition. They're typically much cheaper than correlated subqueries or self-joins doing the same job. The main risk is huge partitions without supporting indexes — EXPLAIN ANALYZE if you suspect a sort is the bottleneck.