Last 30 Days
No notifications
Aggregates collapse many rows into one summary value — COUNT, SUM, AVG, MIN, MAX and friends. They power every metric, KPI and report. Once you understand how aggregates handle NULLs and how DISTINCT and FILTER modify them, you've unlocked the analytical half of SQL.
| Function | What it returns |
COUNT(*) | Number of rows (NULL or not) |
COUNT(col) | Number of rows where col is NOT NULL |
COUNT(DISTINCT col) | Number of distinct non-NULL values |
SUM(col) | Sum of non-NULL values |
AVG(col) | Mean of non-NULL values |
MIN(col) / MAX(col) | Smallest / largest non-NULL value |
SELECT COUNT(*) AS total_users,
COUNT(phone) AS users_with_phone,
COUNT(DISTINCT country) AS countries,
AVG(age) AS avg_age,
MIN(age) AS youngest,
MAX(age) AS oldest,
SUM(age) AS total_age_years
FROM users;A query without GROUP BY returns one row that summarises the whole table.
Aggregates ignore NULL values (except COUNT(*)):
ages: 10, 20, NULL, 30| Expression | Value |
COUNT(*) | 4 |
COUNT(age) | 3 |
SUM(age) | 60 |
AVG(age) | 20 (60 / 3, not 4!) |
That AVG denominator is the most common surprise. If you want NULLs to count as 0:
AVG(COALESCE(age, 0))If the table is empty, all aggregates return NULL — except COUNT, which returns 0.
COUNT(*) -- count all rows
COUNT(1) -- same thing; the 1 is a non-NULL constant
COUNT(col) -- count only rows where col IS NOT NULLIn every modern engine, COUNT(*) and COUNT(1) are identical in performance. Use COUNT(*) for clarity.
SELECT COUNT(DISTINCT country) FROM users; -- # of unique countries
SELECT SUM(DISTINCT amount) FROM payments; -- sum of unique amounts (rare)COUNT(DISTINCT) is the workhorse — "how many unique X are there?" — and the basis of cohort analysis.
Standard SQL (Postgres + SQLite):
SELECT
COUNT(*) AS total,
COUNT(*) FILTER (WHERE status = 'active') AS active,
COUNT(*) FILTER (WHERE status = 'banned') AS banned,
AVG(age) FILTER (WHERE country = 'IN') AS avg_age_in
FROM users;In MySQL / older engines you fall back to SUM(CASE WHEN …):
SELECT
COUNT(*) AS total,
SUM(CASE WHEN status = 'active' THEN 1 ELSE 0 END) AS active,
AVG(CASE WHEN country = 'IN' THEN age END) AS avg_age_in
FROM users;This pattern — many counters in one query — is wildly more efficient than running multiple SELECT COUNTs.
| Function | Returns | ||||
BOOL_OR(cond) / BOOL_AND(cond) (Postgres) | TRUE if any/all rows match | ||||
MAX(CASE WHEN cond THEN 1 ELSE 0 END) = 1 | portable equivalent | String / array aggregates | Function | Engine | What it does |
STRING_AGG(col, ',') | Postgres | Concatenate with separator | |||
GROUP_CONCAT(col SEPARATOR ',') | MySQL | Same idea | |||
ARRAY_AGG(col) | Postgres | Collect into an array | |||
JSON_AGG(col) | Postgres | Collect into a JSON array |
SELECT user_id,
STRING_AGG(tag, ',' ORDER BY tag) AS tags
FROM user_tags
GROUP BY user_id;SELECT
STDDEV(age) AS std_dev,
VAR_SAMP(age) AS variance,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY age) AS median
FROM users;Postgres has the richest set; most engines have at least STDDEV / VARIANCE.
Whenever a query has any aggregate, all non-aggregated columns in the SELECT list must either appear in GROUP BY or themselves be aggregated. This errors:
-- ❌ name is neither grouped nor aggregated
SELECT name, AVG(age) FROM users;You either group:
SELECT country, AVG(age) FROM users GROUP BY country;Or pick one with an aggregate:
SELECT MAX(name) AS sample_name, AVG(age) FROM users;(MySQL allowed the broken form for years with ONLY_FULL_GROUP_BY off — it's a footgun, leave it on.)
KPI dashboard — one row, many metrics
SELECT
COUNT(*) AS users_total,
COUNT(*) FILTER (WHERE created_at >= CURRENT_DATE - INTERVAL '7 days') AS new_7d,
COUNT(*) FILTER (WHERE last_login >= CURRENT_DATE - INTERVAL '30 days') AS active_30d,
AVG(age) FILTER (WHERE age IS NOT NULL) AS avg_age
FROM users;Convert "is this list non-empty?" without LIMIT 1
SELECT EXISTS (SELECT 1 FROM orders WHERE user_id = 42);(EXISTS short-circuits — faster than COUNT(*) > 0 on large tables.)
COUNT(*) counts rows; COUNT(col) skips NULLs.AVG denominator can surprise you.FILTER (WHERE …) (or SUM(CASE WHEN…)) to compute many conditional totals in one pass.COUNT(DISTINCT col) is the cohort/uniqueness workhorse.GROUP BY (next topic) to slice metrics by category.