Last 30 Days
No notifications
GROUP BY is how you slice data into buckets and run an aggregate per bucket — sales per country, posts per user, errors per hour. HAVING is the filter that runs *after* aggregation. Together they turn a raw table into a report.
Picture the engine doing three steps:
1. WHERE filters individual rows (no aggregates allowed). 2. GROUP BY partitions surviving rows into buckets by the listed columns. 3. HAVING filters whole buckets using aggregates.
SELECT country, COUNT(*) AS users
FROM users
WHERE is_active = TRUE -- step 1: filter rows
GROUP BY country -- step 2: bucket
HAVING COUNT(*) >= 10 -- step 3: filter buckets
ORDER BY users DESC;Reads as: "Among active users, group by country, keep only countries with at least 10, sort by user count."
In a query with GROUP BY, every column in the SELECT list must either:
GROUP BY, or-- ❌ illegal: name is neither grouped nor aggregated
SELECT country, name, COUNT(*) FROM users GROUP BY country;-- ✅ legal
SELECT country, COUNT(*) FROM users GROUP BY country;
-- ✅ also legal — aggregate non-grouped columns
SELECT country, MAX(name) AS sample_name, COUNT(*) FROM users GROUP BY country;
(MySQL with ONLY_FULL_GROUP_BY=OFF would silently pick a "random" name. That mode is a footgun — leave ONLY_FULL_GROUP_BY on.)
SELECT country, plan, COUNT(*) AS users
FROM users
GROUP BY country, plan;You get one row per distinct combination of (country, plan). Order of columns in GROUP BY doesn't change the result.
You can group by computed values, not just columns:
SELECT DATE_TRUNC('day', created_at) AS day, COUNT(*) AS signups
FROM users
GROUP BY DATE_TRUNC('day', created_at)
ORDER BY day;-- Grouping by an alias works in MySQL/Postgres but not standard SQL.
-- Safer: repeat the expression, or wrap in a subquery.
| WHERE | HAVING | |
| Runs before aggregation? | ✅ yes | ❌ no |
| Can reference aggregates? | ❌ no | ✅ yes |
| Can reference plain columns? | ✅ yes | ✅ if grouped |
-- "Countries with average age > 30, only counting verified users"
SELECT country, AVG(age) AS avg_age
FROM users
WHERE is_verified = TRUE -- filter rows first
GROUP BY country
HAVING AVG(age) > 30; -- filter buckets afterPutting AVG(age) > 30 in WHERE is a syntax error. Putting is_verified = TRUE in HAVING is legal but throws away the chance to reduce rows early — bad for performance.
Rule of thumb: filter as early as possible. Use WHERE whenever it's possible; only use HAVING when you genuinely need an aggregate.
GROUP BY country puts every row whose country IS NULL into one bucket together — NULL = NULL holds only for grouping purposes. Useful but easy to forget when reading reports.
When a single query needs both detail and totals, use grouping extensions instead of UNION:
-- ROLLUP: subtotals along a hierarchy
SELECT country, plan, COUNT(*) AS users
FROM users
GROUP BY ROLLUP (country, plan);
-- → per (country, plan), per country (plan = NULL), grand total (both NULL)-- CUBE: every combination of grouping columns
SELECT country, plan, COUNT(*) AS users
FROM users
GROUP BY CUBE (country, plan);
-- GROUPING SETS: arbitrary list
SELECT country, plan, COUNT(*) AS users
FROM users
GROUP BY GROUPING SETS ((country, plan), (country), ());
The synthetic "totals" rows have NULL in the rolled-up columns. Use the GROUPING(col) function (Postgres / SQL Server / Oracle) to tell a real NULL apart from a "this row is a subtotal" NULL.
For a single column with no aggregates, both produce the same result:
SELECT DISTINCT country FROM users;
SELECT country FROM users GROUP BY country;Use DISTINCT for "give me unique rows" and GROUP BY whenever an aggregate is involved.
1. Selecting an ungrouped, non-aggregated column → error in standard SQL.
2. Filtering an aggregate in WHERE → error. Move it to HAVING.
3. Forgetting that NULLs group together.
4. Grouping by an alias — works in MySQL/Postgres only. Repeat the expression for portability.
5. Counting wrong after a LEFT JOIN — COUNT(*) counts join rows (including the NULL "no match" row); COUNT(other_table.id) only counts real matches.
-- ✅ Posts per user — including users with zero posts
SELECT u.name, COUNT(p.id) AS posts
FROM users u
LEFT JOIN posts p ON p.user_id = u.id
GROUP BY u.id, u.name;COUNT(p.id) correctly returns 0 for users with no posts; COUNT(*) would return 1.