Notifications

No notifications

/Phase 1

WHERE — Filtering Rows

WHERE is how you stop dragging entire tables back from the database. Combine comparisons, AND/OR, BETWEEN, IN, LIKE, and the special NULL handling — and you can filter for almost anything in a single line.

On this page

Detailed Theory

The mental model

SELECT  columns
FROM    table
WHERE   <boolean expression>;

For each row, the engine evaluates the expression. If it's TRUE, the row is kept. If FALSE or UNKNOWN (the special NULL state), it's dropped.

That last part is the famous SQL gotcha — covered below.

Comparison operators

OpMeaning
=equal
<> or !=not equal
< <= > >=ordering

SELECT * FROM users WHERE age = 30;
SELECT * FROM users WHERE age <> 30;
SELECT * FROM users WHERE age >= 18;

These work on numbers, strings (lexicographic), and dates.

AND / OR / NOT

Combine conditions:

SELECT * FROM users
WHERE  age >= 18
  AND  country = 'IN';

SELECT * FROM products WHERE category = 'book' OR category = 'magazine';

SELECT * FROM users WHERE NOT (country = 'US');

Precedence: NOT > AND > OR. When mixing, parenthesise to make intent obvious:

-- Probably wrong: AND binds tighter than OR
WHERE country = 'IN' OR country = 'US' AND age >= 18

-- What you usually meant: WHERE (country = 'IN' OR country = 'US') AND age >= 18

BETWEEN — inclusive range

SELECT * FROM orders
WHERE  total BETWEEN 100 AND 500;
-- same as: total >= 100 AND total <= 500

BETWEEN is inclusive on both ends. Works for dates too:

SELECT * FROM orders
WHERE  created_at BETWEEN '2025-01-01' AND '2025-12-31';

IN — membership in a list

SELECT * FROM users
WHERE  country IN ('IN', 'US', 'UK');

SELECT * FROM users WHERE status NOT IN ('banned', 'deleted');

Cleaner than chains of OR. The list can also be a subquery:

SELECT * FROM orders
WHERE  user_id IN (SELECT id FROM users WHERE country = 'IN');

LIKE — pattern matching

Two wildcards:

WildcardMatches
%any sequence of characters (incl. empty)
_exactly one character

SELECT * FROM users WHERE name LIKE 'A%';        -- starts with A
SELECT * FROM users WHERE email LIKE '%@gmail.com'; -- gmail addresses
SELECT * FROM users WHERE name LIKE '_at%';       -- 2nd & 3rd chars 'at'

LIKE is case-sensitive in Postgres but case-insensitive in MySQL by default — Postgres adds ILIKE for case-insensitive matching:

-- Postgres
SELECT * FROM users WHERE name ILIKE 'a%';

For full regex, Postgres has ~, MySQL has REGEXP.

NULL — the three-valued logic trap

NULL means unknown, not "empty" or "zero". The big gotcha:

SELECT 1 = NULL;      -- NULL  (not TRUE!)
SELECT NULL = NULL;   -- NULL  (still not TRUE!)
SELECT NULL <> NULL;  -- NULL

Because the result of any comparison with NULL is UNKNOWN, those rows are dropped by WHERE (which only keeps TRUE).

Use the special operators:

SELECT * FROM users WHERE phone IS NULL;
SELECT * FROM users WHERE phone IS NOT NULL;

For "either equal or both NULL", use IS DISTINCT FROM (Postgres) or IS NOT (SQLite):

-- Postgres
WHERE a IS DISTINCT FROM b   -- treats NULL = NULL as equal-ish

COALESCE is also handy when comparing nullable columns:

WHERE COALESCE(deleted, FALSE) = FALSE

NULL behaviour in IN / NOT IN

WHERE id IN (1, 2, NULL)        -- TRUE if id = 1 or 2; UNKNOWN otherwise
WHERE id NOT IN (1, 2, NULL)    -- ALWAYS UNKNOWN — drops every row 😱

That second pattern bites everyone once. Use NOT EXISTS or filter NULLs out of the list explicitly.

Filtering on computed expressions

WHERE runs before SELECT, so aliases declared in SELECT are not visible. Repeat the expression — or wrap in a subquery / CTE:

-- ❌ doesn't work in standard SQL
SELECT price * 1.18 AS gross FROM products WHERE gross > 100;

-- ✅ repeat SELECT price * 1.18 AS gross FROM products WHERE price * 1.18 > 100;

-- ✅ or wrap SELECT * FROM ( SELECT price * 1.18 AS gross FROM products ) t WHERE gross > 100;

Performance hints (taste of indexes)

WHERE is also where indexes pay off:

  • WHERE indexed_col = 'x' — uses the index.
  • WHERE LOWER(indexed_col) = 'x' — usually can't use the index (function on the column).
  • WHERE indexed_col + 1 = 5 — same problem.
  • WHERE indexed_col LIKE '%abc' — leading % defeats B-tree indexes.
Don't worry about this in Phase 1 — covered in detail in the Indexes topic.

Recap

SELECT id, name, email
FROM   users
WHERE  age      BETWEEN 18 AND 35
  AND  country  IN ('IN', 'US')
  AND  email    LIKE '%@gmail.com'
  AND  phone    IS NOT NULL
  AND  status   <> 'banned'
ORDER  BY name;

If you can read that, you're ready for sorting and pagination.