Notifications

No notifications

/Phase 4

SQL Fundamentals

SQL — Structured Query Language

SQL is the standard language for relational databases (PostgreSQL, MySQL, SQLite). Data lives in tables with rows and columns, linked by keys.

Table Structure

CREATE TABLE users (
  id        SERIAL PRIMARY KEY,
  name      VARCHAR(100) NOT NULL,
  email     VARCHAR(255) UNIQUE NOT NULL,
  age       INTEGER CHECK (age >= 0),
  role      VARCHAR(20) DEFAULT 'user',
  created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE posts ( id SERIAL PRIMARY KEY, title VARCHAR(200) NOT NULL, content TEXT, author_id INTEGER REFERENCES users(id) ON DELETE CASCADE, created_at TIMESTAMP DEFAULT NOW() );

CRUD Operations

OperationSQL
CreateINSERT INTO users (name, email) VALUES ('Alice', 'a@t.com')
ReadSELECT * FROM users WHERE role = 'admin'
UpdateUPDATE users SET name = 'Bob' WHERE id = 1
DeleteDELETE FROM users WHERE id = 1

Node.js with PostgreSQL (pg)

const { Pool } = require('pg');
const pool = new Pool({ connectionString: 'postgresql://...' });

const { rows } = await pool.query( 'SELECT * FROM users WHERE role = $1', ['admin'] // Parameterized query (prevents SQL injection!) );

On this page

Detailed Theory

SQL is the language you use to talk to a relational database. You describe *what* you want — "the names of users who signed up last week" — and the database figures out *how* to fetch it. After 50 years it is still the lingua franca of data, and learning it well will outlive any framework you pick up.

What SQL Actually Is

A SQL database stores data in tables (rows + columns with strict types) and connects them with foreign keys. Four operations cover most of your day:

SELECT  ...  -- read
INSERT  ...  -- create
UPDATE  ...  -- modify
DELETE  ...  -- remove

The rest of SQL is just refinements: filter rows, group them, join them, order them.

Tables, Columns, and Types

CREATE TABLE users (
  id          SERIAL       PRIMARY KEY,
  email       VARCHAR(255) NOT NULL UNIQUE,
  name        VARCHAR(100) NOT NULL,
  age         INT          CHECK (age >= 13),
  is_active   BOOLEAN      DEFAULT true,
  created_at  TIMESTAMPTZ  DEFAULT now()
);

Common types: INT, BIGINT, DECIMAL(10,2) (money!), TEXT, VARCHAR(n), BOOLEAN, DATE, TIMESTAMPTZ, JSONB, UUID. Pick the *narrowest* type that fits — it speeds up every query.

SELECT — Reading Data

SELECT id, name, email FROM users WHERE is_active = true ORDER BY created_at DESC LIMIT 20;

The execution order is not the writing order. The DB actually evaluates:

FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT

Knowing this stops 90% of "why does my alias not work in WHERE?" confusion.

Beginner Mistakes to Skip

1. SELECT * in production code. Wastes bandwidth, breaks when columns change. Always list the columns you need. 2. String-concatenating SQL. "WHERE id = " + id = SQL injection. Always use parameterised queries ($1, ?). 3. Forgetting WHERE on UPDATE / DELETE. DELETE FROM users; deletes the whole table. Use a transaction in dev to give yourself an undo. 4. Comparing with = NULL. NULL is unknown. Use IS NULL / IS NOT NULL. 5. Storing money as FLOAT. 0.1 + 0.2 != 0.3. Use DECIMAL(12,2) or store cents as BIGINT. 6. Storing dates as strings. Sorting and time-zone math go wrong. Use DATE / TIMESTAMPTZ.

Intermediate: WHERE, Patterns, and IN

SELECT * FROM users
WHERE age BETWEEN 18 AND 30
  AND email LIKE '%@gmail.com'
  AND country IN ('IN', 'US', 'UK')
  AND deleted_at IS NULL;

LIKE patterns: % = any chars, _ = one char. ILIKE is case-insensitive (Postgres).

Intermediate: JOINs in Plain English

JOINs let you combine tables that share a column.

TypeReturns
INNER JOINOnly rows that match in both tables
LEFT JOINAll rows from the left table, NULLs where no match
RIGHT JOINMirror of LEFT (rarely used)
FULL JOINEverything from both, NULLs where no match

SELECT u.name, p.title
FROM users u
LEFT JOIN posts p ON p.author_id = u.id
WHERE u.is_active
ORDER BY u.name;

Mental picture: a Venn diagram. INNER = intersection, LEFT = left circle.

Intermediate: GROUP BY & Aggregates

Aggregates squash many rows into one summary row per group.

SELECT
  u.id,
  u.name,
  COUNT(p.id)              AS post_count,
  COALESCE(AVG(p.views),0) AS avg_views
FROM users u
LEFT JOIN posts p ON p.author_id = u.id
GROUP BY u.id, u.name
HAVING COUNT(p.id) > 5
ORDER BY post_count DESC;

WHERE filters rows; HAVING filters groups. Every non-aggregated column must appear in GROUP BY.

Intermediate: Subqueries & CTEs

-- Subquery (inline)
SELECT * FROM posts
WHERE views > (SELECT AVG(views) FROM posts);

-- CTE (named, readable, the modern way) WITH active_users AS ( SELECT id, name FROM users WHERE is_active ) SELECT au.name, COUNT(p.id) AS posts FROM active_users au LEFT JOIN posts p ON p.author_id = au.id GROUP BY au.name;

Reach for CTEs (WITH) once a query gets nested — they read top-to-bottom.

Intermediate: Transactions in SQL

Same idea as the DB intro: all-or-nothing.

BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

Wrap every multi-statement business operation. Roll back on errors. From Node:

const client = await pool.connect();
try {
  await client.query('BEGIN');
  await client.query('UPDATE accounts SET balance = balance - $1 WHERE id = $2', [100, 1]);
  await client.query('UPDATE accounts SET balance = balance + $1 WHERE id = $2', [100, 2]);
  await client.query('COMMIT');
} catch (e) {
  await client.query('ROLLBACK');
  throw e;
} finally {
  client.release();
}

Advanced: Indexes & EXPLAIN

CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_posts_author_created ON posts(author_id, created_at DESC);
CREATE UNIQUE INDEX idx_users_email_uniq ON users(LOWER(email));

EXPLAIN ANALYZE ; shows the plan. "Seq Scan" on a big table = missing index. "Index Scan" = good. Compound index column order matters — leftmost prefix is what gets used.

Advanced: Window Functions

Like GROUP BY, but you keep every row and add a calculated column.

SELECT
  name,
  department,
  salary,
  RANK()       OVER (PARTITION BY department ORDER BY salary DESC) AS rank_in_dept,
  AVG(salary)  OVER (PARTITION BY department)                       AS dept_avg
FROM employees;

Use cases: "top N per group", running totals, moving averages, deduplication. Once you know window functions, you stop writing scary subqueries.

Advanced: Constraints & Data Integrity

ALTER TABLE posts
  ADD CONSTRAINT fk_author FOREIGN KEY (author_id) REFERENCES users(id) ON DELETE CASCADE,
  ADD CONSTRAINT chk_views CHECK (views >= 0);

Let the DB enforce truths. Foreign keys, UNIQUE, CHECK, NOT NULL — each one is a bug your application no longer has to remember.

Advanced: JSONB & Hybrid SQL/NoSQL

Postgres can store and *index* JSON:

ALTER TABLE users ADD COLUMN settings JSONB DEFAULT '{}'::jsonb;
CREATE INDEX idx_users_settings_gin ON users USING gin (settings);
SELECT * FROM users WHERE settings @> '{"theme":"dark"}';

Great escape hatch when you need flexible per-row data without leaving SQL.

Practice Path

1. Create users, posts, comments with proper PK/FK and constraints. 2. Insert 20 rows of fake data; write SELECTs with WHERE, ORDER BY, LIMIT. 3. Write a JOIN that returns each user's post count using LEFT JOIN + GROUP BY. 4. Wrap a 2-step money transfer in a transaction with rollback on failure, and run EXPLAIN ANALYZE on a heavy query before/after adding an index.