Notifications

No notifications

/Phase 3

DML — INSERT, UPDATE, DELETE, UPSERT

DML changes the rows inside your tables. Beyond the basic three (INSERT / UPDATE / DELETE), every modern engine supports RETURNING, multi-row inserts, INSERT … SELECT, joined updates and conflict-handling upserts. Get these solid and 95% of backend SQL is just composition.

On this page

Detailed Theory

INSERT — single row

INSERT INTO users (email, name, country)
VALUES ('ada@example.com', 'Ada', 'GB');

You list the target columns in parens, then VALUES matches positionally. Omitted columns fall back to their DEFAULT (or NULL).

You can skip the column list (INSERT INTO users VALUES (...)) but it's brittle — any schema change breaks every such call. Always list columns.

INSERT — many rows in one statement

INSERT INTO users (email, name) VALUES
  ('a@x.com', 'Ada'),
  ('b@x.com', 'Bob'),
  ('c@x.com', 'Cara');

One statement = one round-trip = much faster than three separate inserts. Batch sizes of a few hundred to a few thousand are sweet spots.

INSERT … SELECT — copy / transform from another table

INSERT INTO archived_users (id, email, archived_at)
SELECT id, email, now()
FROM   users
WHERE  is_active = FALSE;

The classic ETL move: shape rows with a SELECT, drop them straight into the destination. No application code in between.

RETURNING — get the inserted row back (Postgres / SQLite / SQL Server OUTPUT)

INSERT INTO users (email, name)
VALUES ('ada@x.com', 'Ada')
RETURNING id, created_at;

Indispensable: get the new auto-generated ID without a second SELECT. Works on UPDATE and DELETE too.

UPDATE — change rows

UPDATE users
SET    name = 'Ada Lovelace',
       age  = age + 1
WHERE  id = 42;

Always have a WHERE clause unless you really mean "every row". Many production outages start with a missing WHERE.

Update many columns from one source value:

UPDATE users
SET (name, age, country) = ('Ada', 36, 'GB')
WHERE id = 42;

UPDATE … FROM — joined updates (Postgres syntax)

Often you want to update one table using values from another:

-- Postgres
UPDATE posts p
SET    author_name = u.name
FROM   users u
WHERE  p.user_id = u.id;

-- MySQL syntax
UPDATE posts p
JOIN   users u ON u.id = p.user_id
SET    p.author_name = u.name;

-- Standard SQL alternative — correlated subquery
UPDATE posts p
SET    author_name = (SELECT u.name FROM users u WHERE u.id = p.user_id);

DELETE — remove rows

DELETE FROM users WHERE is_active = FALSE;
DELETE FROM users;          -- ⚠️ wipes everything (use TRUNCATE for that)

DELETE … USING (Postgres) deletes based on another table:

DELETE FROM posts p
USING  users u
WHERE  p.user_id = u.id AND u.is_banned;

Pair DELETE with RETURNING to log what you removed:

DELETE FROM sessions
WHERE  expires_at < now()
RETURNING id;

Safe-DELETE / safe-UPDATE patterns

1. Always SELECT first. Run the same WHERE in a SELECT to preview rows. 2. Wrap in a transaction. BEGIN;SELECTCOMMIT; (or ROLLBACK;). 3. Limit the blast radius with LIMIT (MySQL allows it; in Postgres use WHERE id IN (SELECT id FROM ... LIMIT n)). 4. Soft delete for important data — set deleted_at instead of removing rows.

-- Soft delete pattern
UPDATE users SET deleted_at = now() WHERE id = 42;
SELECT * FROM users WHERE deleted_at IS NULL;   -- "live" rows

UPSERT — insert if new, update if exists

The single most useful "extra" DML statement. Different syntax per engine.

Postgres / SQLite — ON CONFLICT:

INSERT INTO users (email, name, login_count)
VALUES ('ada@x.com', 'Ada', 1)
ON CONFLICT (email) DO UPDATE
  SET name        = EXCLUDED.name,
      login_count = users.login_count + 1;

EXCLUDED refers to the row that would have been inserted. Reference users.col for the existing value.

-- Insert OR ignore (do nothing on conflict)
INSERT INTO users (email, name) VALUES ('ada@x.com', 'Ada')
ON CONFLICT (email) DO NOTHING;

MySQL — ON DUPLICATE KEY UPDATE:

INSERT INTO users (email, name, login_count) VALUES ('ada@x.com', 'Ada', 1)
ON DUPLICATE KEY UPDATE
  name        = VALUES(name),
  login_count = login_count + 1;

Standard SQL — MERGE:

MERGE INTO users AS u
USING (SELECT 'ada@x.com' AS email, 'Ada' AS name) AS s
ON    u.email = s.email
WHEN MATCHED     THEN UPDATE SET name = s.name
WHEN NOT MATCHED THEN INSERT (email, name) VALUES (s.email, s.name);

UPSERT replaces the read-then-write race condition with a single atomic statement — critical for concurrent inserts (e.g. event ingestion, idempotent webhooks).

Best-practice checklist

  • ✅ Always list columns in INSERT.
  • ✅ Always include WHERE in UPDATE / DELETE.
  • ✅ Wrap multi-step DML in a transaction.
  • ✅ Use RETURNING to fetch generated values in one round-trip.
  • ✅ Reach for UPSERT instead of "SELECT then INSERT or UPDATE".
  • ❌ Avoid DELETE FROM big_table without WHERE — use TRUNCATE.
  • ❌ Don't loop one INSERT per row from app code — batch.