Notifications

No notifications

/Phase 4

Database Concepts

Databases β€” Storing Data Persistently

A database is an organized collection of data that can be efficiently accessed, managed, and updated.

SQL vs NoSQL

FeatureSQL (Relational)NoSQL (Non-Relational)
StructureTables with rows/columnsDocuments, key-value, graphs
SchemaFixed schema (strict)Flexible schema (dynamic)
RelationshipsJOINs, foreign keysEmbedded docs, references
ScalingVertical (bigger server)Horizontal (more servers)
ACIDβœ… Full supportVaries by implementation
ExamplesPostgreSQL, MySQL, SQLiteMongoDB, Redis, DynamoDB

When to Use What?

Use SQL WhenUse NoSQL When
Complex relationshipsFlexible/evolving schema
Financial data (ACID needed)High write throughput
Structured dataHierarchical/nested data
Complex queries with JOINsRapid prototyping
Strong consistency neededHorizontal scaling needed

ACID Properties

PropertyMeaningExample
AtomicityAll or nothingBank transfer: both debit & credit succeed or neither
ConsistencyData always validBalance never goes negative
IsolationTransactions don't interfereTwo transfers run independently
DurabilityCommitted = permanentServer crash doesn't lose data

On this page

Detailed Theory

A database is just a program whose entire job is to store data safely and let many people query it at once β€” fast, consistent, and without losing anything when the power goes out. Files in a folder cannot do that. Once your app has more than one user or more than one process, you need a database.

What a Database Actually Is

Three promises a real database makes that a JSON file does not:

1. Durability β€” once it says "saved", the data survives a crash. 2. Concurrency β€” 100 users editing at the same time will not corrupt each other's writes. 3. Querying β€” ask "give me users in India who signed up last week" and get an answer in milliseconds, not minutes.

Everything else (joins, indexes, transactions, replication) is built on top of these three.

The Four Families You Will Meet

FamilyExamplesMental modelBest for
Relational (SQL)PostgreSQL, MySQL, SQLiteSpreadsheets with strict columns + foreign keysAnything with relationships, finance, most apps
DocumentMongoDB, FirestoreFolders of JSON filesFlexible shapes, nested data, fast iteration
Key–valueRedis, DynamoDBA giant MapCaching, sessions, rate limits, queues
GraphNeo4j, MemgraphNodes + edgesSocial graphs, recommendations, fraud rings

90% of new web apps live happily on PostgreSQL + Redis. Everything else is a special tool for a special problem.

SQL vs NoSQL in One Honest Paragraph

SQL is not "old" and NoSQL is not "web scale". SQL gives you schemas + joins + transactions β€” invaluable when data is interconnected. NoSQL (document) gives you flexible shapes + horizontal scale β€” invaluable when records are self-contained and you do not know your schema yet. Both exist in 2026 because they solve different problems. Most production systems use both.

Beginner Mistakes to Skip

1. "NoSQL is faster." Only for the queries it was designed for. A bad schema in MongoDB is slower than a good one in PostgreSQL. 2. Storing files (PDFs, images, videos) in the database. Use object storage (S3, R2, Cloudinary). Keep only the URL in the DB. 3. One database per microservice from day one. Premature. Start with one DB; split when scaling forces it. 4. Hardcoding the connection string. Always read from process.env, never commit credentials. 5. Closing the connection after every query. Connections are expensive. Use a pool. 6. No backups. "It is in the cloud" is not a backup. Automated daily snapshots + monthly restore drills.

Intermediate: Schemas β€” Normalised vs Denormalised

Normalised (SQL default) β€” each fact lives in exactly one place. No duplication. Slightly slower reads (joins), much safer writes.

users(id, name, email)
posts(id, title, user_id)
comments(id, text, post_id, user_id)

Denormalised (document default) β€” store the data the way you read it. Fewer queries, but updates may need to touch many copies.

{
  "_id": "u1",
  "name": "Alice",
  "posts": [
    { "title": "Hi", "comments": [{ "text": "Cool", "author": "Bob" }] }
  ]
}

Rule of thumb: embed when read together, reference when shared or unbounded.

Intermediate: Primary Keys & Foreign Keys

  • Primary key = unique row id (id SERIAL in SQL, _id ObjectId in Mongo, UUID anywhere).
  • Foreign key = a column whose value must exist in another table's primary key. The DB enforces this for you ("you cannot insert a comment for a post that does not exist").
UUIDs vs auto-incrementing ints: UUIDs are global, safe to expose, friendly to merging databases. Integers are smaller and faster to index. Many teams use both β€” int internally, UUID externally.

Intermediate: Indexes β€” Why Queries Get Slow

Without an index the DB scans every row (O(n)). With an index it walks a tree (O(log n)). The cost: extra storage and slightly slower writes (the tree must be updated).

CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_posts_author_created ON posts(author_id, created_at DESC);

Rule of thumb: index columns used in WHERE, JOIN, ORDER BY. Don't index every column β€” each one slows writes.

Intermediate: Connection Pooling

Opening a TCP connection + auth takes ~30–100ms. Doing that per request kills your throughput. A pool keeps N connections open and hands them out:

const pool = new Pool({ max: 20, idleTimeoutMillis: 30_000 });
const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);

Size the pool to roughly (CPUs * 2) + 1 per app instance. Way more than that hurts the DB without helping you.

Intermediate: Transactions & ACID

A transaction = a group of statements that succeed or fail together. The classic example: transfer money.

BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;  -- or ROLLBACK if anything failed

ACID = Atomic, Consistent, Isolated, Durable. You get this for free in SQL. NoSQL document stores have it too now (MongoDB since 4.0), but the cost / scope differs.

Advanced: CAP, BASE & The Trade-offs

When you scale to multiple machines, you can pick at most two of: Consistency, Availability, Partition tolerance. Networks fail, so P is mandatory β€” the real choice is CP (refuse writes during a split, e.g. MongoDB primary) or AP (accept writes everywhere, reconcile later, e.g. Cassandra).

NoSQL often uses BASE instead of ACID: Basically Available, Soft state, Eventually consistent.

Advanced: Replication & Sharding

  • Replication = copies of the same data on multiple machines. One primary for writes, replicas for reads. Survives a node failure.
  • Sharding = splitting the data itself across machines ("users 1–100 here, 101–200 there"). Lets you scale beyond one server's RAM.
Replication = high availability + read scaling. Sharding = write scaling. Most apps need replication long before sharding.

Advanced: Migrations & Schema Evolution

Production schemas change. Never edit them manually. Use migration tools (Prisma Migrate, Knex, Flyway, Liquibase) so every environment ends up identical and changes are reviewable in PRs.

Golden rule: expand, then contract. Add the new column β†’ dual-write β†’ backfill β†’ switch reads β†’ drop the old column. Never break the old code in one step.

Advanced: Picking Between Postgres and Mongo (2026 Take)

  • Pick Postgres when relationships matter, you want strong types, or you might need analytics queries later. JSONB even gives you document-store flexibility when you need it.
  • Pick MongoDB when documents are naturally self-contained, your shape changes weekly, or you want to start dead simple.
  • Add Redis to either one, almost always, for caching and sessions.
There is no universal winner β€” the answer is your access patterns.

Practice Path

1. Spin up Postgres in Docker, create a users and posts table with a foreign key, insert a few rows. 2. Connect from Node with pg using a pool, run a parameterised query. 3. Add an index on posts.created_at, compare EXPLAIN ANALYZE before and after. 4. Wrap a two-statement update in BEGIN/COMMIT and intentionally throw to see the rollback.