MongoDB & Mongoose

MongoDB — Document Database

MongoDB stores data as JSON-like documents in collections. No fixed schema — each document can have different fields.

Core Concepts

SQL Term       MongoDB Term
Database       Database
Table          Collection
Row            Document
Column         Field
JOIN           Populate / Lookup
Primary Key    _id (auto-generated ObjectId)

Document Example

{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "name": "Alice",
  "email": "alice@test.com",
  "age": 28,
  "tags": ["developer", "gamer"],
  "address": {
    "city": "New York",
    "zip": "10001"
  },
  "createdAt": ISODate("2024-01-15T10:30:00Z")
}

Mongoose — ODM for MongoDB

Mongoose provides schema validation, middleware, and query helpers on top of MongoDB.

const mongoose = require('mongoose');

// Define a schema
const userSchema = new mongoose.Schema({
  name: { type: String, required: true },
  email: { type: String, unique: true },
  age: { type: Number, min: 0 },
});

// Create model
const User = mongoose.model('User', userSchema);

Detailed Theory

MongoDB stores documents — JSON-like objects — inside collections that live in a database. There are no rows, no columns, no migrations to add a new field. You write JavaScript-shaped data and read JavaScript-shaped data back. That is its whole appeal.

What MongoDB Actually Is

The mental ladder from SQL to MongoDB:

SQL            MongoDB
Database       Database
Table          Collection
Row            Document (BSON object)
Column         Field
Primary key    _id (auto-generated ObjectId)
JOIN           $lookup / populate
Schema         *Optional*, enforced by your app, not the DB

// One document
{
  _id: ObjectId('66...'),
  name: 'Alice',
  email: 'a@t.com',
  tags: ['admin', 'beta'],
  address: { city: 'Pune', country: 'IN' },
  createdAt: ISODate('2026-04-01')
}

Documents can hold arrays and nested objects — something SQL needs extra tables for.

The Driver vs Mongoose

Two ways to talk to MongoDB from Node:

  • Native driver (mongodb) — thin wrapper around the protocol. No schema, no helpers. Best when you want full control.
  • Mongoose (mongoose) — ODM (Object-Document Mapper). Adds schemas, validation, hooks, virtuals, populate. The default for most Node apps.
This topic uses Mongoose because it is what most Node teams reach for.

CRUD in Mongoose, the Short Version

// Create
await User.create({ name: 'Alice', email: 'a@t.com' });

// Read
const alice = await User.findOne({ email: 'a@t.com' });
const admins = await User.find({ role: 'admin' }).limit(20).sort('-createdAt');

// Update
await User.findByIdAndUpdate(id, { name: 'New' }, { new: true, runValidators: true });

// Delete
await User.findByIdAndDelete(id);

The one-liner you must remember: pass { new: true } to findOneAndUpdate / findByIdAndUpdate, otherwise you get the *old* document back. (updateOne returns a result summary, not a document.) Endless beginner pain.
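
The return-value behavior is easier to see in a toy model. The sketch below is not Mongoose: findByIdAndUpdateSim and the users map are hypothetical in-memory stand-ins that only mimic which version of the document comes back.

```javascript
// In-memory stand-in for a collection. NOT Mongoose, just the return semantics.
const users = new Map([['u1', { _id: 'u1', name: 'Old' }]]);

// Hypothetical helper that mimics the return value of findByIdAndUpdate.
function findByIdAndUpdateSim(id, update, options = {}) {
  const before = users.get(id);
  if (!before) return null;
  const after = { ...before, ...update };
  users.set(id, after);
  // Default: hand back the document as it was BEFORE the update.
  return options.new ? after : before;
}

const stale = findByIdAndUpdateSim('u1', { name: 'New' });
const fresh = findByIdAndUpdateSim('u1', { name: 'Newer' }, { new: true });

console.log(stale.name); // the pre-update value
console.log(fresh.name); // the post-update value
```

In both calls the stored document changes; only the returned copy differs.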

Beginner Mistakes to Skip

1. No schema at all: "flexible" turns into garbage data within a month. Use Mongoose schemas with validators.
2. Forgetting new: true on findOneAndUpdate / findByIdAndUpdate and wondering why your response is stale.
3. Querying with the wrong type: with the native driver, or inside an aggregate() pipeline, { _id: '66...' } matches nothing because the stored type is ObjectId, not string. Mongoose casts strings for you in find queries but not in aggregation stages. Use new mongoose.Types.ObjectId(id) or rely on findById.
4. Loading the whole document to update one field. Use updateOne with $set instead of load → mutate → save.
5. No indexes: a 100k-doc collection without an index on email will collection-scan on every login.
6. Storing JWT tokens, files, or huge logs as documents. Mongo has a 16MB document limit. Keep docs small.

Intermediate: Schemas, Types & Validators

const postSchema = new mongoose.Schema({
  title:   { type: String, required: true, trim: true, maxlength: 200 },
  slug:    { type: String, unique: true, lowercase: true, index: true },
  status:  { type: String, enum: ['draft', 'published', 'archived'], default: 'draft' },
  views:   { type: Number, default: 0, min: 0 },
  tags:    [String],
  author:  { type: mongoose.Schema.Types.ObjectId, ref: 'User', required: true },
  meta:    { readTime: Number, wordCount: Number },
}, { timestamps: true }); // adds createdAt + updatedAt

timestamps: true and ref: 'User' are the two flags that come up everywhere.

Intermediate: Embed vs Reference (the Big Decision)

  • Embed when the child data is owned by the parent, small, and read together with it. Order → line items.
  • Reference when the child data is shared, large, or queried independently. Post → comments (could be thousands).
A practical heuristic: if you would never query the child without the parent, embed.
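
The two shapes can be sketched with plain objects. The field names and values below are illustrative assumptions, not a prescribed schema; the point is only where the child data lives.

```javascript
// Embedded: line items live inside the order and are always read with it.
const order = {
  _id: 'order1',
  customer: 'Alice',
  items: [
    { sku: 'A-1', qty: 2, price: 500 }, // prices in minor units (e.g. cents)
    { sku: 'B-7', qty: 1, price: 1250 },
  ],
};

// Referenced: comments are separate documents that point back at the post.
const post = { _id: 'post1', title: 'Hello' };
const comments = [
  { _id: 'c1', post: 'post1', text: 'First!' },
  { _id: 'c2', post: 'post1', text: 'Nice write-up' },
  { _id: 'c3', post: 'post2', text: 'Unrelated thread' },
];

// One read of the order gives everything needed to total it.
const orderTotal = order.items.reduce((sum, item) => sum + item.qty * item.price, 0);

// Comments need their own lookup, but can be paged or counted independently.
const commentsForPost = comments.filter((c) => c.post === post._id);

console.log(orderTotal, commentsForPost.length);
```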

Intermediate: Populate (Mongo's Soft JOIN)

const post = await Post.findById(id)
  .populate('author', 'name avatar')
  .populate({ path: 'comments', options: { limit: 20 } });

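Under the hood, populate issues a separate query per populated path (batched with $in) and stitches the results together in JavaScript. It is convenient, but calling it document by document in a loop turns into N+1 queries; use $lookup (aggregation) for high-volume joins.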
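
To make the query-count difference concrete, here is a toy in-memory model. Nothing in it is Mongoose API: findUsersByIds, populateOneByOne, and populateBatched are hypothetical helpers, and queryCount simply counts simulated round trips for author lookups (the initial posts query is not counted).

```javascript
// In-memory stand-ins; queryCount tracks simulated round trips to the database.
const usersById = new Map([
  ['u1', { _id: 'u1', name: 'Alice' }],
  ['u2', { _id: 'u2', name: 'Bob' }],
]);
const posts = [
  { title: 'A', author: 'u1' },
  { title: 'B', author: 'u2' },
  { title: 'C', author: 'u1' },
];
let queryCount = 0;

// Simulates a find with { _id: { $in: ids } }: one round trip for many ids.
function findUsersByIds(ids) {
  queryCount += 1;
  return ids.map((id) => usersById.get(id)).filter(Boolean);
}

// N+1 shape: one author lookup per post.
function populateOneByOne(docs) {
  return docs.map((doc) => ({ ...doc, author: findUsersByIds([doc.author])[0] }));
}

// Batched shape: collect distinct ids, fetch once, stitch in JavaScript.
function populateBatched(docs) {
  const ids = [...new Set(docs.map((doc) => doc.author))];
  const byId = new Map(findUsersByIds(ids).map((user) => [user._id, user]));
  return docs.map((doc) => ({ ...doc, author: byId.get(doc.author) }));
}

queryCount = 0;
const slow = populateOneByOne(posts);
const slowQueries = queryCount;

queryCount = 0;
const fast = populateBatched(posts);
const fastQueries = queryCount;

console.log({ slowQueries, fastQueries });
```

Both functions return the same stitched documents; the one-by-one version just pays one round trip per post.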

Intermediate: Indexes (Required, Not Optional)

userSchema.index({ email: 1 }, { unique: true });
postSchema.index({ author: 1, createdAt: -1 });   // compound
postSchema.index({ title: 'text', content: 'text' }); // full-text

Rules: index the fields your frequent queries use in find, sort, or $match, and remember that every index adds write cost and memory. Compound indexes follow the ESR rule: *Equality, Sort, Range*.
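
As a rough illustration of ESR ordering, the sketch below builds a compound index spec from a query's shape. buildEsrIndex is a hypothetical helper written for this example, not part of Mongoose or MongoDB; it relies on JavaScript objects preserving insertion order for string keys.

```javascript
// Hypothetical helper: order index keys as Equality -> Sort -> Range.
// `equality` and `range` are arrays of field names; `sort` maps field -> 1 | -1.
function buildEsrIndex({ equality = [], sort = {}, range = [] }) {
  const spec = {};
  for (const field of equality) spec[field] = 1;
  for (const [field, direction] of Object.entries(sort)) spec[field] = direction;
  for (const field of range) {
    if (!(field in spec)) spec[field] = 1; // a field already placed keeps its slot
  }
  return spec;
}

// Query shape: find({ author, views: { $gte: 100 } }).sort({ createdAt: -1 })
const indexSpec = buildEsrIndex({
  equality: ['author'],
  sort: { createdAt: -1 },
  range: ['views'],
});

console.log(indexSpec); // key order matters in a compound index
```

The resulting object could then be passed to a schema's index() call, for example postSchema.index(indexSpec).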

Intermediate: Update Operators

Avoid "load and save". Use atomic operators:

await User.updateOne({ _id }, { $set:  { lastSeen: new Date() } });
await Post.updateOne({ _id }, { $inc:  { views: 1 } });
await Post.updateOne({ _id }, { $push: { tags: 'node' } });
await Post.updateOne({ _id }, { $addToSet: { tags: 'node' } }); // no dupes
await Post.updateOne({ _id }, { $pull: { tags: 'old' } });

These run as a single DB operation — no race conditions.
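
The race that atomic operators avoid can be shown with a small in-memory simulation. This is not database code: stored stands in for one document, and the interleaving of two clients is written out by hand.

```javascript
// In-memory stand-in for one stored document.
const stored = { views: 0 };

// Load-and-save: each client reads, computes locally, then writes the whole value.
function loadViews() {
  return stored.views;
}
function saveViews(value) {
  stored.views = value;
}

// Two clients interleave: both read before either writes.
const seenByA = loadViews();
const seenByB = loadViews();
saveViews(seenByA + 1);
saveViews(seenByB + 1); // overwrites A's write, so one increment is lost
const afterLoadAndSave = stored.views;

// Atomic increment: the store applies the delta itself, in the spirit of $inc.
function incViews(delta) {
  stored.views += delta;
}

stored.views = 0;
incViews(1);
incViews(1);
const afterAtomic = stored.views;

console.log({ afterLoadAndSave, afterAtomic });
```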

Intermediate: Hooks (Schema Middleware)

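const bcrypt = require('bcrypt');

userSchema.pre('save', async function () {
  this.wasNew = this.isNew; // isNew is already false in post('save'), so remember it here
  if (!this.isModified('password')) return;
  this.password = await bcrypt.hash(this.password, 12);
});

userSchema.post('save', function (doc) {
  if (doc.wasNew) sendWelcomeEmail(doc.email); // sendWelcomeEmail is your app's own helper
});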

Great for hashing, slug generation, audit logs. Watch out: save hooks do not run for updateOne, findByIdAndUpdate etc. unless you register query middleware for those operations (for example pre('findOneAndUpdate')).

Advanced: Aggregation Pipelines

Think of it as a Unix pipeline for documents — each stage transforms the stream.

const stats = await Order.aggregate([
  { $match:  { createdAt: { $gte: lastMonth } } },
  { $group:  { _id: '$status', count: { $sum: 1 }, revenue: { $sum: '$total' } } },
  { $sort:   { revenue: -1 } },
  { $project:{ status: '$_id', count: 1, revenue: { $round: ['$revenue', 2] }, _id: 0 } },
]);

Key stages: $match, $group, $project, $sort, $limit, $lookup, $unwind, $facet. Put $match first so indexes get used.
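
For intuition, the same match, group, and sort steps can be written with plain array operations. The sample orders below are made up for illustration, and this is only an analogy: a real pipeline runs inside the database, not in Node.

```javascript
// Sample data standing in for an orders collection (values are invented).
const orders = [
  { status: 'paid', total: 40, createdAt: new Date('2024-03-10') },
  { status: 'paid', total: 60, createdAt: new Date('2024-03-12') },
  { status: 'refunded', total: 25, createdAt: new Date('2024-03-11') },
  { status: 'paid', total: 99, createdAt: new Date('2024-01-01') }, // too old
];
const since = new Date('2024-03-01');

// $match: keep only recent orders.
const matched = orders.filter((o) => o.createdAt >= since);

// $group: one bucket per status, with a count and a revenue sum.
const groups = new Map();
for (const o of matched) {
  const g = groups.get(o.status) ?? { status: o.status, count: 0, revenue: 0 };
  g.count += 1;
  g.revenue += o.total;
  groups.set(o.status, g);
}

// $sort: highest revenue first. ($project is implicit in the bucket shape.)
const stats = [...groups.values()].sort((a, b) => b.revenue - a.revenue);

console.log(stats);
```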

Advanced: Transactions

Transactions require a replica set or sharded cluster (Atlas clusters qualify); a standalone server does not support them. Use them for cross-document atomic changes:

const session = await mongoose.startSession();
try {
  await session.withTransaction(async () => {
    await Order.create([orderDoc], { session });
    await Inventory.updateOne({ _id: itemId }, { $inc: { stock: -1 } }, { session });
  });
} finally {
  session.endSession();
}

Do not wrap a single-document change in a transaction — single-doc writes are already atomic.

Advanced: Pagination at Scale

skip + limit works for small data; on huge collections it scans everything skipped. Switch to cursor pagination:

const items = await Post.find(cursor ? { _id: { $gt: cursor } } : {})
  .sort({ _id: 1 })
  .limit(21);
const hasMore = items.length > 20;
if (hasMore) items.pop();
return { items, nextCursor: hasMore ? items.at(-1)._id : null };
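
A client typically calls such an endpoint in a loop, feeding each nextCursor back in. The sketch below replays that loop against an in-memory array; fetchPage and allPosts are hypothetical stand-ins, and numeric ids substitute for ObjectIds.

```javascript
// Sample data: 45 posts with increasing numeric ids standing in for ObjectIds.
const allPosts = Array.from({ length: 45 }, (_, i) => ({ _id: i + 1 }));
const PAGE_SIZE = 20;

// In-memory version of the query: filter by cursor, sort by _id, fetch one extra.
function fetchPage(cursor) {
  const items = allPosts
    .filter((p) => cursor == null || p._id > cursor)
    .sort((a, b) => a._id - b._id)
    .slice(0, PAGE_SIZE + 1);
  const hasMore = items.length > PAGE_SIZE;
  if (hasMore) items.pop(); // drop the extra probe item
  return { items, nextCursor: hasMore ? items[items.length - 1]._id : null };
}

// Walk every page until no further cursor is reported.
const pageSizes = [];
let cursor = null;
do {
  const page = fetchPage(cursor);
  pageSizes.push(page.items.length);
  cursor = page.nextCursor;
} while (cursor !== null);

console.log(pageSizes);
```

Fetching one item more than the page size is what lets the server report hasMore without a separate count query.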

Advanced: Schema Anti-Patterns to Avoid

  • Unbounded arrays (post.viewerIds: ObjectId[] that grows forever → will hit the 16MB limit).
  • Massive documents — if it could grow past ~1MB, split it.
  • Many tiny collections — prefer one collection with a type field unless they really differ.
  • Storing money as Number — use Decimal128 to avoid float drift.
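
The float drift behind the last bullet is easy to reproduce in plain JavaScript. The second half of this sketch shows integer minor units, which is a different workaround from Decimal128 and is included only as an assumption-level alternative; the variable names are illustrative.

```javascript
// Binary floating point cannot represent most decimal fractions exactly.
const floatSum = 0.1 + 0.2;
console.log(floatSum === 0.3); // false

// One common workaround (an alternative to Decimal128): store integer minor units.
const priceCents = 10;
const taxCents = 20;
const totalCents = priceCents + taxCents;
console.log(totalCents === 30); // true

// Convert to a display string only at the edges of the system.
const display = (totalCents / 100).toFixed(2);
console.log(display);
```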

Practice Path

1. Build a User and Post model with proper schemas, validators, and timestamps.
2. Add unique index on email and a compound index on (author, createdAt).
3. Implement create / read / update / delete using atomic operators ($set, $inc).
4. Write an aggregation that returns post count + average views per author, sorted descending.