Git & Version Control — Track Every Change in Your Data Projects

Version control is essential for data analysts. It lets you track changes to scripts, notebooks, and SQL queries — roll back mistakes, collaborate with teammates, and maintain a clean project history.

Why Git for Data Analytics?

Reproducibility: Know exactly what code produced a report
Collaboration: Multiple analysts work on the same project without overwriting each other
Safety: Undo mistakes by reverting to any previous version
Accountability: See who changed what and when via git log and git blame

Core Git Workflow

Working Directory  →  Staging Area  →  Local Repository  →  Remote (GitHub)
   (edit files)       (git add)        (git commit)         (git push)

Essential Commands

Command	Purpose	Example
`git init`	Initialize a new repo	`git init my-analysis`
`git status`	Check file states	`git status`
`git add`	Stage files for commit	`git add analysis.py`
`git commit`	Save a snapshot	`git commit -m "Add Q1 analysis"`
`git log`	View commit history	`git log --oneline`
`git diff`	See unstaged changes	`git diff analysis.py`
`git branch`	Create/list branches	`git branch feature-q2`
`git checkout`	Switch branches	`git checkout feature-q2`
`git merge`	Combine branches	`git merge feature-q2`
`git push`	Upload to remote	`git push origin main`
`git pull`	Download latest changes	`git pull origin main`

.gitignore for Data Projects

Data files, credentials, and environment folders should never be committed:

# .gitignore for data analytics projects
*.csv
*.xlsx
*.parquet
data/raw/
data/processed/
.env
__pycache__/
.ipynb_checkpoints/
*.pyc
venv/
.DS_Store
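
One way to confirm that ignore rules behave as intended is `git check-ignore`, which prints the paths that match a rule. The sketch below is only an illustration: it uses a throwaway repository and hypothetical file names (`sales.csv`, `data/raw/dump.json`, `analysis.py`) with a subset of the patterns above.

```shell
set -eu
repo=$(mktemp -d)                  # throwaway repo, hypothetical project
cd "$repo"
git init -q

# A subset of the patterns listed above
printf '%s\n' '*.csv' '.env' 'data/raw/' '.ipynb_checkpoints/' > .gitignore
mkdir -p data/raw
touch sales.csv .env data/raw/dump.json analysis.py

# check-ignore prints only the paths that an ignore rule matches
ignored=$(git check-ignore sales.csv .env data/raw/dump.json analysis.py)
echo "$ignored"

# git status should now offer only the files that are safe to commit
visible=$(git status --short)
echo "$visible"
```

Here `analysis.py` and `.gitignore` itself remain visible to `git status`, while the data file, the credentials file, and everything under `data/raw/` are hidden.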

Branching Strategy

main — production-ready code and final reports
feature/q2-analysis — work-in-progress analysis
fix/data-cleaning-bug — bug fixes

Merge into main only when work is complete and reviewed.

Git is a time machine for files. Every save (a *commit*) is a snapshot you can return to. Every parallel idea (a *branch*) is its own timeline you can develop without breaking the main one. For analysts this matters because notebooks, SQL files, and dashboards are real code, and code without version control gets lost, overwritten, or becomes impossible to reproduce.

What Git Actually Is

Git tracks the contents of a folder (repo) over time. Four places to know:

Place	What lives there	Move with
Working directory	files you're editing	—
Staging area (index)	changes queued for the next commit	`git add`
Local repo (.git)	committed history	`git commit`
Remote (GitHub)	shared copy for the team	`git push` / `git pull`

Almost every Git command is just moving content between these four places.
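
To see a file move through the four places, the following sketch checks `git status --short` after each step. It is an illustration under assumptions: the repository lives in a temporary directory, a local bare repository stands in for GitHub, and the file name and identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)                  # working repo (hypothetical project)
remote=$(mktemp -d)                # bare repo standing in for GitHub
git init -q --bare "$remote"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch main on any Git version
git config user.email "analyst@example.com"
git config user.name "Example Analyst"
git remote add origin "$remote"

echo "print('q1')" > analysis.py
untracked=$(git status --short)    # "?? analysis.py": only in the working directory

git add analysis.py
staged=$(git status --short)       # "A  analysis.py": queued in the staging area

git commit -q -m "Add Q1 analysis"
committed=$(git status --short)    # empty: the snapshot is in the local repo

git push -q -u origin main         # copy the commit to the remote
printf '%s\n%s\n' "$untracked" "$staged"
```

The status codes in the left-hand columns change from `??` to `A`, then disappear once the commit exists.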

The 8 Commands That Cover Most Days

git init                 # start a repo (or git clone <url>)
git status               # what changed?
git add file             # stage a change
git commit -m "why"      # save the staged snapshot
git log --oneline        # see history
git branch new-thing     # branch off
git checkout new-thing   # switch branch  (or git switch)
git merge new-thing      # bring it back into main

Learn these well before chasing rebase/cherry-pick/reflog.

Beginner Mistakes to Skip

1. Committing data, secrets, or huge files. Add a .gitignore *before* your first commit: *.csv, .env, __pycache__/, .ipynb_checkpoints/.
2. Useless commit messages like "update" or "fix". Future-you cannot read your mind.
3. Editing on main. Always branch for any non-trivial change.
4. Force-pushing to a shared branch. git push --force rewrites history that others have. Use --force-with-lease and never on main.
5. Pulling without committing local work. Either commit or git stash first; otherwise the pull can be blocked by, or conflict with, your uncommitted changes.
6. Committing notebooks with their outputs. Output cells make giant diffs; clear them before committing or use nbstripout.

Intermediate: Branching Mental Model

A branch is just a *pointer to a commit*. Creating one is instant.

main:    A → B → C
                  \
feature:           D → E

Merging feature into main produces F, a merge commit:

main:    A → B → C →───────────F
                  \         /
feature:           D → E →──┘

A *fast-forward* merge happens when main hasn't moved — Git just slides the pointer with no merge commit.
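
The difference between the two merge outcomes can be checked by counting the parents of the resulting commit. This is a minimal sketch in a throwaway repository with empty commits named after the diagram; the identity is hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch main on any Git version
git config user.email "analyst@example.com"
git config user.name "Example Analyst"

git commit -q --allow-empty -m "A"
git checkout -q -b feature
git commit -q --allow-empty -m "D"

# main has not moved since feature branched, so Git only slides the pointer
git checkout -q main
git merge -q feature
ff_parents=$(git cat-file -p HEAD | grep -c '^parent ')      # 1 parent: no merge commit

# Let both branches move, then merge again: now a merge commit is required
git checkout -q feature
git commit -q --allow-empty -m "E"
git checkout -q main
git commit -q --allow-empty -m "C"
git merge -q --no-edit feature
merge_parents=$(git cat-file -p HEAD | grep -c '^parent ')   # 2 parents: a merge commit

git log --oneline --graph
```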

Intermediate: Resolving Merge Conflicts

Git can't auto-merge changes that touch the same lines. It marks them:

<<<<<<< HEAD
revenue = sales * 1.10
=======
revenue = sales * 1.08
>>>>>>> feature

Fix: choose one (or combine), delete the markers, then git add + git commit. git merge --abort backs out if you panic.
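
The whole cycle can be rehearsed safely in a throwaway repository. The sketch below assumes a hypothetical file `model.py` and reuses the two values from the markers above; it keeps the feature value, which is just one possible resolution.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch main on any Git version
git config user.email "analyst@example.com"
git config user.name "Example Analyst"

echo "revenue = sales * 1.00" > model.py
git add model.py
git commit -q -m "Add revenue model"

git checkout -q -b feature
echo "revenue = sales * 1.08" > model.py
git commit -q -am "Use 8% uplift"

git checkout -q main
echo "revenue = sales * 1.10" > model.py
git commit -q -am "Use 10% uplift"

# Both branches changed the same line, so the merge stops and reports a conflict
git merge feature >/dev/null || echo "conflict in: $(git diff --name-only --diff-filter=U)"

# Resolve: write the chosen line (no markers left), stage it, commit to finish the merge
echo "revenue = sales * 1.08" > model.py
git add model.py
git commit -q -m "Merge feature: settle on 8% uplift"
cat model.py
```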

Intermediate: Stash, Diff, Restore

git stash               # park dirty work
git stash pop           # bring it back
git diff                # unstaged changes
git diff --staged       # what's about to be committed
git restore file        # discard local edits to file
git restore --staged f  # un-stage f

These commands save you from "oh no I changed the wrong file" panic. Note that git restore file permanently discards uncommitted edits, so check git diff first.
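
A short walk-through shows what each command does to a file's contents. It is a sketch in a temporary repository; `report.sql` and its contents are hypothetical, and `git restore` needs Git 2.23 or newer.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "analyst@example.com"
git config user.name "Example Analyst"

echo "v1" > report.sql
git add report.sql
git commit -q -m "Add report query"

echo "v2 draft" > report.sql
git stash -q                       # park the edit; the file is back to the committed v1
after_stash=$(cat report.sql)

git stash pop -q                   # the edit returns to the working directory
after_pop=$(cat report.sql)

git add report.sql
git restore --staged report.sql    # un-stage; the edit is still in the working directory
git restore report.sql             # discard the edit for good
after_restore=$(cat report.sql)

echo "$after_stash / $after_pop / $after_restore"
```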

Intermediate: GitHub PR Workflow

The industry-standard team flow:

1. git checkout -b feat/add-region-filter
2. Commit small, logical changes.
3. git push -u origin feat/add-region-filter
4. Open a Pull Request on GitHub. Describe *why*.
5. Reviewers comment, you push more commits to the same branch.
6. Squash-merge into main when approved.
7. Delete the branch.

Protect main: required reviews, passing CI, no direct pushes.
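
GitHub performs the squash-merge on its servers, but the effect on history can be approximated locally with `git merge --squash`. The sketch below is an analogy, not GitHub's implementation: it uses a throwaway repository, a hypothetical file `region_filter.py`, and three work-in-progress commits.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch main on any Git version
git config user.email "analyst@example.com"
git config user.name "Example Analyst"
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b feat/add-region-filter
for step in 1 2 3; do
  echo "filter step $step" >> region_filter.py
  git add region_filter.py
  git commit -q -m "WIP region filter step $step"
done

# Squash: stage the combined change on main, then record it as ONE commit
git checkout -q main
git merge -q --squash feat/add-region-filter
git commit -q -m "Add region filter"

# The branch tip is not an ancestor of main after a squash, so -d would refuse; use -D
git branch -q -D feat/add-region-filter
git log --oneline
```

main ends up with two commits (the initial one and the squashed one) while still containing all three lines of the file.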

Intermediate: Good Commit Messages

<short imperative summary, ≤50 chars>

<optional body explaining WHY, wrapped at 72 chars>

Examples:

✅ Fix null handling in Q1 revenue calculation
✅ Add region slicer to sales dashboard
❌ update, final, asdf

Many teams use Conventional Commits (feat:, fix:, chore:) to auto-generate changelogs.
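
These conventions can be checked mechanically. The function below is a hypothetical helper (the name `check_commit_msg` and its list of vague words are assumptions for illustration, not a Git feature); a team could call something similar from a commit-msg hook.

```shell
# Hypothetical helper: validates a commit message passed as the first argument
check_commit_msg() {
  subject=$(printf '%s\n' "$1" | sed -n '1p')
  second=$(printf '%s\n' "$1" | sed -n '2p')
  [ -n "$subject" ] || { echo "empty subject"; return 1; }
  [ "${#subject}" -le 50 ] || { echo "subject longer than 50 chars"; return 1; }
  [ -z "$second" ] || { echo "second line must be blank"; return 1; }
  case "$subject" in
    update|fix|final|asdf) echo "subject too vague"; return 1 ;;
  esac
  echo "ok"
}

check_commit_msg "Fix null handling in Q1 revenue calculation"
check_commit_msg "update" || true
```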

Advanced: Rebase vs Merge

Both combine histories; they look different.

Merge preserves the actual timeline (with merge commits).
Rebase *replays* your branch on top of main, producing a linear history.

git checkout feature
git rebase main      # replay feature commits onto current main

Rule of thumb: rebase your own local commits before pushing; once a branch is shared, merge instead. Never rebase a branch others are working on: it replaces the commits their work is based on with new ones.
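
The linear result of a rebase is easy to verify in a scratch repository. This sketch reuses the commit names from the earlier branching diagram; the files and identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch main on any Git version
git config user.email "analyst@example.com"
git config user.name "Example Analyst"
git commit -q --allow-empty -m "A"

git checkout -q -b feature
echo "d" > d.txt
git add d.txt
git commit -q -m "D"

git checkout -q main
echo "c" > c.txt
git add c.txt
git commit -q -m "C"

# Replay feature's commit on top of the current main
git checkout -q feature
git rebase -q main

history=$(git log --format=%s | tr '\n' ' ')   # newest first
echo "$history"
```

After the rebase, feature contains D on top of C on top of A with no merge commits, so main can later fast-forward to it.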

Advanced: Undo Toolkit

git commit --amend            # fix the last commit
git reset HEAD~1              # un-commit, keep changes
git reset --hard HEAD~1       # un-commit AND discard (dangerous)
git revert <hash>             # safe undo: makes a new commit
git reflog                    # last-resort recovery of "lost" commits

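reflog is your safety net: even "lost" commits stay recoverable until their reflog entries expire (by default 90 days, or 30 days for commits no longer reachable from the current branch tip) and garbage collection prunes them.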
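
A quick rehearsal of the recovery path: discard a commit with a hard reset, then bring it back through the reflog. The sketch runs in a throwaway repository with a hypothetical `notes.txt`.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "analyst@example.com"
git config user.name "Example Analyst"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "First"
echo "v2" > notes.txt
git commit -q -am "Second"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1         # "Second" is no longer on any branch
git reflog -n 3                    # ...but the reflog still lists it

git reset -q --hard "HEAD@{1}"     # return to where HEAD was before the reset
cat notes.txt
```

`HEAD@{1}` means "the previous position of HEAD", which here is the commit that the hard reset dropped.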

Advanced: Tags, Releases & Bisect

git tag v1.0.0 — mark a release point. Push with git push --tags.
GitHub Releases attach binaries / changelogs to a tag.
git bisect does a binary search through history to find which commit introduced a bug — magical when you need it.
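
As an illustration of bisect, the sketch below builds six commits in a temporary repository, breaks a hypothetical `rate.py` in the fourth, and lets `git bisect run` find it. The test command (`grep`) is an assumption standing in for a real test suite: exit code 0 marks a commit good, a non-zero code marks it bad.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "analyst@example.com"
git config user.name "Example Analyst"

# Six commits; the "bug" (rate set to 0) appears in step 4
for i in 1 2 3 4 5 6; do
  if [ "$i" -ge 4 ]; then echo "rate = 0" > rate.py; else echo "rate = 1.1" > rate.py; fi
  echo "$i" > step.txt
  git add rate.py step.txt
  git commit -q -m "step $i"
done
first=$(git rev-list --max-parents=0 HEAD)

# Mark HEAD as bad and the first commit as good, then automate the search
git bisect start HEAD "$first" >/dev/null
git bisect run grep -qF "1.1" rate.py >/dev/null
found=$(git rev-parse refs/bisect/bad)       # the first bad commit
culprit=$(git log -1 --format=%s "$found")
git bisect reset >/dev/null 2>&1

echo "bug introduced in: $culprit"
```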

Advanced: Notebooks in Git (Analyst-Specific)

Notebooks store outputs and execution counts inline — every re-run is a giant diff. Two fixes:

1. nbstripout: a pre-commit hook that strips outputs.
2. Jupytext: pair the notebook with a clean .py or .md file that's the source of truth in Git.

For data files, use DVC or Git LFS instead of committing CSVs directly.

Advanced: Pre-commit Hooks & CI

Wire up automated checks so bad code never lands:

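# .pre-commit-config.yaml
# pre-commit requires a rev for each repo. The values below are example pins;
# run `pre-commit autoupdate` to move them to the latest tags.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/psf/black
    rev: 24.4.2
    hooks: [{ id: black }]
  - repo: https://github.com/kynan/nbstripout
    rev: 0.7.1
    hooks: [{ id: nbstripout }]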

GitHub Actions can run the same on every PR — lint, tests, dbt, whatever you need.

Practice Path

1. Initialise a repo for a notebook project, add a .gitignore that excludes data and .ipynb_checkpoints.
2. Create a feature branch, make 3 commits with proper messages, open a PR on GitHub.
3. Trigger an intentional merge conflict, resolve it, finish the merge.
4. Add nbstripout (or pre-commit) and confirm notebook outputs are no longer in diffs.