Notifications

No notifications

/Phase 1

Excel Functions & Formulas

Excel Functions — The Core Toolkit for Data Analysts

Excel is the most widely used analytics tool in business. Mastering formulas turns raw data into insights without writing a single line of code.

Lookup Functions — Finding Data Across Sheets

FunctionPurposeSyntax
VLOOKUPSearch first column, return value=VLOOKUP(lookup, range, col, FALSE)
INDEX/MATCHFlexible lookup (any direction)=INDEX(range, MATCH(val, col, 0))
XLOOKUPModern replacement for VLOOKUP=XLOOKUP(val, lookup_arr, return_arr)

Real Example: Look up an employee's salary from a separate department table:

=VLOOKUP(A2, Departments!A:C, 3, FALSE)
=INDEX(Departments!C:C, MATCH(A2, Departments!A:A, 0))

Conditional Aggregation — Counting & Summing with Criteria

FunctionExampleResult
SUMIF=SUMIF(B:B,"North",C:C)Sum sales for North region
COUNTIF=COUNTIF(D:D,"Completed")Count completed orders
SUMIFS=SUMIFS(C:C,B:B,"North",D:D,">1000")Sum North sales over 1000
AVERAGEIF=AVERAGEIF(B:B,"East",C:C)Average sales for East

Text Functions — Cleaning Messy Data

=LEFT(A2, 3)           → First 3 characters
=RIGHT(A2, 4)          → Last 4 characters
=MID(A2, 5, 3)         → 3 chars starting at position 5
=TRIM(A2)              → Remove extra spaces
=CONCATENATE(A2," ",B2) → Join with space (or use &)
=TEXT(A2, "MM/DD/YYYY") → Format date as text

References & Named Ranges

  • Relative: A1 — shifts when copied
  • Absolute: $A$1 — locked; use for constants like tax rate
  • Named Range: Select cells → Name Box → type "SalesData" → use =SUM(SalesData)

On this page

Detailed Theory

Excel formulas turn a static grid into a tiny programming language. Once you can wire cells together with lookups, conditions, and aggregates, 80% of "data analysis" is just careful spreadsheet work — and most companies still run on it.

What a Formula Actually Is

Every formula starts with =. Excel reads it left-to-right, evaluates the inner functions first, and writes the result into the cell. The cell stores the *formula*; you see the *value*.

=A1 + B1            → add two cells
=SUM(A1:A100)       → add a range
=AVERAGE(A1:A100)   → mean of a range
=ROUND(A1, 2)       → round to 2 decimals

The whole sheet recalculates whenever an input changes. That is the magic.

Cell References — The Single Most Confusing Thing

When you copy a formula, references *move*. Lock them with $.

ReferenceBehaviour when copied
A1Both move (relative)
$A1Column locked, row moves
A$1Row locked, column moves
$A$1Fully locked (absolute)

Classic use: =B2*$D$1 for tax rate — copy it down 1000 rows, the rate cell stays put.

The Six Functions That Cover Most Days

FunctionUse
SUM / AVERAGE / COUNTAggregate a range
IFOne condition
SUMIF / COUNTIFAggregate if a condition matches
SUMIFS / COUNTIFSAggregate with multiple conditions
VLOOKUP / XLOOKUPFind a value in a table
IFERRORReplace ugly errors with friendly text

Learn these well before chasing fancier ones.

Beginner Mistakes to Skip

1. Forgetting $ when copying formulas. Half the broken sheets you will ever see are this. 2. Numbers stored as text. A green triangle in the corner = text. SUM will silently ignore them. Convert with VALUE() or paste-special multiply by 1. 3. Hard-coding values inside formulas. Put the tax rate in a cell, reference it. Otherwise changing it means hunting through 50 formulas. 4. Merged cells. They look pretty and break sorting, filtering, and lookups. Use "Center Across Selection" instead. 5. Mixing data and presentation. Bold/colour your dashboard, not your raw data sheet. 6. Trusting == on dates. Dates are numbers underneath. 01/02/2026 may be Jan 2 or Feb 1 depending on locale — always use DATE(y,m,d) in formulas.

Intermediate: IF and Nested Logic

=IF(A2 >= 60, "Pass", "Fail")
=IFS(A2>=90,"A", A2>=80,"B", A2>=70,"C", TRUE,"F")   ' modern, readable
=IF(AND(score>=60, attendance>=0.75), "Pass", "Fail")
=IF(OR(role="admin", role="owner"), "Allowed", "No")

Reach for IFS instead of nested IF once you have more than 2 branches — readability matters.

Intermediate: SUMIFS / COUNTIFS

=SUMIFS(sum_range,  criteria_rng1, c1,  criteria_rng2, c2, ...)
=SUMIFS(D:D, B:B, "North", C:C, "2026")
=COUNTIFS(B:B, "North", C:C, ">=2026-01-01")

Key trap: with SUMIFS the sum range comes first; with SUMIF it comes last. They are not consistent. Wildcards * and ? work inside criteria.

Intermediate: Text Functions

=LEFT(A1, 3)              → first 3 chars
=MID(A1, 4, 2)            → chars 4-5
=LEN(TRIM(A1))            → length without padding spaces
=UPPER / LOWER / PROPER
=CONCAT(A1, " - ", B1)
=TEXTSPLIT(A1, ",")        → modern, returns array
=SUBSTITUTE(A1, "-", "")

Text cleanup is half of real data work. TRIM + CLEAN first, always.

Intermediate: Lookups — VLOOKUP, XLOOKUP, INDEX/MATCH

=VLOOKUP("Pen", A:C, 3, FALSE)
=XLOOKUP("Pen", A:A, C:C, "Not found")
=INDEX(C:C, MATCH("Pen", A:A, 0))

  • VLOOKUP — the classic, but only searches the leftmost column and breaks if you insert a column.
  • XLOOKUP *(modern, recommended)* — looks any direction, has a built-in "if not found" argument.
  • INDEX/MATCH — the trick that worked before XLOOKUP and still does on older Excel.

Advanced: Dates & Time

=TODAY() / =NOW()
=DATE(2026, 4, 26)
=YEAR(A1)  =MONTH(A1)  =DAY(A1)
=EOMONTH(A1, 0)            → last day of that month
=NETWORKDAYS(start, end)   → working days, ignores weekends
=A1 - 30                   → 30 days earlier (dates are numbers)

Format displays the look (yyyy-mm-dd); the underlying value is always a serial number.

Advanced: Dynamic Arrays (Modern Excel / Microsoft 365)

Formulas now spill results into many cells automatically:

=UNIQUE(A2:A100)               → list distinct values
=SORT(A2:A100, 1, -1)          → sort descending
=FILTER(A2:C100, B2:B100="IN") → sub-table where country = IN
=LET(rate, VLOOKUP(B2, rates, 2, FALSE), price * (1 + rate))

These replace fragile array formulas (Ctrl+Shift+Enter) and read top-to-bottom like real code.

Advanced: Errors & Auditing

#DIV/0!     → dividing by 0 / blank
#N/A        → lookup failed
#REF!       → reference deleted
#VALUE!     → wrong type (text where number expected)
#NAME?      → misspelled function or named range

=IFERROR(VLOOKUP(...), "Not found") =IFNA(XLOOKUP(...), "Missing")

Use Trace Precedents / Trace Dependents (Formulas tab) to see which cells flow into a formula — invaluable for debugging inherited spreadsheets.

Advanced: Tables, Named Ranges, Structured References

Convert your data range to a Table (Ctrl+T). Then formulas become self-documenting:

=SUMIFS(Sales[Amount], Sales[Region], "North")

Tables auto-expand, named ranges replace cryptic $D$1:$D$10000, and pivot tables / charts pointing at a Table refresh cleanly.

Practice Path

1. Build a sales sheet with 200 rows; compute totals, averages, top-5 per region using SUMIFS + SORT + FILTER. 2. Replace one nested IF chain with IFS and one VLOOKUP with XLOOKUP. 3. Add an input cell for tax rate and reference it everywhere with $D$1 — change it once, watch the whole sheet recompute. 4. Convert the data to a Table, name the ranges, rebuild a key formula using structured references.