NumPy Fundamentals

NumPy: Numerical Python 🔢

NumPy is the foundation of Python's scientific computing ecosystem. Its core object, the ndarray, enables fast vectorized operations on homogeneous data, often tens of times faster than equivalent loops over pure Python lists for numerical work.

Why NumPy?

import numpy as np

# Python list: slow element-by-element loop
result = [x ** 2 for x in range(1_000_000)]

# NumPy: vectorized C-level operation
arr = np.arange(1_000_000)
result = arr ** 2   # often tens of times faster; the exact ratio depends on the machine
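To check the speed-up on your own machine rather than rely on a fixed figure, a minimal timing sketch could look like the one below. The helper name `time_call` is hypothetical, and the measured ratio will vary with hardware, NumPy version, and array size.

```python
import time
import numpy as np

def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

n = 1_000_000
arr = np.arange(n)

list_time = time_call(lambda: [x ** 2 for x in range(n)])
numpy_time = time_call(lambda: arr ** 2)

print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy vectorized:   {numpy_time:.4f} s")
print(f"speed-up:           {list_time / numpy_time:.1f}x")
```

Taking the best of several runs reduces noise from other processes; no specific speed-up is promised here.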

Creating Arrays

| Function | Example | Result |
| --- | --- | --- |
| np.array() | np.array([1,2,3]) | [1 2 3] |
| np.zeros() | np.zeros((2,3)) | 2×3 of zeros |
| np.ones() | np.ones(5) | [1. 1. 1. 1. 1.] |
| np.arange() | np.arange(0, 10, 2) | [0 2 4 6 8] |
| np.linspace() | np.linspace(0, 1, 5) | [0. 0.25 0.5 0.75 1.] |
| np.eye() | np.eye(3) | 3×3 identity matrix |
| np.random.rand() | np.random.rand(2, 3) | 2×3 uniform random |

Indexing & Slicing

a = np.array([[1,2,3],[4,5,6],[7,8,9]])
a[0, 1]      # 2    — single element
a[:, 1]      # [2, 5, 8] — entire column
a[1:, :2]    # [[4,5],[7,8]] — sub-matrix
a[a > 5]     # [6, 7, 8, 9] — boolean indexing

Broadcasting Rules

NumPy automatically expands smaller arrays to match larger ones during arithmetic:

1. Dimensions are compared right-to-left
2. Sizes must match or one of them is 1
3. Missing dimensions are treated as 1

matrix = np.ones((3, 4))   # shape (3, 4)
row = np.array([1, 2, 3, 4])  # shape (4,)
result = matrix + row       # shape (3, 4) — row broadcast
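The three rules can be tried out on shapes alone, without allocating full arrays. The sketch below assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available; the shapes are illustrative.

```python
import numpy as np

# Rule 3: (4,) is treated as (1, 4), then stretched to (3, 4).
print(np.broadcast_shapes((3, 4), (4,)))       # (3, 4)

# Rule 2: a size-1 axis stretches to match the other array.
print(np.broadcast_shapes((3, 1), (1, 4)))     # (3, 4)

# Trailing sizes 4 and 3 differ and neither is 1, so this fails.
try:
    np.broadcast_shapes((3, 4), (3,))
except ValueError as exc:
    print("incompatible:", exc)

# Adding an explicit axis turns the (3,) vector into a (3, 1) column.
col = np.array([10, 20, 30])[:, np.newaxis]
result = np.ones((3, 4)) + col
print(result.shape)                            # (3, 4)
```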

> Tip: Prefer NumPy vectorized operations over Python loops. If you find yourself writing a for-loop over array elements, there's almost certainly a NumPy function that does it faster.

Detailed Theory

NumPy is the foundation of numerical Python. Pandas and scikit-learn are built on NumPy arrays, and PyTorch, TensorFlow, and OpenCV convert to and from them readily. Once you can think in arrays instead of for-loops, your numerical code gets shorter, clearer, and often 10–100× faster.

What a NumPy Array Actually Is

import numpy as np

a = np.array([1, 2, 3, 4])
print(a, a.dtype, a.shape, a.ndim)
# [1 2 3 4] int64 (4,) 1

A NumPy ndarray is a fixed-size, single-dtype block of memory that is contiguous when freshly created (views such as slices and transposes may not be). Think "a C array with a Python wrapper". That's what makes it fast: the CPU can stream through it without dereferencing a pointer per element, which a Python list has to do.
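The layout described above can be inspected directly. This is a small sketch; the dtype is fixed to int64 explicitly so the byte counts do not depend on the platform, and the `sys.getsizeof` figures are left unannotated because they vary between Python builds.

```python
import sys
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.int64)

print(a.itemsize)                 # 8 bytes per element
print(a.nbytes)                   # 32 bytes for the raw data buffer
print(a.flags["C_CONTIGUOUS"])    # True

# A list stores pointers to separate Python int objects instead.
lst = [1, 2, 3, 4]
print(sys.getsizeof(lst))         # size of the list object and its pointer array
print(sys.getsizeof(lst[0]))      # each int is its own object on top of that
```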

Creating Arrays

np.array([1, 2, 3])
np.zeros((3, 4))         # all zeros, shape 3×4
np.ones((2, 2))
np.full((2, 3), 7.0)
np.arange(0, 10, 2)       # like range(), returns array
np.linspace(0, 1, 11)     # 11 evenly-spaced values
np.eye(4)                  # identity matrix
np.random.default_rng(42).normal(size=(1000,))

Shape, dtype, Reshape

m = np.arange(12).reshape(3, 4)
m.shape    # (3, 4)
m.dtype    # int64
m.T          # transpose
m.flatten()  # 1-D copy

Think of shape as "axis sizes". A 2-D array is rows × cols; 3-D is layers × rows × cols. reshape rearranges without copying when possible.
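Whether a reshape copied can be checked with `np.shares_memory`. The sketch below shows one case where a view is possible and one where it is not.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

# Reshaping a contiguous array only changes the shape metadata.
r = m.reshape(4, 3)
print(np.shares_memory(m, r))            # True

# The transpose is a view with swapped strides, so flattening it
# in row-major order forces NumPy to copy.
t = m.T.reshape(-1)
print(np.shares_memory(m, t))            # False

# ravel() returns a view when it can; flatten() always copies.
print(np.shares_memory(m, m.ravel()))    # True
print(np.shares_memory(m, m.flatten()))  # False
```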

Indexing & Slicing

m[0]            # row 0 → [0 1 2 3]
m[0, 2]         # element
m[:, 1]         # whole column
m[1:, :2]        # subgrid
m[m > 5]         # boolean mask → 1-D array of matches
m[[0, 2]]        # "fancy" indexing — rows 0 and 2

Boolean masks and fancy indexing are the secret to readable, fast filtering and reshuffling.
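As a small illustration of both ideas, the sketch below filters one array with a mask built from another, then reorders both with integer indices. The `names` and `scores` data are made up for the example.

```python
import numpy as np

# Hypothetical data: one score per student.
names = np.array(["ana", "ben", "cho", "dev"])
scores = np.array([72, 91, 58, 85])

# Filtering: a boolean mask built on one array selects from another.
passed = names[scores >= 70]
print(passed)            # ['ana' 'ben' 'dev']

# Reshuffling: argsort returns integer indices, used for fancy indexing.
order = np.argsort(scores)[::-1]     # highest score first
print(names[order])      # ['ben' 'dev' 'ana' 'cho']
print(scores[order])     # [91 85 72 58]
```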

Vectorised Math (the Whole Point)

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

a + b          # [11 22 33]
a * 2          # [2 4 6]
np.sqrt(a)     # ufunc, applied element-wise
a @ b          # dot product: 140
a.sum(), a.mean(), a.std(), a.max()

No loops. Each operation runs in optimised C/SIMD code. Rule of thumb: if you find yourself looping over a NumPy array, stop — there's a vectorised way.

Beginner Mistakes to Skip

1. Looping with for x in arr. Always vectorise. If you must loop, you've probably picked the wrong tool.
2. Confusing shape and size. shape is a tuple; size is total element count.
3. Mixing dtypes. Adding int8 and float64 upcasts; doing it in a loop is slow.
4. Forgetting that slicing returns a *view*. b = a[0:3]; b[0] = 99 mutates a. Use .copy() to detach.
5. np.append in a loop. It reallocates each time, which is O(n²) overall. Pre-allocate or build a Python list and convert once.
6. == returning an array. if a == b: raises "ambiguous truth value". Use (a == b).all() or np.array_equal(a, b).
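Mistakes 4, 5, and 6 are easiest to remember after seeing them once. The following is a minimal sketch with throwaway arrays, not a complete catalogue.

```python
import numpy as np

# Mistake 4: a slice is a view, so writing to it changes the source.
a = np.arange(5)
b = a[0:3]
b[0] = 99
print(a[0])              # 99

c = a[0:3].copy()        # detached copy
c[0] = -1
print(a[0])              # still 99

# Mistake 5: collect in a Python list, then convert once.
values = []
for i in range(5):
    values.append(i * i)
squares = np.array(values)

# Mistake 6: comparing arrays yields an array, not a single bool.
x = np.array([1, 2, 3])
y = np.array([1, 2, 3])
try:
    if x == y:
        pass
except ValueError as exc:
    print("ValueError:", exc)

print(np.array_equal(x, y))   # True
print((x == y).all())         # True
```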

Intermediate: Broadcasting

The single most powerful (and confusing) NumPy feature. Arrays of different shapes can combine when their shapes are *compatible*:

M = np.ones((3, 4))
row = np.array([1, 2, 3, 4])
(M + row).shape    # (3, 4) — row added to every row of M

col = np.array([[10], [20], [30]])
(M + col).shape    # (3, 4): col added to every column

Rule: compare the shapes from the trailing dimension leftward; each pair of sizes must match or one of them must be 1. This lets you replace whole nested loops with single lines.
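To make the nested-loop claim concrete, here is a sketch that builds a table of absolute differences both ways. The input values are arbitrary.

```python
import numpy as np

x = np.array([0.0, 1.0, 3.0])        # shape (3,)
y = np.array([0.0, 2.0])             # shape (2,)

# Nested-loop version.
loop = np.empty((3, 2))
for i in range(3):
    for j in range(2):
        loop[i, j] = abs(x[i] - y[j])

# Broadcast version: (3, 1) against (2,) gives (3, 2).
vec = np.abs(x[:, np.newaxis] - y)

print(vec)
print(np.array_equal(loop, vec))     # True
```

Adding the new axis with `np.newaxis` is what lets the two 1-D arrays pair up every element of `x` with every element of `y`.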

Intermediate: Aggregations Along an Axis

m = np.arange(12).reshape(3, 4)
m.sum()              # 66 — over everything
m.sum(axis=0)        # [12 15 18 21] — down columns
m.sum(axis=1)        # [ 6 22 38] — across rows
m.mean(axis=1)
m.argmax(axis=0)

axis=0 collapses rows (gives one value per column); axis=1 collapses columns. This phrasing trips everyone up at first — just remember: the axis you name is the one that disappears.
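The "named axis disappears" rule shows up directly in the result shapes. The sketch below also uses `keepdims=True`, which keeps the collapsed axis at size 1 so the result broadcasts back against the original.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

print(m.sum(axis=0).shape)     # (4,)  axis 0 (length 3) is gone
print(m.sum(axis=1).shape)     # (3,)  axis 1 (length 4) is gone

# keepdims=True keeps the collapsed axis with size 1.
row_means = m.mean(axis=1, keepdims=True)    # shape (3, 1)
centered = m - row_means                     # broadcasts (3, 4) - (3, 1)
print(centered.sum(axis=1))    # [0. 0. 0.]
```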

Intermediate: Conditional & Where

x = np.array([-2, -1, 0, 1, 2])
np.where(x > 0, x, 0)        # ReLU → [0 0 0 1 2]
np.clip(x, 0, 1)              # → [0 0 0 1 1]
np.select([x<0, x>0], ["neg","pos"], default="zero")

np.where is a vectorised if/else: it picks from the second argument where the condition is true and from the third where it is false. Combined with masks, you rarely need explicit branching.
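For comparison, the sketch below writes the same clamp-to-zero three ways: with an explicit Python branch, with `np.where`, and with in-place mask assignment.

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# Explicit branching, one element at a time.
looped = np.array([v if v > 0 else 0.0 for v in x])

# Same result with np.where.
vectorised = np.where(x > 0, x, 0.0)
print(np.array_equal(looped, vectorised))    # True

# Mask assignment modifies an array in place.
y = x.copy()
y[y < 0] = 0.0
print(y)            # [0. 0. 0. 1. 2.]

# With a single argument, np.where returns the indices of matches.
(idx,) = np.where(x > 0)
print(idx)          # [3 4]
```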

Intermediate: Random with a Generator

rng = np.random.default_rng(42)
rng.integers(0, 10, size=5)
rng.normal(loc=0, scale=1, size=(3, 3))
rng.choice(["A", "B", "C"], size=10, p=[.5, .3, .2])

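The Generator API (introduced in NumPy 1.17) is the one to use in new code. It keeps the random state in an explicit object rather than a hidden global, which makes results easier to reproduce, and NumPy's documentation recommends it over the legacy np.random.* functions.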
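Reproducibility here means that two generators created with the same seed produce the same stream. A minimal sketch, where the helper `noisy` is a hypothetical function used only to show passing a generator explicitly:

```python
import numpy as np

rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draws_a = rng_a.normal(size=3)
draws_b = rng_b.normal(size=3)
print(np.array_equal(draws_a, draws_b))   # True

def noisy(values, rng):
    """Hypothetical helper: add standard-normal noise using the given generator."""
    return values + rng.normal(size=values.shape)

# Passing the generator in keeps the function free of global state.
out1 = noisy(np.zeros(3), np.random.default_rng(0))
out2 = noisy(np.zeros(3), np.random.default_rng(0))
print(np.array_equal(out1, out2))         # True
```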

Advanced: Memory — Views, Copies, Strides

  • View — same memory, different shape/slice. Mutating it mutates the source. Cheap.
  • Copy — new memory. a.copy(), fancy indexing, boolean masking all copy.
  • Strides — the byte-step between elements along each axis. Reshapes and transposes typically just adjust strides instead of copying.
Check with a.flags['OWNDATA'] and a.strides. Knowing this prevents "why did my change leak?" bugs and surprise allocations.
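These checks can be run on a small array. In the sketch below the dtype is set to int64 explicitly so the stride values are the same on every platform.

```python
import numpy as np

base = np.arange(12, dtype=np.int64)
a = base.reshape(3, 4)          # a view onto base's memory

print(base.flags["OWNDATA"])    # True
print(a.flags["OWNDATA"])       # False
print(a.base is base)           # True

# 8 bytes to the next column, 4 * 8 = 32 bytes to the next row.
print(a.strides)                # (32, 8)
# Transposing swaps the strides; no data moves.
print(a.T.strides)              # (8, 32)

# Fancy indexing and boolean masking allocate new memory.
picked = a[[0, 2]]
masked = a[a > 5]
print(np.shares_memory(a, picked))   # False
print(np.shares_memory(a, masked))   # False
```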

Advanced: dtype Choice & Memory

big = np.zeros(1_000_000, dtype=np.float64)   # ~8 MB
lean = np.zeros(1_000_000, dtype=np.float32)  # ~4 MB

For ML inference and large tensors, float32 (or even float16 / bfloat16) is standard. NumPy itself has no built-in bfloat16 dtype, so that one comes from other libraries. Match dtype to the actual range of values: don't store flags as int64.
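The memory cost of each dtype can be read from `nbytes`, and the precision and range trade-offs from `np.finfo` and `np.iinfo`. A short sketch with an illustrative array length:

```python
import numpy as np

n = 1_000_000
f64_bytes = np.zeros(n, dtype=np.float64).nbytes
f32_bytes = np.zeros(n, dtype=np.float32).nbytes
i64_bytes = np.zeros(n, dtype=np.int64).nbytes
flag_bytes = np.zeros(n, dtype=np.bool_).nbytes

print(f64_bytes)    # 8000000
print(f32_bytes)    # 4000000
print(i64_bytes)    # 8000000
print(flag_bytes)   # 1000000, an eighth of int64 for the same flags

# The price of float32 is precision: compare the machine epsilons.
print(np.finfo(np.float32).eps)
print(np.finfo(np.float64).eps)

# Check the representable range before choosing a small integer type.
int8_info = np.iinfo(np.int8)
print(int8_info.min, int8_info.max)   # -128 127
```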

Advanced: Linear Algebra

A = np.random.rand(3, 3)
b = np.random.rand(3)

x = np.linalg.solve(A, b)              # solves Ax = b
A_inv = np.linalg.inv(A)               # rarely needed; prefer solve
U, S, Vt = np.linalg.svd(A)
eigvals, eigvecs = np.linalg.eig(A)

Foundation for regressions, PCA, recommendations, graphics. Real ML code reaches for these constantly.
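A useful habit is to verify each decomposition by multiplying it back together. The sketch below does that on a small seeded matrix; adding 3 times the identity is an assumption made here to guarantee the matrix is invertible.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 3)) + 3 * np.eye(3)   # strictly diagonally dominant, so invertible
b = rng.random(3)

# solve: the solution should satisfy Ax = b.
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))                         # True

# The explicit inverse gives the same answer with more work.
x_inv = np.linalg.inv(A) @ b
print(np.allclose(x, x_inv))                         # True

# SVD: A = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(A)
print(np.allclose(U @ np.diag(S) @ Vt, A))           # True

# Eigen-decomposition: A v = lambda v for each column v of eigvecs.
eigvals, eigvecs = np.linalg.eig(A)
print(np.allclose(A @ eigvecs, eigvecs * eigvals))   # True
```

Comparisons use `np.allclose` rather than `==` because floating-point results agree only up to rounding.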

Advanced: When NumPy Isn't Enough

  • GPU + autodiff → PyTorch / JAX (NumPy-like API, runs on GPUs, computes gradients).
  • DataFrames → Pandas / Polars (labelled rows + heterogeneous columns).
  • Out-of-memory data → Dask / Vaex.
  • Sparse matrices → SciPy scipy.sparse.
NumPy is the lingua franca — most of these libraries accept and return ndarrays.

Practice Path

1. Create a 1000×10 matrix of random normals; print the per-column mean and std using axis=.
2. Replace every negative value in an array with 0 using np.where (one line).
3. Demonstrate broadcasting by subtracting a 1×4 row vector from a 3×4 matrix; explain why it works.
4. Compare timings of summing 1M numbers via sum(...) (Python list) vs arr.sum() (NumPy) using time.perf_counter().