Last 30 Days
No notifications
Dictionaries store data as key-value pairs. Keys must be unique and hashable. Since Python 3.7, dicts maintain insertion order.
person = {"name": "Alice", "age": 30, "city": "NYC"}
empty = {}
from_pairs = dict(name="Bob", age=25)| Operation | Syntax | Description | |||
| Create | d["key"] = value | Add a new key-value pair | |||
| Read | d["key"] or d.get("key") | Access value (get returns None if missing) | |||
| Update | d["key"] = new_value | Overwrite existing value | |||
| Delete | del d["key"] or d.pop("key") | Remove a pair | Useful Dict Methods | Method | Returns |
d.keys() | All keys (view object) | ||||
d.values() | All values (view object) | ||||
d.items() | All (key, value) pairs | ||||
d.get(k, default) | Value for key, or default | ||||
d.update(other) | Merge another dict | ||||
d.setdefault(k, v) | Get or set default |
squares = {x: x**2 for x in range(5)}
# {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}Sets are unordered collections of unique elements:
colors = {"red", "green", "blue"}
nums = set([1, 2, 2, 3]) # {1, 2, 3}| Operation | Syntax | Symbol | |
| Union | a.union(b) | a \ | b |
| Intersection | a.intersection(b) | a & b | |
| Difference | a.difference(b) | a - b | |
| Symmetric Diff | a.symmetric_difference(b) | a ^ b |
frozenset is an immutable set β can be used as a dict key or inside other sets.
> Tip: Use dicts when you need fast lookups by key. Use sets when you need uniqueness or set-math operations.
Dictionaries and sets are Python's *hash-based* collections. A dict maps keys to values (think: a phonebook). A set holds unique values (think: tags on a post). Both give you O(1) average-case lookup β the secret sauce that makes most real-world Python code fast.
user = {
"name": "Alice",
"age": 30,
"role": "admin",
}
user["name"] # "Alice"
user["email"] = "a@x.com" # add
del user["age"] # remove
"role" in user # True (O(1))A dict is an unordered⦠actually, insertion-ordered since Python 3.7. Iterating yields keys in the order they were added.
tags = {"python", "web", "python"} # β {"python", "web"}
tags.add("async")
tags.discard("web")
"python" in tags # O(1)No duplicates, no ordering guarantee. Empty set: set() ({} is an empty dict).
d.get(k, default) # safe lookup, doesn't raise
d.setdefault(k, default) # get or insert
d.pop(k, default)
d.update(other) # merge in
d.keys(), d.values(), d.items()
for k, v in d.items(): ...a | b # union
a & b # intersection
a - b # difference
a ^ b # symmetric difference
a <= b # subsetGreat for de-duping (set(items)) and fast membership filtering.
1. d["missing"] β KeyError. Use d.get("missing") or d.get("missing", default).
2. Using a list as a key. Lists aren't hashable β they can mutate. Use a tuple instead.
3. {} for an empty set. That's an empty dict. Use set().
4. Iterating then mutating. for k in d: del d[k] raises RuntimeError. Iterate list(d.keys()) or build a new dict.
5. d.has_key(k). Removed in Python 3. Use k in d.
6. Mistaking dict order for sortedness. Insertion order β sorted order. Use sorted(d.items()) if you need that.
lengths = {name: len(name) for name in names}
active = {k: v for k, v in users.items() if v["active"]}
uniques = {x.lower() for x in tags}Dict and set comprehensions are as expressive as list comprehensions β use them whenever you're transforming or filtering.
for k in d: # keys
for v in d.values():
for k, v in d.items(): # canonical loopsorted_by_value = sorted(d.items(), key=lambda kv: kv[1], reverse=True)
collections β The Power Toolsfrom collections import defaultdict, Counter, OrderedDict, ChainMapbuckets = defaultdict(list)
for word in words:
buckets[word[0]].append(word) # no manual key check
freq = Counter(words) # {word: count}
freq.most_common(3) # top 3
defaultdict(factory) β missing keys auto-create a value (perfect for grouping).Counter β multiset / histogram. Has most_common, supports + and -.ChainMap β lookup across multiple dicts (config layering).merged = {a, b} # b overrides a
merged = a | b # 3.9+, same idea
a.update(b) # in-placeA dict stores entries in a contiguous array. When you insert key k:
1. Compute hash(k).
2. Take the hash modulo the table size to pick a slot.
3. If the slot is busy (collision), probe to the next free slot.
Lookup repeats steps 1β3 until it finds the key (or an empty slot, meaning "absent"). Average cost: O(1). Worst case (lots of collisions): O(n), but Python's hash randomisation makes that rare.
When the table fills past ~2/3, Python resizes (doubles) and rehashes everything β amortised O(1) writes.
A key's hash is computed *once* and used to find its slot. If a key's hash could change after insertion, lookups would silently break. Hence:
int, float, str, bool, tuple (of hashables), frozenset, immutable custom classes.list, dict, set β they're mutable.__hash__ and __eq__ consistently β @dataclass(frozen=True) does this automatically.| Op | Dict avg | Set avg |
| Lookup / insert / delete | O(1) | O(1) |
| Iteration | O(n) | O(n) |
| Worst case | O(n) | O(n) |
Lists are O(n) for membership, dicts/sets are O(1). When in doubt, prefer a set/dict for lookups.
defaultdict(list).Counter.list(dict.fromkeys(items)).functools.lru_cache (built on a dict).bidict from PyPI).approved = wanted & allowed.1. Convert a list of (key, value) tuples into a dict with a comprehension, then back to a list of tuples.
2. Use Counter to find the 5 most common words in a sentence.
3. Group a list of users by their country field using defaultdict(list).
4. Deduplicate a list while preserving order using dict.fromkeys; explain why this works.