Last 30 Days
No notifications
Python provides built-in functions to read, write, and manage files. The open() function is your gateway to file I/O, and the with statement ensures files are properly closed.
file = open("data.txt", "r") # Open for reading
content = file.read()
file.close() # Must close manually!| Mode | Description | Creates? | Truncates? |
"r" | Read (default) | No | No |
"w" | Write | Yes | Yes |
"a" | Append | Yes | No |
"x" | Exclusive create | Yes (error if exists) | No |
"rb" | Read binary | No | No |
"w+" | Read + Write | Yes | Yes |
with Statement (Context Manager)Always use with — it automatically closes the file, even if an exception occurs:
with open("data.txt", "r") as f:
content = f.read()
# File is auto-closed here| Method | Returns |
f.read() | Entire file as a string |
f.read(n) | Next n characters |
f.readline() | Next line (with \n) |
f.readlines() | List of all lines |
for line in f | Iterate line-by-line (memory efficient) |
with open("output.txt", "w") as f:
f.write("Line 1\n")
f.writelines(["Line 2\n", "Line 3\n"])import csv
with open("data.csv") as f:
reader = csv.DictReader(f)
for row in reader:
print(row["name"], row["score"])import json
with open("config.json") as f:
data = json.load(f) # Parse JSON → dictwith open("output.json", "w") as f:
json.dump(data, f, indent=2) # Dict → JSON file
from pathlib import Path
p = Path("data") / "file.txt"
text = p.read_text()
p.write_text("new content")> Tip: Always use with for file operations. For large files, iterate line-by-line instead of reading the whole file into memory.
Real programs read and write data: config files, logs, CSVs, JSON APIs, images, models. Python's file I/O is short and safe — once you internalise with open(...) and pick the right mode, the rest is just choosing the format.
A file is a stream of bytes the OS lets you read/write. Python wraps that with a friendly object: open it, read or write, close it. The with block guarantees the close call — even on errors.
with open("notes.txt", "r", encoding="utf-8") as f:
content = f.read()with open("out.txt", "w", encoding="utf-8") as f:
f.write("hello\n")
Always:
1. Use with (auto-closes).
2. Specify encoding="utf-8" for text — platform defaults bite you on Windows.
3. Pick the right mode.
| Mode | Means |
"r" | read text (default) |
"w" | write text, truncates existing file |
"a" | append text, creates if missing |
"x" | exclusive create, fails if exists |
"rb" / "wb" / "ab" | binary versions (raw bytes) |
"r+" | read and write (rare; usually pick one) |
f.read() # whole file as one string
f.read(1024) # first 1024 chars
f.readline() # one line (incl. \n)
f.readlines() # list of all lines
for line in f: # streams line-by-line, low memory
process(line.rstrip())Prefer the for line in f form for big files — it doesn't load everything at once.
with open("log.txt", "a", encoding="utf-8") as f:
f.write("event 1\n")
f.writelines(["event 2\n", "event 3\n"]) # no auto-newlinewrite/writelines don't add newlines — you handle them.
1. Forgetting with. open without with leaks file handles when an exception happens.
2. Using "w" when you meant "a". "w" truncates the file the moment you open it. Use "a" to append.
3. No encoding. On Windows, default is often cp1252; you'll get UnicodeDecodeError on non-ASCII text. Always pass encoding="utf-8".
4. Reading huge files with .read(). Loads the whole thing into RAM. Iterate line-by-line.
5. Mixing text and binary. Don't open an image with "r" — use "rb" for any non-text.
6. Hard-coded paths. Use pathlib.Path and join with / so your code works on Windows, mac, and Linux.
pathlib — Modern Pathsfrom pathlib import Pathbase = Path("data")
file = base / "users" / "alice.json"
file.parent.mkdir(parents=True, exist_ok=True)
file.write_text('{"name": "alice"}', encoding="utf-8")
text = file.read_text(encoding="utf-8")
for p in base.rglob("*.json"):
print(p, p.stat().st_size)
Path gives you a friendly, OS-aware API: joining (/), exists(), mkdir(parents=True), rglob, stat, with_suffix. Prefer it over os.path in new code.
import jsondata = json.loads('{"name": "Alice", "age": 30}') # str → dict
text = json.dumps(data, indent=2, ensure_ascii=False) # dict → str
with open("users.json", "r", encoding="utf-8") as f:
users = json.load(f)
with open("users.json", "w", encoding="utf-8") as f:
json.dump(users, f, indent=2, ensure_ascii=False)
Mnemonic: s = string (loads/dumps), no s = file (load/dump).
import csvwith open("users.csv", newline="", encoding="utf-8") as f:
for row in csv.DictReader(f):
print(row["name"], row["age"])
with open("out.csv", "w", newline="", encoding="utf-8") as f:
w = csv.DictWriter(f, fieldnames=["name", "age"])
w.writeheader()
w.writerows([{"name": "Alice", "age": 30}])
newline="" is the standard incantation — avoids double-newlines on Windows. For anything beyond trivial CSVs, reach for pandas (pd.read_csv).
with open("image.png", "rb") as f:
data = f.read() # byteswith open("copy.png", "wb") as f:
f.write(data)
Binary mode never decodes — you get raw bytes. Use it for images, PDFs, executables, anything non-text.
The with statement isn't magic — it just calls __enter__ and __exit__ on whatever follows. You can build your own:
from contextlib import contextmanager@contextmanager
def chdir(path):
import os
old = os.getcwd()
os.chdir(path)
try:
yield
finally:
os.chdir(old)
with chdir("/tmp"):
...
Pattern: acquire → yield → release. Used for files, locks, transactions, temporary directories, mocks in tests.
Naive write of a config file: open config.json, start writing, crash mid-way — file is corrupted. Atomic write:
import os, tempfile, jsondef atomic_write_json(path, data):
dir_ = os.path.dirname(path) or "."
fd, tmp = tempfile.mkstemp(dir=dir_)
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
json.dump(data, f)
os.replace(tmp, path) # atomic rename
finally:
if os.path.exists(tmp):
os.remove(tmp)
os.replace is atomic on the same filesystem — the file either has the old contents or the new, never half-written.
utf-8 unless someone forces another codec on you.utf-8-sig (UTF-8 with BOM). Read with encoding="utf-8-sig" to drop the BOM cleanly.chardet / charset-normalizer can guess.for line in f:).while chunk := f.read(64 * 1024):).gzip.open, zipfile, tarfile open them directly without manual extract.1. Read notes.txt line-by-line and print only lines starting with #.
2. Convert a list of dicts to JSON on disk, then read it back and assert equality.
3. Walk a directory recursively with Path.rglob("*.py") and print each file's line count.
4. Implement an atomic-write helper using tempfile.mkstemp + os.replace; intentionally crash before os.replace and confirm the original file is unchanged.