Python generator expressions vs list comprehensions
generator, list comprehension, lazy evaluation, memory, yield, itertools
Problem
Need to understand when to use generator expressions vs list comprehensions for memory efficiency.
Solution
Generators are lazy; lists are eager:
# List comprehension: creates entire list in memory
squares = [x**2 for x in range(1_000_000)] # ~8 MB for the list alone; the int objects add more
# Generator expression: computes on demand
squares = (x**2 for x in range(1_000_000)) # ~100 bytes!
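A quick way to see the difference yourself is `sys.getsizeof` (a minimal sketch, assuming CPython; exact byte counts vary by version and platform):

```python
import sys

# The list stores a pointer for every element up front;
# the generator stores only its running state.
squares_list = [x**2 for x in range(1_000_000)]
squares_gen = (x**2 for x in range(1_000_000))

print(sys.getsizeof(squares_list))  # millions of bytes for the list object alone
print(sys.getsizeof(squares_gen))   # a couple hundred bytes, regardless of range size
```

Note that `getsizeof` on the list counts only the pointer array, not the int objects it references, so the true footprint of the list version is even larger.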
# Use generators when:
# 1. Processing large datasets
total = sum(x**2 for x in range(1_000_000)) # No list needed
# 2. Chaining transformations
results = (
    transform(item)
    for item in data
    if is_valid(item)
)
for result in results:  # Processes one at a time
    save(result)
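A runnable sketch of the same pipeline, with hypothetical `transform` and `is_valid` stand-ins (any filter and mapping would do):

```python
data = [3, -1, 4, -1, 5, -9, 2]

def is_valid(item):
    # Hypothetical filter: keep positive values
    return item > 0

def transform(item):
    # Hypothetical mapping
    return item * 10

# Nothing runs until the for loop pulls values through
results = (transform(item) for item in data if is_valid(item))

processed = []
for result in results:  # each item flows through filter -> transform -> append
    processed.append(result)

print(processed)  # [30, 40, 50, 20]
```

Because the pipeline is a single generator, at no point does an intermediate filtered list or transformed list exist in memory.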
# 3. Reading large files
def read_large_csv(path):
    with open(path) as f:
        for line in f:  # File iterator is already lazy
            yield line.strip().split(',')
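To try the reader end to end, here is a sketch using a throwaway temp file (note that a naive `split(',')` breaks on quoted fields; the stdlib `csv` module is sturdier and also accepts the lazy file iterator):

```python
import os
import tempfile

# Same generator as above, repeated so this sketch is self-contained
def read_large_csv(path):
    with open(path) as f:
        for line in f:  # file objects yield one line at a time
            yield line.strip().split(',')

# Throwaway demo file
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write('id,name\n1,alice\n2,bob\n')
    path = tmp.name

rows = read_large_csv(path)  # no I/O has happened yet
header = next(rows)          # reads only the first line
body = list(rows)            # reads the rest
os.remove(path)

print(header)  # ['id', 'name']
print(body)    # [['1', 'alice'], ['2', 'bob']]
```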
# Use lists when:
# 1. Need to iterate multiple times
items = [process(x) for x in data] # Can iterate twice
print(len(items)) # Can get length
print(items[5]) # Can index
# 2. Need to know the size
# 3. Need random access
# 4. Small data where memory doesn't matter
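The multiple-iteration point is worth checking concretely: a generator is exhausted after a single pass, so a second pass silently yields nothing, while a list can be walked any number of times:

```python
gen = (x * 2 for x in range(3))
first = sum(gen)   # consumes the generator
second = sum(gen)  # generator is exhausted; sum of nothing is 0
print(first, second)  # 6 0

lst = [x * 2 for x in range(3)]
print(sum(lst), sum(lst))  # 6 6 — lists can be re-iterated
```

This is a common source of subtle bugs: no error is raised, the second pass just sees an empty stream.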
# Generator functions for complex logic
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
from itertools import islice
first_20 = list(islice(fibonacci(), 20))
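`islice` also accepts start/stop (and step) arguments, so you can take a window from the middle of the infinite stream without materializing the prefix:

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Fibonacci numbers at positions 10..14 of the infinite stream
window = list(islice(fibonacci(), 10, 15))
print(window)  # [55, 89, 144, 233, 377]
```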
# any() and all() short-circuit with generators
has_admin = any(u.role == 'admin' for u in users)  # Stops at first match
Why
Generator expressions use O(1) memory regardless of input size, while list comprehensions use O(n). For large datasets, this is the difference between running and crashing.
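A rough way to see the O(1)-vs-O(n) claim is to measure peak allocations with `tracemalloc` (a sketch; the absolute numbers depend on the CPython version, but the gap is consistently orders of magnitude):

```python
import tracemalloc

def peak_bytes(make_iterable):
    """Return (sum of the iterable, peak traced allocation while summing)."""
    tracemalloc.start()
    total = sum(make_iterable())
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return total, peak

total_list, peak_list = peak_bytes(lambda: [x * x for x in range(200_000)])
total_gen, peak_gen = peak_bytes(lambda: (x * x for x in range(200_000)))

print(total_list == total_gen)   # True — same result either way
print(peak_list > peak_gen)      # True — list peak grows with n, generator peak does not
```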
Context
Python code processing large datasets or streams