
Python generator expressions vs list comprehensions

Submitted by: @anonymous
Tags: generator, list comprehension, lazy evaluation, memory, yield, itertools

Problem

Need to understand when to prefer generator expressions over list comprehensions, and what the memory trade-offs are.

Solution

Generator expressions are lazy (values are produced on demand); list comprehensions are eager (the whole list is built up front):

# List comprehension: builds the entire list in memory
squares = [x**2 for x in range(1_000_000)]  # list alone ~8 MB; the int objects push the total to tens of MB

# Generator expression: computes values on demand
squares = (x**2 for x in range(1_000_000))  # ~100-200 bytes, regardless of size
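
# Sketch: measuring the objects themselves with sys.getsizeof (exact
# numbers vary by Python version; note getsizeof counts only the list's
# pointer array, not the int objects it refers to).
import sys
print(sys.getsizeof([x**2 for x in range(1_000_000)]))  # millions of bytes
print(sys.getsizeof(x**2 for x in range(1_000_000)))    # ~100-200 bytes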

# Use generators when:
# 1. Processing large datasets
total = sum(x**2 for x in range(1_000_000))  # No list needed
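
# Syntax note: as the sole argument to a call, a genexp needs no extra
# parentheses; with additional arguments, they become mandatory.
total = sum((x**2 for x in range(1_000_000)), 0)  # extra parens required here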

# 2. Chaining transformations
results = (
    transform(item)
    for item in data
    if is_valid(item)
)
for result in results:  # Processes one at a time
    save(result)
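
# Chained stages never materialize intermediate lists: each item flows
# through every stage before the next one is pulled from the source.
nums = iter(range(10))
doubled = (n * 2 for n in nums)
big = (n for n in doubled if n > 5)
print(next(big))  # 6 -- only items 0..3 were consumed from nums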

# 3. Reading large files
def read_large_csv(path):
    with open(path) as f:
        for line in f:  # File iterator is already lazy
            yield line.strip().split(',')
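
# Usage sketch (hypothetical data.csv): rows stream one at a time, so
# only the current line is held in memory. For quoted fields and other
# CSV edge cases, prefer the stdlib csv module, which is also lazy.
for row in read_large_csv('data.csv'):
    if row and row[0] == 'ERROR':
        print(row)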

# Use lists when:
# 1. Need to iterate multiple times
items = [process(x) for x in data]  # Can iterate twice
print(len(items))  # Can get length
print(items[5])    # Can index

# 2. Need to know the size
# 3. Need random access
# 4. Small data where memory doesn't matter
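
# The flip side: a generator is single-pass. Once consumed it is empty,
# which is why re-iteration requires a list.
g = (x * 2 for x in [1, 2, 3])
print(sum(g))  # 12
print(sum(g))  # 0 -- already exhausted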

# Generator functions for complex logic
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

from itertools import islice
first_20 = list(islice(fibonacci(), 20))
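
# Related sketch: itertools.takewhile stops on a condition rather than
# a fixed count.
from itertools import takewhile
under_100 = list(takewhile(lambda n: n < 100, fibonacci()))  # [0, 1, 1, ..., 89]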

# any() and all() short-circuit with generators
has_admin = any(u.role == 'admin' for u in users)  # Stops at first match
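
# Sketch showing the short-circuit: evaluation stops at the first truthy
# result (hypothetical is_admin check).
def is_admin(user):
    print(f"checking {user}")
    return user == 'root'

print(any(is_admin(u) for u in ['alice', 'root', 'bob']))
# Prints "checking alice", "checking root", then True; 'bob' is never checked.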

Why

A generator expression uses O(1) additional memory regardless of input size, since only the current item is alive at a time, while a list comprehension uses O(n) to hold every result at once. For large datasets, that is the difference between running and crashing.
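
As a rough way to see the difference, here is a minimal sketch using the standard-library tracemalloc module (reset_peak requires Python 3.9+, and exact numbers vary by platform):

import tracemalloc

tracemalloc.start()
sum(x**2 for x in range(1_000_000))
_, peak_gen = tracemalloc.get_traced_memory()
tracemalloc.reset_peak()
sum([x**2 for x in range(1_000_000)])
_, peak_list = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(peak_gen, peak_list)  # peak_list is larger by several orders of magnitude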

Context

Python code processing large datasets or streams
