Gotcha: Python multiprocessing and global variables
Problem
Changes that a child process makes to a global variable are not visible in the parent process, because each process has its own memory space.
Solution
Use shared state mechanisms instead of globals:
# BAD - global variable is not shared:
import multiprocessing as mp

counter = 0

def worker():
    global counter
    counter += 1  # Modifies the child's copy, not the parent's!

if __name__ == "__main__":
    processes = [mp.Process(target=worker) for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(counter)  # Still 0!

# GOOD - use a shared Value:
from multiprocessing import Value, Lock

counter = Value('i', 0)  # 'i' is the typecode for a signed int
lock = Lock()

def worker(counter, lock):
    with lock:  # += on counter.value is a read-modify-write, not atomic
        counter.value += 1

# GOOD - use a Manager for complex types:
from multiprocessing import Manager

with Manager() as manager:
    shared_list = manager.list()
    shared_dict = manager.dict()
    # Pass shared_list / shared_dict to workers as arguments

# GOOD - use a Queue for producer/consumer:
from multiprocessing import Queue

queue = Queue()

def producer(q):
    q.put('data')

def consumer(q):
    item = q.get()  # Blocks until an item is available

# BEST for most cases - use a Pool with return values:
# (process_item and items are placeholders for your own function and data)
with mp.Pool(4) as pool:
    results = pool.map(process_item, items)
    # results holds the return values from all workers, in input order
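The snippets above define workers but leave out the launch code, so here is a minimal end-to-end sketch of two of the approaches: a shared Value guarded by a Lock, and a Pool that collects return values. The function names (increment, square, count_with_value, squares_with_pool) and the start_method parameter are made up for this illustration and are not part of the multiprocessing API.

```python
import multiprocessing as mp


def increment(counter, lock):
    # Hold the lock for the whole read-modify-write on the shared int.
    with lock:
        counter.value += 1


def square(x):
    # Hypothetical work function; a Pool needs it defined at module level.
    return x * x


def count_with_value(n_workers=4, start_method=None):
    # start_method=None uses the platform default ("fork", "spawn", ...).
    ctx = mp.get_context(start_method)
    counter = ctx.Value("i", 0)  # shared signed int, initially 0
    lock = ctx.Lock()
    procs = [
        ctx.Process(target=increment, args=(counter, lock))
        for _ in range(n_workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


def squares_with_pool(items, n_workers=4, start_method=None):
    ctx = mp.get_context(start_method)
    with ctx.Pool(n_workers) as pool:
        # map() returns results in the same order as the inputs.
        return pool.map(square, items)
```

Call both functions from under an `if __name__ == "__main__":` guard, since the spawn start method re-imports the main module in every child. The Pool version needs no shared state at all, which is why it tends to be the simpler choice when workers only have to hand back results.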
Why
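Each process has its own memory space. With the fork start method, the child starts with a copy of the parent's globals as they were at the moment of the fork; with spawn, the child re-imports the module and re-creates them from scratch. Either way, writes made in the child never reach the parent. Use explicit shared state or return values.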
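To make the separate memory spaces visible, the sketch below has each child increment the global and report what it sees back through a Queue, then compares that with the parent's value. The names worker, run, and the start_method parameter are assumptions made for this example only.

```python
import multiprocessing as mp

counter = 0  # module-level global; every process has its own independent copy


def worker(q):
    global counter
    counter += 1    # changes only this child's copy
    q.put(counter)  # report the child's view back to the parent


def run(n_workers=4, start_method=None):
    # start_method=None uses the platform default start method.
    ctx = mp.get_context(start_method)
    q = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(q,)) for _ in range(n_workers)]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    child_views = [q.get() for _ in procs]
    for p in procs:
        p.join()
    # Second element is the parent's own, untouched global.
    return child_views, counter
```

When run is called from under an `if __name__ == "__main__":` guard, every child should report 1 while the parent's counter stays at 0: each child started from its own 0, and none of the increments crossed the process boundary.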