Common Pitfalls (pickling, __main__)
1) Missing main guard
Always use:
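main_guard.py
if __name__ == "__main__":
    pass  # create and start processes here
Without this guard, platforms that start workers with the spawn method (the default on Windows and macOS) re-import the main module in every child process, so unguarded process-creating code runs again and can fail or keep spawning children.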
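As a slightly fuller illustration of the guard, here is a minimal sketch. The file name and the functions square and report are made up for this example; only the guard itself comes from the snippet above.

```python
# main_guard_demo.py (hypothetical file name)
import multiprocessing as mp


def square(n):
    """Top-level function, so a child process can import it by name."""
    return n * n


def report(n):
    """Runs in the child process and prints one line."""
    print(f"{n} squared is {square(n)}")


if __name__ == "__main__":
    # Only the process that was launched as a script runs this block.
    # Under the spawn start method, a child re-imports this module under
    # a different __name__ and therefore skips it.
    child = mp.Process(target=report, args=(4,))
    child.start()
    child.join()
```

Keeping function definitions at module level and everything that starts processes under the guard is the pattern to copy.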
2) Pickling errors
Multiprocessing has to serialize (pickle) the functions and data it sends to worker processes.
Avoid:
- lambdas
- nested functions
- open file handles
- database connections
Prefer:
- top-level functions
- simple data (numbers/strings/lists/dicts)
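One way to check the lists above before handing anything to a worker is to try pickling it yourself. The sketch below uses the standard pickle module directly. The helper can_pickle and the sample functions are hypothetical names for this example, and multiprocessing's own pickler may differ in small details.

```python
import pickle


def square(n):
    """Top-level function: pickled by reference to its module and name."""
    return n * n


def make_nested():
    """Returns a nested function, which pickle cannot look up by name."""
    def nested(n):
        return n + 1
    return nested


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print("top-level function:", can_pickle(square))
    print("lambda:", can_pickle(lambda n: n * n))
    print("nested function:", can_pickle(make_nested()))
    print("plain data:", can_pickle({"ids": [1, 2, 3], "name": "job"}))
```

If a value fails this check, pass the worker what it needs to rebuild the value (a file path, a connection string) rather than the object itself.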
3) Oversubscribing CPUs
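Creating more processes than you have CPU cores can slow CPU-bound work down: the extra processes compete for the same cores, and each one adds memory and startup overhead.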
Guidance:
- start with os.cpu_count()
- benchmark for your workloads
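A starting point along those lines might look like the sketch below. The helper pick_worker_count and the cube task are invented for this example, and the right pool size for a real workload still has to come from benchmarking.

```python
import os
from multiprocessing import Pool


def pick_worker_count(task_count):
    """Use at most one worker per core, and never more than there are tasks."""
    cores = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, min(cores, task_count))


def cube(n):
    """Stand-in for a CPU-bound task."""
    return n ** 3


if __name__ == "__main__":
    numbers = list(range(8))
    workers = pick_worker_count(len(numbers))
    with Pool(processes=workers) as pool:
        print(pool.map(cube, numbers))
```

Try a few values around this starting point and time them; I/O-heavy tasks may behave differently from CPU-bound ones.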
4) Returning huge data
Sending massive arrays through a Queue can be slow, because each object is pickled, copied between processes, and unpickled on the other side.
Options:
- write output to files
- batch results
- aggregate inside workers
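To illustrate batching and aggregating inside workers, here is a rough sketch in which each worker receives a chunk of input and sends back two numbers instead of a full list of results. The names make_chunks and summarize_chunk, the chunk size, and the pool size are all assumptions made for this example.

```python
from multiprocessing import Pool


def make_chunks(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def summarize_chunk(chunk):
    """Aggregate inside the worker: return (count, sum of squares)."""
    squares = [n * n for n in chunk]
    return len(squares), sum(squares)


if __name__ == "__main__":
    data = list(range(1_000))
    chunks = make_chunks(data, 250)
    with Pool(processes=2) as pool:
        partials = pool.map(summarize_chunk, chunks)
    # Combine the small per-chunk summaries in the parent process.
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    print(count, total)
```

Only the small tuples cross the process boundary on the way back, so the cost of pickling results stays low even when the intermediate lists are large.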
🧪 Try It Yourself
Exercise 1: Start a Process
Exercise 2: Process Pool map()
Exercise 3: Multiprocessing Queue
