Unexpected Memory Footprint Differences When Spawning Python Multiprocessing Pool
Solution 1:
Your question touches several loosely coupled mechanisms. It also looks like an easy target for extra karma points, but you can feel something's wrong, and three hours later it's a completely different question. So in return for all the fun I had, here is some information you may find useful.
TL;DR: Measure used memory, not free memory. That gives me consistent results ((almost) the same numbers) regardless of pool/matrix creation order and of large object size.
    def memory():
        import resource
        # resource.RUSAGE_BOTH is not available on all platforms
        self = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        children = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
        return self + children
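A quick sketch of how this measurement might be used around an allocation (the helper is repeated here so the snippet is self-contained; note that `ru_maxrss` is in kilobytes on Linux but bytes on OSX):

```python
import resource

def memory():
    # peak resident set size of this process plus all reaped children
    self = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    children = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return self + children

before = memory()
data = bytearray(50 * 1024 * 1024)   # a ~50 MiB buffer
for i in range(0, len(data), 4096):  # touch every page so it becomes resident
    data[i] = 1
after = memory()
print(before, after)  # `after` reflects the ~50 MiB of touched pages
```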
Before answering questions you didn't ask, but those closely related, here's some background.
Background
The most widespread implementation, CPython (both the 2 and 3 versions), uses reference-counting memory management [1]. Whenever you use a Python object as a value, its reference counter is incremented by one, and decremented back when the reference is lost. The counter is an integer defined in the C struct holding each Python object's data [2]. Takeaway: the reference counter changes all the time, and it is stored along with the rest of the object's data.
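The counting is observable from Python with `sys.getrefcount` (a quick sketch; note the count it reports includes one temporary reference held by the call itself):

```python
import sys

x = object()
base = sys.getrefcount(x)   # includes the call's own temporary reference
alias = x                   # a new name bumps the counter...
assert sys.getrefcount(x) == base + 1
container = [x, x]          # ...and so does every slot in a container
assert sys.getrefcount(x) == base + 3
del alias, container        # dropping references decrements it again
assert sys.getrefcount(x) == base
```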
Most "Unix inspired" OSes (the BSD family, Linux, OSX, etc.) sport copy-on-write [3] memory access semantics. After fork(), the two processes have distinct memory page tables pointing to the same physical pages. But the OS has marked those pages write-protected, so when you do any memory write, the CPU raises a memory access exception, which the OS handles by copying the original page to a new place. The process walks and quacks like it has isolated memory, but hey, let's save some time (on copying) and RAM while parts of memory are equivalent. Takeaway: fork (or mp.Pool) creates new processes, but they (almost) don't use any extra memory just yet.
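This is easy to see with a bare fork() (a minimal sketch for POSIX systems; the child reads a large parent object without any copying or pickling):

```python
import os

big = bytes(10 * 1024 * 1024)  # one 10 MiB object, created before the fork

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child: `big`'s pages are shared copy-on-write with the parent.
    # Reading is free; writing anywhere would trigger the per-page copy.
    os.write(w, str(len(big)).encode())
    os._exit(0)

os.close(w)
seen = int(os.read(r, 64).decode())
os.waitpid(pid, 0)
assert seen == len(big)  # the child saw the whole object "for free"
```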
CPython stores "small" objects in large pools (arenas) [4]. In the common scenario where you create and destroy a large number of small objects, for example temporary variables inside a function, you don't want to call OS memory management too often. Other programming languages (most compiled ones, at least) use the stack for this purpose.
Related questions
- Different memory usage right after mp.Pool() without any work done by the pool: multiprocessing.Pool.__init__ creates N (for the number of CPUs detected) worker processes. Copy-on-write semantics begin at this point.
- "the claim, on *nix systems, is that a pool worker subprocess copies on write from all the globals in the parent process": multiprocessing copies the globals of its "context", not the globals from your module, and it does so unconditionally, on any OS. [5]
- Different memory usage of numpy.ones and a Python list: matrix = [[1,1,...],[1,2,...],...] is a Python list of Python lists of Python integers. Lots of Python objects = lots of PyObject_HEAD = lots of ref-counters. Accessing all of them in a forked environment would touch all the ref-counters, and therefore would copy their memory pages. matrix = numpy.ones((50000, 5000)) is a single Python object of type numpy.array. That's it: one Python object, one ref-counter. The rest is pure low-level numbers stored in memory next to each other, no ref-counters involved. For the sake of simplicity, you could use data = '.'*size [5] - that also creates a single object in memory.
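The per-element overhead is easy to put a number on with `sys.getsizeof` (a rough sketch; exact sizes are CPython- and platform-specific):

```python
import sys

# Every Python float carries a PyObject header (type pointer + ref-counter)
# on top of the 8-byte double it actually stores.
one = 1234.5
obj_size = sys.getsizeof(one)   # ~24 bytes on 64-bit CPython
overhead = obj_size - 8         # header bytes paid per element
# A 50000x5000 list-of-lists pays this for every single element (plus the
# list's own pointer slots); numpy.ones((50000, 5000)) stores just the raw
# 8-byte doubles back to back.
print(obj_size, overhead)
```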
Sources
1. https://docs.python.org/2/c-api/refcounting.html
2. https://docs.python.org/2/c-api/structures.html#c.PyObject_HEAD
3. http://minnie.tuhs.org/CompArch/Lectures/week09.html#tth_sEc2.8
4. http://www.evanjones.ca/memoryallocator/
5. https://github.com/python/cpython/search?utf8=%E2%9C%93&q=globals+path%3ALib%2Fmultiprocessing%2F&type=Code
- Getting it all said together: https://gist.github.com/temoto/af663106a3da414359fa