Shared memory in Python multiprocessing
Note: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow.
Original question: http://stackoverflow.com/questions/14124588/
Shared memory in multiprocessing
Asked by FableBlaze
I have three large lists. The first contains bitarrays (module bitarray 0.8.0) and the other two contain arrays of integers.
l1=[bitarray 1, bitarray 2, ... ,bitarray n]
l2=[array 1, array 2, ... , array n]
l3=[array 1, array 2, ... , array n]
These data structures take quite a bit of RAM (~16GB total).
If I start 12 sub-processes using:
multiprocessing.Process(target=someFunction, args=(l1,l2,l3))
Does this mean that l1, l2 and l3 will be copied for each sub-process or will the sub-processes share these lists? Or to be more direct, will I use 16GB or 192GB of RAM?
someFunction will read some values from these lists and then perform some calculations based on the values read. The results will be returned to the parent process. The lists l1, l2 and l3 will not be modified by someFunction.
Therefore I would assume that the sub-processes do not need, and would not, copy these huge lists but would instead just share them with the parent. Meaning that the program would take 16GB of RAM (regardless of how many sub-processes I start) due to the copy-on-write approach under Linux? Am I correct, or am I missing something that would cause the lists to be copied?
EDIT: I am still confused after reading a bit more on the subject. On the one hand Linux uses copy-on-write, which should mean that no data is copied. On the other hand, accessing an object will change its ref-count (I am still unsure why, and what that means). Even so, will the entire object be copied?
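As a small self-contained illustration of the ref-count point (this does not use the actual lists), sys.getrefcount shows that even a pure read temporarily adds a reference, i.e. the page holding the object's header gets written to:

import sys

big_item = bytearray(10)               # stand-in for one element of l1/l2/l3
container = [big_item]

print(sys.getrefcount(container[0]))   # some count N
alias = container[0]                   # merely "reading" the element adds a reference
print(sys.getrefcount(container[0]))   # now N + 1: the object header was written to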
For example, if I define someFunction as follows:
import random

def someFunction(list1, list2, list3):
    i = random.randint(0, 99999)
    print list1[i], list2[i], list3[i]
Would using this function mean that l1, l2 and l3 will be copied entirely for each sub-process?
Is there a way to check for this?
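One rough way to check, assuming Linux, is to compare each process's resident set size (VmRSS in /proc/<pid>/status) before and after the lists are touched; something along these lines:

import os

def rss_kb(pid=None):
    # resident set size of a process in kB, read from /proc (Linux only)
    pid = pid or os.getpid()
    with open('/proc/%d/status' % pid) as f:
        for line in f:
            if line.startswith('VmRSS:'):
                return int(line.split()[1])

# e.g. print this inside someFunction before and after indexing the lists and
# compare it with the parent's value; a jump of many GB per child would mean
# the lists really were copied
print('RSS: %d kB' % rss_kb())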
EDIT 2: After reading a bit more and monitoring the total memory usage of the system while the sub-processes are running, it seems that entire objects are indeed copied for each sub-process, and it seems to be because of reference counting.
The reference counting for l1, l2 and l3 is actually unneeded in my program. This is because l1, l2 and l3 will be kept in memory (unchanged) until the parent process exits. There is no need to free the memory used by these lists until then. In fact, I know for sure that the reference count will remain above 0 (for these lists and every object in these lists) until the program exits.
So now the question becomes: how can I make sure that the objects will not be copied to each sub-process? Can I perhaps disable reference counting for these lists and each object in these lists?
EDIT 3: Just an additional note. The sub-processes do not need to modify l1, l2 and l3 or any objects in these lists. They only need to be able to reference some of these objects without causing the memory to be copied for each sub-process.
Accepted answer by rob
Generally speaking, there are two ways to share the same data:
- Multithreading
- Shared memory
Python's multithreading is not suitable for CPU-bound tasks (because of the GIL), so the usual solution in that case is to go with multiprocessing. However, with this solution you need to explicitly share the data, using multiprocessing.Value and multiprocessing.Array.
Note that sharing data between processes is usually not the best choice, because of all the synchronization issues; an approach where actors exchange messages is usually seen as a better option. See also the Python documentation:
As mentioned above, when doing concurrent programming it is usually best to avoid using shared state as far as possible. This is particularly true when using multiple processes.
However, if you really do need to use some shared data then multiprocessing provides a couple of ways of doing so.
In your case, you need to wrap l1, l2 and l3 in some way understandable by multiprocessing (e.g. by using a multiprocessing.Array), and then pass them as parameters.
Note also that, since you said you do not need write access, you should pass lock=False while creating the objects, or all access will still be serialized.
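A rough sketch of what that could look like for one of the integer lists (the names and sizes below are made up; the bitarrays would first need to be serialized to a byte buffer and stored in e.g. an Array('b', ...)):

from multiprocessing import Process, Array

def worker(shared_ints):
    # read-only access; no lock is taken because lock=False was used below
    print(shared_ints[0])
    print(shared_ints[99999])

if __name__ == '__main__':
    l2_flat = list(range(100000))                # stand-in for one of the integer arrays
    shared_l2 = Array('i', l2_flat, lock=False)  # typecode 'i' -> C int, no synchronisation wrapper
    workers = [Process(target=worker, args=(shared_l2,)) for _ in range(12)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()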
Answered by SanityIO
If you want to make use of the copy-on-write feature and your data is static (unchanged in the child processes), you should keep Python from touching the memory blocks where your data lies. You can easily do this by using C or C++ structures (the STL, for instance) as containers and providing your own Python wrappers that use pointers to the data memory (or possibly copy it) whenever a Python-level object is created, if one is created at all. All of this can be done with almost Python-level simplicity and syntax using Cython.
# pseudo cython
from libc.stdlib cimport malloc, free
from libc.string cimport memcpy

cdef class FooContainer:
    cdef char * data
    def __cinit__(self, char * foo_value):
        self.data = <char *> malloc(1024 * sizeof(char))
        memcpy(self.data, foo_value, min(1024, len(foo_value)))
    def __dealloc__(self):
        free(self.data)
    def get(self):
        return self.data


# python part
from os import fork
from foo import FooContainer

f = FooContainer("hello world")
pid = fork()
if not pid:
    f.get()   # this call will read the same memory page to which the
              # parent process wrote the 1024 chars of self.data, and
              # cython will automatically create a new python string
              # object from it and return it to the caller
The above pseudo-code is badly written; don't use it as-is. In your case, self.data should be a C or C++ container instead of a raw char buffer.
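For completeness, if the Cython sketch above were saved as foo.pyx (the file name is an assumption), a minimal build script along these lines would produce the foo module used in the python part:

# setup.py -- minimal build script for the foo.pyx sketch above
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("foo.pyx"))

# build in place with:  python setup.py build_ext --inplace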
Answered by CrabbyPete
You can use memcached or Redis and set each one as a key-value pair {'l1'...
Answered by Rboreal_Frippery
Because this is still a very high result on Google and no one else has mentioned it yet, I thought I would mention the new possibility of 'true' shared memory which was introduced in Python 3.8.0: https://docs.python.org/3/library/multiprocessing.shared_memory.html
I have included here a small contrived example (tested on Linux) where NumPy arrays are used, which is likely a very common use case:
# one dimension of the 2d array which is shared
dim = 5000

import numpy as np
from multiprocessing import shared_memory, Process, Lock
from multiprocessing import cpu_count, current_process
import time

lock = Lock()

def add_one(shr_name):
    existing_shm = shared_memory.SharedMemory(name=shr_name)
    np_array = np.ndarray((dim, dim,), dtype=np.int64, buffer=existing_shm.buf)
    lock.acquire()
    np_array[:] = np_array[0] + 1
    lock.release()
    time.sleep(10)  # pause, to see the memory usage in top
    print('added one')
    existing_shm.close()

def create_shared_block():
    a = np.ones(shape=(dim, dim), dtype=np.int64)  # Start with an existing NumPy array
    shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
    # Now create a NumPy array backed by shared memory
    np_array = np.ndarray(a.shape, dtype=np.int64, buffer=shm.buf)
    np_array[:] = a[:]  # Copy the original data into shared memory
    return shm, np_array

if current_process().name == "MainProcess":
    print("creating shared block")
    shr, np_array = create_shared_block()

    processes = []
    for i in range(cpu_count()):
        _process = Process(target=add_one, args=(shr.name,))
        processes.append(_process)
        _process.start()

    for _process in processes:
        _process.join()

    print("Final array")
    print(np_array[:10])
    print(np_array[10:])

    shr.close()
    shr.unlink()
Note that because of the 64-bit ints this code can take about 1 GB of RAM to run, so make sure that you won't freeze your system when using it. ^_^
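A related convenience (also new in Python 3.8, and not used in the example above) is multiprocessing.managers.SharedMemoryManager, which unlinks every block it created when its context exits, so the manual close()/unlink() calls are not needed; a minimal sketch:

from multiprocessing.managers import SharedMemoryManager
import numpy as np

with SharedMemoryManager() as smm:
    shm = smm.SharedMemory(size=100 * 8)                   # room for 100 int64 values
    arr = np.ndarray((100,), dtype=np.int64, buffer=shm.buf)
    arr[:] = 1
    # shm.name can be passed to Process workers exactly as in the example above
# all blocks created through smm are released automatically here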

