Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/17377426/

Shared variable in python's multiprocessing

Tags: python, multiprocessing, python-multiprocessing

Asked by user2435611

First, what is the difference between Value and Manager().Value?

Second, is it possible to share an integer variable without using Value? Below is my sample code. What I want is a dict whose values are plain integers, not Value objects. What I did was convert them all after the processes finished. Is there an easier way?

from multiprocessing import Process, Manager

def f(n):
    n.value += 1

if __name__ == '__main__':
    d = {}
    p = []

    for i in range(5):
        d[i] = Manager().Value('i',0)
        p.append(Process(target=f, args=(d[i],)))
        p[i].start()

    for q in p:
        q.join()

    # replace each Value proxy with the plain integer it holds
    for i in d:
        d[i] = d[i].value

    print(d)

Accepted answer by ChrisP

When you use Value you get a ctypes object in shared memory that by default is synchronized using RLock. When you use Manager you get a SyncManager object that controls a server process which allows object values to be manipulated by other processes. You can create multiple proxies using the same manager; there is no need to create a new manager in your loop:

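from multiprocessing import Manager

if __name__ == '__main__':
    manager = Manager()  # one manager, created once outside the loop
    for i in range(5):
        new_value = manager.Value('i', 0)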
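
Building on the single-manager idea, here is a minimal sketch of one way to end up with a dict of plain integers, which is what the question asks for. It assumes a shared dict from manager.dict() is acceptable in place of one Value per key; the helper name bump_key is made up for this illustration.

```python
from multiprocessing import Manager, Process


def bump_key(results, key):
    # Read-modify-write on a single key. This is not atomic, so it is only
    # safe here because every process works on its own key.
    results[key] = results.get(key, 0) + 1


if __name__ == '__main__':
    manager = Manager()       # one server process for all shared objects
    results = manager.dict()  # dict proxy shared by every worker

    procs = [Process(target=bump_key, args=(results, i)) for i in range(5)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    # Copy the proxy into an ordinary dict whose values are plain ints.
    print(dict(results))
```

The copy at the end is still needed, but it is a single call rather than a loop over Value objects.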

The Manager can be shared across computers, while Value is limited to one computer. Value will be faster (run the code below to see), so I think you should use that unless you need to support arbitrary objects or access them over a network.

import time
from multiprocessing import Process, Manager, Value

def foo(data, name=''):
    print(type(data), data.value, name)
    data.value += 1

if __name__ == "__main__":
    manager = Manager()
    x = manager.Value('i', 0)  # proxy to a value held by the manager's server process
    y = Value('i', 0)          # ctypes int in shared memory

    for i in range(5):
        Process(target=foo, args=(x, 'x')).start()
        Process(target=foo, args=(y, 'y')).start()

    print('Before waiting: ')
    print('x = {0}'.format(x.value))
    print('y = {0}'.format(y.value))

    time.sleep(5.0)  # give the child processes time to finish
    print('After waiting: ')
    print('x = {0}'.format(x.value))
    print('y = {0}'.format(y.value))
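
To put rough numbers on the speed difference, a timing sketch along these lines may help. The helper time_increments and the repetition count are assumptions made for this illustration, and the absolute times will vary from machine to machine.

```python
import time
from multiprocessing import Manager, Value


def time_increments(shared, n):
    """Return the seconds taken to add 1 to shared.value n times."""
    start = time.perf_counter()
    for _ in range(n):
        shared.value += 1
    return time.perf_counter() - start


if __name__ == '__main__':
    n = 10000  # arbitrary repetition count chosen for this sketch
    manager = Manager()
    # Every access to a manager.Value is a round trip to the server process.
    print('manager.Value:', time_increments(manager.Value('i', 0), n))
    # A Value lives in shared memory and is read and written directly.
    print('Value:        ', time_increments(Value('i', 0), n))
```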

To summarize:

  1. Use Manager to create multiple shared objects, including dicts and lists. Use Manager to share data across computers on a network.
  2. Use Value or Array when it is not necessary to share information across a network and the types in ctypes are sufficient for your needs.
  3. Value is faster than Manager.
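
As an illustration of the second point, the counters from the question could be kept in a single shared Array and turned into a plain dict at the end. This is only a sketch under that assumption; the helper name bump_slot is hypothetical.

```python
from multiprocessing import Array, Process


def bump_slot(counts, index):
    # Each worker writes only to its own slot of the shared array.
    counts[index] += 1


if __name__ == '__main__':
    counts = Array('i', 5)  # five C ints in shared memory, all starting at 0

    procs = [Process(target=bump_slot, args=(counts, i)) for i in range(5)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    # counts[:] copies the shared array into a list of plain ints.
    result = dict(enumerate(counts[:]))
    print(result)  # {0: 1, 1: 1, 2: 1, 3: 1, 4: 1}
```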

Warning

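By the way, sharing data across processes/threads should be avoided if possible. The code above will probably run as expected, but increase the time it takes to execute foo and things will get weird: data.value += 1 is a separate read and write, so increments made by different processes at the same time can be lost. Compare the above with: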

def foo(data, name=''):
    print(type(data), data.value, name)
    # many unsynchronized read-modify-write steps: updates can be lost
    for j in range(1000):
        data.value += 1

You'll need a Lock to make this work correctly.
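
One way this might look for a multiprocessing.Value is to use the lock the Value already carries, via get_lock(). This is a sketch rather than the only option; the helper name add_many is made up here, and a manager.Value would need a separately created lock (for example manager.Lock()) passed in alongside it.

```python
from multiprocessing import Process, Value


def add_many(counter, n):
    for _ in range(n):
        # Hold the Value's own lock across the whole read-modify-write,
        # so no other process can update the counter in between.
        with counter.get_lock():
            counter.value += 1


if __name__ == '__main__':
    counter = Value('i', 0)

    procs = [Process(target=add_many, args=(counter, 1000)) for _ in range(5)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    print(counter.value)  # 5000
```

The default lock is an RLock, which is why reading and writing counter.value inside the with block does not deadlock.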

I am not especially knowledgeable about all of this, so maybe someone else will come along and offer more insight. I figured I would contribute an answer since the question was not getting attention. Hope that helps a little.
