Python Multiprocessing Locks

Note: this page is adapted from a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same license and attribute the original authors (not me). Original: http://stackoverflow.com/questions/28267972/

Tags: python, multiprocessing

Asked by Daniel S.

This multiprocessing code works as expected. It creates 4 Python processes, and uses them to print the numbers 0 through 39, with a delay after each print.

import multiprocessing
import time

def job(num):
  print num
  time.sleep(1)

pool = multiprocessing.Pool(4)

lst = range(40)
for i in lst:
  pool.apply_async(job, [i])

pool.close()
pool.join()

However, when I try to use a multiprocessing.Lock to prevent multiple processes from printing to standard out, the program just exits immediately without any output.

import multiprocessing
import time

def job(lock, num):
  lock.acquire()
  print num
  lock.release()
  time.sleep(1)

pool = multiprocessing.Pool(4)
l = multiprocessing.Lock()

lst = range(40)
for i in lst:
  pool.apply_async(job, [l, i])

pool.close()
pool.join()

Why does the introduction of a multiprocessing.Lock make this code not work?

Update: It works when the lock is declared globally (where I did a few non-definitive tests to check that the lock works), as opposed to the code above, which passes the lock as an argument (Python's multiprocessing documentation shows locks being passed as arguments). The code below declares the lock globally instead of passing it as an argument.

import multiprocessing
import time

l = multiprocessing.Lock()

def job(num):
  l.acquire()
  print num
  l.release()
  time.sleep(1)

pool = multiprocessing.Pool(4)

lst = range(40)
for i in lst:
  pool.apply_async(job, [i])

pool.close()
pool.join()

Accepted answer by matsjoyce

If you change pool.apply_async to pool.apply, you get this exception:

Traceback (most recent call last):
  File "p.py", line 15, in <module>
    pool.apply(job, [l, i])
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 244, in apply
    return self.apply_async(func, args, kwds).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
RuntimeError: Lock objects should only be shared between processes through inheritance

pool.apply_async is just hiding it. I hate to say this, but using a global variable is probably the simplest way for your example. Let's just hope the velociraptors don't get you.

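On Python 3, the hidden exception can also be surfaced without switching to pool.apply, by giving apply_async an error_callback. This is a sketch, not part of the original answer, and error_callback does not exist on Python 2:

import multiprocessing
import time

def job(lock, num):
    with lock:
        print(num)
    time.sleep(1)

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)
    lock = multiprocessing.Lock()
    for i in range(40):
        # Pickling the lock fails for every task; error_callback reports
        # the RuntimeError instead of letting it vanish silently.
        pool.apply_async(job, [lock, i], error_callback=lambda e: print(repr(e)))
    pool.close()
    pool.join()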

Answer by Tom Dalton

I think the reason is that the multiprocessing pool uses pickle to transfer objects between the processes. However, a Lock cannot be pickled:

>>> import multiprocessing
>>> import pickle
>>> lock = multiprocessing.Lock()
>>> lp = pickle.dumps(lock)
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    lp = pickle.dumps(lock)
...
RuntimeError: Lock objects should only be shared between processes through inheritance
>>> 

See the "Picklability" and "Better to inherit than pickle/unpickle" sections of https://docs.python.org/2/library/multiprocessing.html#all-platforms

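Following the "better to inherit than pickle/unpickle" advice with a Pool usually means handing the lock to each worker once, at startup, through the initializer and initargs parameters of multiprocessing.Pool. A minimal sketch of that approach (Python 3 syntax, not part of the original answer):

import multiprocessing
import time

lock = None  # set in each worker process by init_worker

def init_worker(l):
    # Runs once per worker at startup; the lock arrives through the
    # worker's creation machinery (inheritance), not the per-task pickle path.
    global lock
    lock = l

def job(num):
    with lock:
        print(num)
    time.sleep(1)

if __name__ == '__main__':
    l = multiprocessing.Lock()
    pool = multiprocessing.Pool(4, initializer=init_worker, initargs=(l,))
    for i in range(40):
        pool.apply_async(job, [i])
    pool.close()
    pool.join()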

Answer by Ankur

Other answers already explain that apply_async fails silently unless an appropriate error_callback argument is provided. I still found the OP's other point valid -- the official docs do indeed show multiprocessing.Lock being passed around as a function argument. In fact, the sub-section titled "Explicitly pass resources to child processes" in the Programming guidelines recommends passing a multiprocessing.Lock object as a function argument instead of a global variable. And I have been writing a lot of code in which I pass a multiprocessing.Lock as an argument to the child process, and it all works as expected.

So, what gives?

I first investigated whether multiprocessing.Lock is pickle-able or not. In Python 3 (MacOS + CPython), trying to pickle a multiprocessing.Lock produces the familiar RuntimeError encountered by others.

>>> pickle.dumps(multiprocessing.Lock())
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-7-66dfe1355652> in <module>
----> 1 pickle.dumps(multiprocessing.Lock())

/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/synchronize.py in __getstate__(self)
     99
    100     def __getstate__(self):
--> 101         context.assert_spawning(self)
    102         sl = self._semlock
    103         if sys.platform == 'win32':

/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/context.py in assert_spawning(obj)
    354         raise RuntimeError(
    355             '%s objects should only be shared between processes'
--> 356             ' through inheritance' % type(obj).__name__
    357             )

RuntimeError: Lock objects should only be shared between processes through inheritance

To me, this confirms that multiprocessing.Lock is indeed not pickle-able.

Aside begins

But, the same lock still needs to be shared across two or more python processes, which will have their own, potentially different address spaces (such as when we use "spawn" or "forkserver" as start methods). multiprocessing must be doing something special to send a Lock across processes. This other StackOverflow post seems to indicate that on Unix systems, multiprocessing.Lock may be implemented via named semaphores that are supported by the OS itself (outside python). Two or more python processes can then link to the same lock, which effectively resides in one location outside both python processes. There may be a shared memory implementation as well.

Aside ends

Can we pass a multiprocessing.Lock object as an argument or not?

After a few more experiments and more reading, it appears that the difference is between multiprocessing.Pool and multiprocessing.Process.

multiprocessing.Process lets you pass multiprocessing.Lock as an argument, but multiprocessing.Pool doesn't. Here is an example that works:

import multiprocessing
import time
from multiprocessing import Process, Lock


def task(n: int, lock):
    with lock:
        print(f'n={n}')
    time.sleep(0.25)


if __name__ == '__main__':
    multiprocessing.set_start_method('forkserver')
    lock = Lock()
    processes = [Process(target=task, args=(i, lock)) for i in range(20)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()

Note that the use of __name__ == '__main__' is essential, as mentioned in the "Safe importing of main module" sub-section of the Programming guidelines.

multiprocessing.Pool seems to use queue.SimpleQueue, which puts each task in a queue, and that is where the pickling happens. Most likely, multiprocessing.Process is not using pickling (or is using a special version of pickling).

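As a closing sketch (my addition, not part of the original answer): if you really do want to pass a lock to Pool tasks as an argument, a manager-based lock works, because multiprocessing.Manager().Lock() returns a proxy object and proxies are picklable. The trade-off is that every acquire and release becomes a round-trip to the manager process.

import multiprocessing
import time

def job(lock, num):
    with lock:
        print(num)
    time.sleep(0.25)

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    lock = manager.Lock()  # a picklable proxy, unlike multiprocessing.Lock()
    pool = multiprocessing.Pool(4)
    for i in range(20):
        pool.apply_async(job, [lock, i])
    pool.close()
    pool.join()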