Python: When to call .join() on a process?

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/14429703/

Time: 2020-08-18 11:28:48  Source: igfitidea

When to call .join() on a process?

python multiprocessing

Asked by Justin

I am reading various tutorials on the multiprocessing module in Python, and am having trouble understanding why/when to call process.join(). For example, I stumbled across this example:

nums = range(100000)
nprocs = 4

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outdict = {}
    for n in nums:
        outdict[n] = factorize_naive(n)
    out_q.put(outdict)

# Each process will get 'chunksize' nums and a queue to put his out
# dict into
out_q = Queue()
chunksize = int(math.ceil(len(nums) / float(nprocs)))
procs = []

for i in range(nprocs):
    p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)],
                  out_q))
    procs.append(p)
    p.start()

# Collect all results into a single result dict. We know how many dicts
# with results to expect.
resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

# Wait for all worker processes to finish
for p in procs:
    p.join()

print(resultdict)

From what I understand, process.join() will block the calling process until the process whose join method was called has completed execution. I also believe that the child processes started in the above code example complete execution upon completing the target function, that is, after they have pushed their results to the out_q. Lastly, I believe that out_q.get() blocks the calling process until there are results to be pulled. Thus, if you consider the code:

resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

# Wait for all worker processes to finish
for p in procs:
    p.join()

the main process is blocked by the out_q.get() calls until every single worker process has finished pushing its results to the queue. Thus, by the time the main process exits the for loop, each child process should have completed execution, correct?

If that is the case, is there any reason for calling the p.join() methods at this point? Haven't all worker processes already finished? If so, how does that cause the main process to "wait for all worker processes to finish"? I ask mainly because I have seen this in multiple different examples, and I am curious whether I have failed to understand something.

Accepted answer by Bakuriu

Try to run this:

import math
import time
from multiprocessing import Queue
import multiprocessing

def factorize_naive(n):
    factors = []
    for div in range(2, int(n**.5)+1):
        while not n % div:
            factors.append(div)
            n //= div
    if n != 1:
        factors.append(n)
    return factors

nums = range(100000)
nprocs = 4

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outdict = {}
    for n in nums:
        outdict[n] = factorize_naive(n)
    out_q.put(outdict)

# Each process will get 'chunksize' nums and a queue to put his out
# dict into
out_q = Queue()
chunksize = int(math.ceil(len(nums) / float(nprocs)))
procs = []

for i in range(nprocs):
    p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)],
                  out_q))
    procs.append(p)
    p.start()

# Collect all results into a single result dict. We know how many dicts
# with results to expect.
resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

time.sleep(5)

# Wait for all worker processes to finish
for p in procs:
    p.join()

print(resultdict)

time.sleep(15)

Then open the task manager. You should be able to see that the 4 subprocesses stay in a zombie state for a few seconds before being terminated by the OS (due to the join calls).

In more complex situations the child processes could stay in a zombie state forever (like the situation you were asking about in another question), and if you create enough child processes you could fill the process table, causing trouble for the OS (which may kill your main process to avoid failures).
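
The zombie state described above can be observed directly on Linux. The following is a minimal sketch, not part of the original answer: the helper names (proc_state, noop, zombie_demo) are hypothetical, it assumes a Linux-style /proc filesystem (it returns None where /proc is missing), and it prefers the "fork" start method where available so that it also runs without an `if __name__ == "__main__":` guard.

```python
import multiprocessing as mp
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if unreadable."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except OSError:
        # No /proc on this platform, or the process entry is already gone.
        return None
    # Format is "pid (comm) state ..."; comm may contain spaces or ")".
    return data.rsplit(")", 1)[1].split()[0]


def noop():
    """Hypothetical worker that returns immediately."""


def zombie_demo(timeout=5.0):
    # Assumption: use "fork" if the platform offers it, else the default method.
    methods = mp.get_all_start_methods()
    ctx = mp.get_context("fork" if "fork" in methods else None)
    child = ctx.Process(target=noop)
    child.start()
    pid = child.pid

    # Poll until the finished child shows up as a zombie ("Z"), or give up.
    deadline = time.monotonic() + timeout
    state = proc_state(pid)
    while state not in ("Z", None) and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)
    state_before_join = state

    child.join()  # reaps the child, removing its process-table entry
    state_after_join = proc_state(pid)
    return state_before_join, state_after_join
```

On a Linux machine, calling zombie_demo() is expected to report the child as a zombie before join() and as gone afterwards; on other platforms both values are simply None.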

Answer by oefe

At the point just before you call join, all workers have put their results into the queue, but they have not necessarily returned, and their processes may not yet have terminated. They may or may not have done so, depending on timing.

Calling join makes sure that all processes are given the time to properly terminate.
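
To make the timing point concrete, here is a small sketch (not from the original answer) in which the worker deliberately stays alive after pushing its result. The names worker, may_exit and get_then_join are hypothetical, and the Event is only an assumed stand-in for whatever cleanup a real worker does after its put(). As a further assumption, it uses the "fork" start method where available so it runs without a `__main__` guard.

```python
import multiprocessing as mp


def worker(out_q, may_exit):
    out_q.put({"answer": 42})
    # Hypothetical stand-in for work done after the result was queued:
    # the process keeps running until the parent allows it to exit.
    may_exit.wait()


def get_then_join():
    # Assumption: use "fork" if the platform offers it, else the default method.
    methods = mp.get_all_start_methods()
    ctx = mp.get_context("fork" if "fork" in methods else None)
    out_q = ctx.Queue()
    may_exit = ctx.Event()
    p = ctx.Process(target=worker, args=(out_q, may_exit))
    p.start()

    result = out_q.get()             # returns as soon as the result arrives
    alive_after_get = p.is_alive()   # the worker has not exited yet

    may_exit.set()                   # let the worker finish
    p.join()                         # wait until it has really terminated
    return result, alive_after_get, p.is_alive(), p.exitcode
```

Here the result is already available while the worker is still running; only after join() returns is the process guaranteed to have terminated and its exitcode to be set.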

Answer by sarpu

I am not exactly sure of the implementation details, but join also seems to be necessary to reflect that a process has indeed terminated (after calling terminate on it, for example). In the example here, if you don't call join after terminating a process, process.is_alive() returns True, even though the process was terminated with a process.terminate() call.
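
A short sketch of the terminate-then-join pattern follows. It is an illustration rather than the code this answer refers to: terminate_then_join is a hypothetical name, the 60-second sleep is an arbitrary stand-in for a long-running worker, and the "fork" start method is preferred where available (an assumption made so the sketch runs without a `__main__` guard). The value observed right after terminate() depends on timing, so the sketch reports it without relying on it.

```python
import multiprocessing as mp
import time


def terminate_then_join():
    # Assumption: use "fork" if the platform offers it, else the default method.
    methods = mp.get_all_start_methods()
    ctx = mp.get_context("fork" if "fork" in methods else None)
    p = ctx.Process(target=time.sleep, args=(60,))
    p.start()

    p.terminate()
    # May be True or False: the child may not have died yet at this instant.
    alive_right_after_terminate = p.is_alive()

    p.join()  # wait until the child has actually exited
    return alive_right_after_terminate, p.is_alive(), p.exitcode
```

After join() returns, is_alive() is False and exitcode is set, which is the state the answer above says join is needed to reach reliably.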
