Concurrent.futures vs Multiprocessing in Python 3
Disclaimer: This page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me) on StackOverFlow.
Original question: http://stackoverflow.com/questions/20776189/
Asked by GIS-Jonathan
Python 3.2 introduced Concurrent Futures, which appear to be some advanced combination of the older threading and multiprocessing modules.
What are the advantages and disadvantages of using this for CPU bound tasks over the older multiprocessing module?
This article suggests they're much easier to work with - is that the case?
Accepted answer by Tim Peters
I wouldn't call concurrent.futures more "advanced" - it's a simpler interface that works very much the same regardless of whether you use multiple threads or multiple processes as the underlying parallelization gimmick.
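To make that concrete, here is a small sketch (not part of the original answer) of what "works very much the same" means: the calling code below is identical for threads and processes, and only the executor class changes. The square function and the inputs are made up for illustration.

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def square(x):
    # Trivial stand-in for real work.
    return x * x

def run_with(executor_cls, data):
    # The calling code is identical whichever executor class is passed in.
    with executor_cls(max_workers=4) as executor:
        return list(executor.map(square, data))

if __name__ == '__main__':
    data = range(10)
    print(run_with(ThreadPoolExecutor, data))   # threads
    print(run_with(ProcessPoolExecutor, data))  # processes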
So, like virtually all instances of "simpler interface", much the same tradeoffs are involved: it has a shallower learning curve, in large part just because there's so much less available to be learned; but, because it offers fewer options, it may eventually frustrate you in ways the richer interfaces won't.
So far as CPU-bound tasks go, that's waaaay too under-specified to say much meaningful. For CPU-bound tasks under CPython, you need multiple processes rather than multiple threads to have any chance of getting a speedup. But how much (if any) of a speedup you get depends on the details of your hardware, your OS, and especially on how much inter-process communication your specific tasks require. Under the covers, all inter-process parallelization gimmicks rely on the same OS primitives - the high-level API you use to get at those isn't a primary factor in bottom-line speed.
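As a rough illustration of the inter-process communication point (my addition, not from the original answer): when each task is tiny, batching work with the chunksize argument to Executor.map - supported for process pools since Python 3.5 - can matter more than which high-level API you picked, because it reduces how many pickled messages cross the process boundary.

from concurrent.futures import ProcessPoolExecutor

def cheap_task(n):
    # Deliberately tiny amount of work per item.
    return n * n

if __name__ == '__main__':
    nums = range(100000)
    with ProcessPoolExecutor() as executor:
        # chunksize=1000 ships items to workers in batches of 1000
        # instead of one round trip per item.
        results = list(executor.map(cheap_task, nums, chunksize=1000))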
Edit: example
Here's the final code shown in the article you referenced, but I'm adding an import statement needed to make it work:
from concurrent.futures import ProcessPoolExecutor

def pool_factorizer_map(nums, nprocs):
    # Let the executor divide the work among processes by using 'map'.
    with ProcessPoolExecutor(max_workers=nprocs) as executor:
        return {num: factors for num, factors in
                zip(nums,
                    executor.map(factorize_naive, nums))}
Here's exactly the same thing using multiprocessing instead:
import multiprocessing as mp

def mp_factorizer_map(nums, nprocs):
    with mp.Pool(nprocs) as pool:
        return {num: factors for num, factors in
                zip(nums,
                    pool.map(factorize_naive, nums))}
Note that the ability to use multiprocessing.Pool objects as context managers was added in Python 3.3.
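For completeness, a sketch of the older spelling (my addition, reusing the article's factorize_naive just like the versions above): before Python 3.3 you would manage the pool's lifetime yourself, typically with try/finally.

import multiprocessing as mp

def mp_factorizer_map_py32(nums, nprocs):
    # Equivalent for Python < 3.3, where Pool is not yet a context manager.
    pool = mp.Pool(nprocs)
    try:
        return {num: factors for num, factors in
                zip(nums,
                    pool.map(factorize_naive, nums))}
    finally:
        pool.close()
        pool.join()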
Which one is easier to work with? LOL ;-) They're essentially identical.
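If you want to try them side by side, here is a usage sketch (assuming both functions above live in the same module; the trivial factorize_naive below is only a stand-in for the article's version):

def factorize_naive(n):
    # Simplified stand-in for the article's factorize_naive.
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

if __name__ == '__main__':
    nums = range(2, 50)
    # Both spellings produce exactly the same result.
    assert pool_factorizer_map(nums, 4) == mp_factorizer_map(nums, 4)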
One difference is that Pool supports so many different ways of doing things that you may not realize how easy it can be until you've climbed quite a way up the learning curve.
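As one illustration of that breadth (my sketch, not part of the original answer): Pool also offers imap_unordered for streaming results as they complete and apply_async for submitting a single task and collecting its result later, neither of which you need to know about to use executor.map.

import multiprocessing as mp

def work(n):
    return n, n * n

if __name__ == '__main__':
    with mp.Pool(4) as pool:
        # Results arrive in completion order, not submission order.
        for n, sq in pool.imap_unordered(work, range(20)):
            print(n, sq)

        # Submit one task and pick the result up later.
        promise = pool.apply_async(work, (99,))
        print(promise.get(timeout=5))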
Again, all those different ways are both a strength and a weakness. They're a strength because the flexibility may be required in some situations. They're a weakness because of "preferably only one obvious way to do it". A project sticking exclusively (if possible) to concurrent.futureswill probably be easier to maintain over the long run, due to the lack of gratuitous novelty in how its minimalistic API can be used.

