Python multiprocessing's Pool process limit

Disclaimer: this page is a translated copy of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/20039659/

Date: 2020-08-18 19:21:16 · Source: igfitidea

Python multiprocessing's Pool process limit

Tags: python, multiprocessing, cpu-cores

Asked by rottentomato56

In using the Pool object from the multiprocessing module, is the number of processes limited by the number of CPU cores? E.g. if I have 4 cores, even if I create a Pool with 8 processes, only 4 will be running at one time?


Answered by Back2Basics

That is correct. If you have 4 cores then 4 processes can be running at once. Remember that the system has work of its own to keep doing, so it is often sensible to set the process count to number_of_cores - 1. This is a preference, not mandatory. Each process you create carries overhead, so you are actually using more memory to do this; but if RAM isn't a problem, go for it. If you are running CUDA or some other GPU-based library then you have a different paradigm, but that's for another question.

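As a concrete sketch of the cores-minus-one suggestion above (the `cube` task and the `run_pool` helper are illustrative names, not from the original answer):

```python
import os
from multiprocessing import Pool

def cube(n):
    # A small CPU-bound stand-in task.
    return n ** 3

def run_pool():
    # Leave one core free for the OS and other programs, as suggested above;
    # max(1, ...) guards against single-core machines.
    workers = max(1, (os.cpu_count() or 1) - 1)
    with Pool(processes=workers) as pool:
        return pool.map(cube, range(8))

if __name__ == "__main__":
    print(run_pool())  # [0, 1, 8, 27, 64, 125, 216, 343]
```

The `with` block closes the pool and reaps the workers when the work is done, so the per-process overhead is only paid while the map is running.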

Answered by Tim Peters

You can ask for as many processes as you like. Any limit that may exist will be imposed by your operating system, not by multiprocessing. For example,


 p = multiprocessing.Pool(1000000)

is likely to suffer an ugly death on any machine. I'm trying it on my box as I type this, and the OS is grinding my disk to dust swapping out RAM madly - finally killed it after it had created about 3000 processes ;-)


As to how many will run "at one time", Python has no say in that. It depends on:


  1. How many your hardware is capable of running simultaneously; and,
  2. How your operating system decides to give hardware resources to all the processes on your machine currently running.

For CPU-bound tasks, it doesn't make sense to create more Pool processes than you have cores to run them on. If you're trying to use your machine for other things too, then you should create fewer processes than cores.


For I/O-bound tasks, it may make sense to create quite a few more Pool processes than cores, since the processes will probably spend most of their time blocked (waiting for I/O to complete).

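A minimal sketch of the I/O-bound case, using `time.sleep` as a stand-in for real blocking I/O; the pool size of 16 and the `fake_io` name are arbitrary choices for illustration:

```python
import time
from multiprocessing import Pool

def fake_io(n):
    # Stand-in for a blocking I/O call (network request, disk read, ...).
    time.sleep(0.2)
    return n

def run_io_pool():
    # 16 workers almost certainly exceed the core count, but since each one
    # mostly sits blocked in sleep, the 16 tasks overlap: roughly 0.2 s of
    # sleeping in total instead of 16 * 0.2 = 3.2 s serially.
    with Pool(processes=16) as pool:
        return pool.map(fake_io, range(16))

if __name__ == "__main__":
    start = time.monotonic()
    results = run_io_pool()
    print(results, "in", round(time.monotonic() - start, 2), "seconds")
```

(For real I/O-bound workloads, threads or asyncio usually cap this overhead even further, since blocked processes still cost memory.)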

Answered by Sravan K Ghantasala

Yes. Theoretically there is no limit on the number of processes you can create, but starting an insane number at once will kill the system by running it out of memory. Note that processes have a much larger footprint than threads, since threads share their process's address space while each process gets a separate one.


So the best programming practice is to limit the number of worker processes to the number of processors on your system, for example:


pool = multiprocessing.Pool(4)  # 4 = the number of CPUs on your system

If you don't know the number of cores on your system, or if you want the code to run on many systems, generic code like the following will do...


pool = multiprocessing.Pool(multiprocessing.cpu_count())
# This detects the number of cores on your system and creates a pool of that size.

P.S. But it is usually good to use number_of_cores - 1.

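Note that multiprocessing.Semaphore by itself does not create any worker processes; to use one as a concurrency cap, it has to be shared with explicitly started Process objects so that at most N of them do work at a time. A minimal sketch (`worker` and `run_limited` are hypothetical names for illustration):

```python
import multiprocessing

def worker(sem, n, out):
    # Each worker holds the semaphore while it computes, so at most
    # `limit` of them execute the guarded section concurrently.
    with sem:
        out.put(n * n)

def run_limited(total=8, limit=4):
    sem = multiprocessing.Semaphore(limit)
    out = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(sem, i, out))
             for i in range(total)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return sorted(out.get() for _ in range(total))

if __name__ == "__main__":
    print(run_limited())  # [0, 1, 4, 9, 16, 25, 36, 49]
```

In practice a Pool of `limit` workers achieves the same cap with less bookkeeping, and without paying the startup cost of the processes that are merely waiting on the semaphore.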

Hope this helps :)


Answered by Steve D.

While there is no limit you can set, if you are looking for a convenient number to use for CPU-bound processes (which I suspect is what you're after here), you can run the following:


>>> import multiprocessing
>>> multiprocessing.cpu_count()
1

Some good notes on limitations (especially on Linux) are given in the answer here:
