The "chunksize" parameter in Python's multiprocessing.Pool.map

Question

Asked by sergio

For example, if I have a pool object with 2 processes:

p = multiprocessing.Pool(2)

and I want to iterate over a list of files in a directory and use the map function,

could someone explain what the chunksize parameter of this function does:

p.map(func, iterable[, chunksize])

If I set the chunksize to 10, for example, does that mean every 10 files will be processed by one processor?

Answer 1

Answered by detly

Looking at the documentation for Pool.map, it seems you're almost correct: the chunksize parameter causes the iterable to be split into pieces of approximately that size, and each piece is submitted as a separate task.
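
To make the splitting concrete, here is a minimal sketch of the idea. It is not the library's actual code: split_into_chunks is a hypothetical helper and the file names are made up. It also shows why the size is only approximate: the last piece holds whatever is left over.

```python
from itertools import islice


def split_into_chunks(iterable, chunksize):
    """Yield successive tuples of up to `chunksize` items (hypothetical helper)."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            return
        yield chunk


# 25 made-up file names, used only for illustration.
files = [f"file_{i:02d}.txt" for i in range(1, 26)]

chunks = list(split_into_chunks(files, 10))
chunk_sizes = [len(c) for c in chunks]
print(chunk_sizes)  # [10, 10, 5]
```

With 25 items and a chunk size of 10, this gives three tasks rather than two and a half; the final task is simply shorter.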

So in your example, yes, map will take the first 10 items (approximately) and submit them as a task for a single processor, then the next 10 will be submitted as another task, and so on. Note that this doesn't mean the processors will alternate every 10 files; it's quite possible that processor #1 ends up getting 1-10 AND 11-20, and processor #2 gets 21-30 and 31-40.
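
One way to see this for yourself is to have each task report the ID of the process that ran it. The sketch below assumes 40 made-up file names; process_file and group_by_worker are hypothetical helpers standing in for your real work. Which process gets which chunks can change from run to run, so no particular output should be expected.

```python
import os
from collections import defaultdict
from multiprocessing import Pool


def process_file(name):
    """Stand-in for real per-file work: report which process handled `name`."""
    return name, os.getpid()


def group_by_worker(results):
    """Map each worker PID to the list of file names it handled."""
    groups = defaultdict(list)
    for name, pid in results:
        groups[pid].append(name)
    return dict(groups)


if __name__ == "__main__":
    # 40 made-up file names, used only for illustration.
    files = [f"file_{i:02d}.txt" for i in range(1, 41)]

    with Pool(2) as p:
        # Each chunk of 10 names is handed to one worker as a single task.
        results = p.map(process_file, files, chunksize=10)

    # Results come back in input order; only the PID shows who did the work.
    for pid, names in group_by_worker(results).items():
        print(pid, f"handled {len(names)} files, first: {names[0]}, last: {names[-1]}")
```

Because the per-file work here is trivial, one process may well end up handling most or all of the chunks. With slower work, the chunks tend to be shared more evenly, but still not in a fixed alternating pattern.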
