
Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must comply with the same CC BY-SA license, link to the original, and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/25553919/

Date: 2020-08-18 20:22:08  Source: igfitidea

Passing multiple parameters to pool.map() function in Python

Tags: python, multiprocessing, pool, map-function

Asked by DJMcCarthy12

I need some way to use a function within pool.map() that accepts more than one parameter. As per my understanding, the target function of pool.map() can only have one iterable as a parameter but is there a way that I can pass other parameters in as well? In this case, I need to pass in a few configuration variables, like my Lock() and logging information to the target function.

I have tried to do some research and I think that I may be able to use partial functions to get it to work? However I don't fully understand how these work. Any help would be greatly appreciated! Here is a simple example of what I want to do:

import multiprocessing

def target(items, lock):
    for item in items:
        # Do cool stuff
        if (... some condition here ...):
            lock.acquire()
            # Write to stdout or logfile, etc.
            lock.release()

def main():
    iterable = [1, 2, 3, 4, 5]
    pool = multiprocessing.Pool()
    pool.map(target(PASS PARAMS HERE), iterable)
    pool.close()
    pool.join()

Accepted answer by dano

You can use functools.partial for this (as you suspected):

import multiprocessing
from functools import partial

def target(lock, iterable_item):
    for item in iterable_item:
        # Do cool stuff
        if (... some condition here ...):
            lock.acquire()
            # Write to stdout or logfile, etc.
            lock.release()

def main():
    iterable = [1, 2, 3, 4, 5]
    pool = multiprocessing.Pool()
    l = multiprocessing.Lock()
    func = partial(target, l)
    pool.map(func, iterable)
    pool.close()
    pool.join()

Example:

import multiprocessing
from functools import partial

def f(a, b, c):
    print("{} {} {}".format(a, b, c))

def main():
    iterable = [1, 2, 3, 4, 5]
    pool = multiprocessing.Pool()
    a = "hi"
    b = "there"
    func = partial(f, a, b)
    pool.map(func, iterable)
    pool.close()
    pool.join()

if __name__ == "__main__":
    main()

Output:

hi there 1
hi there 2
hi there 3
hi there 4
hi there 5
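Beyond partial, the standard library also lets you hand the Lock (or any other per-worker setup) to every worker once, via Pool's initializer and initargs arguments, instead of shipping it along with each task. A minimal sketch of that pattern (the init_worker and target names here are illustrative, not from the answer above):

```python
import multiprocessing

worker_lock = None  # populated in each worker process by the initializer

def init_worker(lock):
    # Runs once in every worker process; stashes the Lock in a
    # module-level global so the mapped function can use it without
    # it being pickled with each task.
    global worker_lock
    worker_lock = lock

def target(item):
    with worker_lock:
        pass  # write to stdout or a logfile safely here
    return item * 2

def main():
    lock = multiprocessing.Lock()
    with multiprocessing.Pool(initializer=init_worker,
                              initargs=(lock,)) as pool:
        return pool.map(target, [1, 2, 3, 4, 5])

if __name__ == "__main__":
    print(main())  # [2, 4, 6, 8, 10]
```

Because initargs travels through the process-creation machinery rather than the task queue, this also sidesteps the "Lock objects should only be shared between processes through inheritance" error that sending a Lock with each task can raise.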

Answer by TheSoundDefense

In case you don't have access to functools.partial, you could use a wrapper function for this, as well.

import multiprocessing

def target(lock):
    def wrapped_func(items):
        for item in items:
            # Do cool stuff
            if (... some condition here ...):
                lock.acquire()
                # Write to stdout or logfile, etc.
                lock.release()
    return wrapped_func

def main():
    iterable = [1, 2, 3, 4, 5]
    pool = multiprocessing.Pool()
    lck = multiprocessing.Lock()
    pool.map(target(lck), iterable)
    pool.close()
    pool.join()

This makes target() into a function that accepts a lock (or whatever parameters you want to give), and it returns a function that takes only an iterable as input but can still use all your other parameters. That returned function is what ultimately gets passed to pool.map(), which should then execute with no problems. (One caveat: the standard pickle module cannot serialize locally defined functions, so a process-based multiprocessing.Pool may fail to pickle this wrapper; the pattern works with thread pools or with serializers like dill.)

Answer by Mike McKerns

You could use a map function that allows multiple arguments, as does the fork of multiprocessing found in pathos.

>>> from pathos.multiprocessing import ProcessingPool as Pool
>>> 
>>> def add_and_subtract(x,y):
...   return x+y, x-y
... 
>>> res = Pool().map(add_and_subtract, range(0,20,2), range(-5,5,1))
>>> res
[(-5, 5), (-2, 6), (1, 7), (4, 8), (7, 9), (10, 10), (13, 11), (16, 12), (19, 13), (22, 14)]
>>> Pool().map(add_and_subtract, *zip(*res))
[(0, -10), (4, -8), (8, -6), (12, -4), (16, -2), (20, 0), (24, 2), (28, 4), (32, 6), (36, 8)]

pathos enables you to easily nest hierarchical parallel maps with multiple inputs, so we can extend our example to demonstrate that.

>>> from pathos.multiprocessing import ThreadingPool as TPool
>>> 
>>> res = TPool().amap(add_and_subtract, *zip(*Pool().map(add_and_subtract, range(0,20,2), range(-5,5,1))))
>>> res.get()
[(0, -10), (4, -8), (8, -6), (12, -4), (16, -2), (20, 0), (24, 2), (28, 4), (32, 6), (36, 8)]

Even more fun is to build a nested function that we can pass into the Pool. This is possible because pathos uses dill, which can serialize almost anything in Python.

>>> def build_fun_things(f, g):
...   def do_fun_things(x, y):
...     return f(x,y), g(x,y)
...   return do_fun_things
... 
>>> def add(x,y):
...   return x+y
... 
>>> def sub(x,y):
...   return x-y
... 
>>> neato = build_fun_things(add, sub)
>>> 
>>> res = TPool().imap(neato, *zip(*Pool().map(neato, range(0,20,2), range(-5,5,1))))
>>> list(res)
[(0, -10), (4, -8), (8, -6), (12, -4), (16, -2), (20, 0), (24, 2), (28, 4), (32, 6), (36, 8)]

If you are not able to go outside of the standard library, however, you will have to do this another way. Your best bet in that case is to use multiprocessing.Pool.starmap, as seen here: Python multiprocessing pool.map for multiple arguments (noted by @Roberto in the comments on the OP's post).
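For completeness, a small sketch of the starmap route (available since Python 3.3): each element of the input iterable is a tuple that gets unpacked into the function's positional arguments, which reproduces the earlier partial example without partial.

```python
import multiprocessing

def f(a, b, c):
    return "{} {} {}".format(a, b, c)

def main():
    # One tuple per call; starmap unpacks each as f("hi", "there", i)
    args = [("hi", "there", i) for i in (1, 2, 3, 4, 5)]
    with multiprocessing.Pool() as pool:
        return pool.starmap(f, args)

if __name__ == "__main__":
    for line in main():
        print(line)  # hi there 1 ... hi there 5
```

Like map, starmap preserves the order of the input iterable in its results.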

Get pathos here: https://github.com/uqfoundation
