Python Multiprocessing Pool Map: AttributeError: Can't pickle local object
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/52265120/
Asked by Amit
I have a method inside a class that needs to do a lot of work in a loop, and I would like to spread the work over all of my cores.
I wrote the following code, which works if I use the built-in map, but with pool.map it returns an error.
import multiprocessing

pool = multiprocessing.Pool(multiprocessing.cpu_count() - 1)

class OtherClass:
    def run(self, sentence, graph):
        return False

class SomeClass:
    def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]

    def some_method(self):
        other = OtherClass()

        def single(params):
            sentences, graph = params
            return [other.run(sentence, graph) for sentence in sentences]

        return list(pool.map(single, zip(self.sentences, self.graphs)))

SomeClass().some_method()
Error:
AttributeError: Can't pickle local object 'SomeClass.some_method.<locals>.single'
Why can't it pickle single? I even tried to move single to the global module scope (not inside the class, which makes it independent of the context):
import multiprocessing

pool = multiprocessing.Pool(multiprocessing.cpu_count() - 1)

class OtherClass:
    def run(self, sentence, graph):
        return False

def single(params):
    other = OtherClass()
    sentences, graph = params
    return [other.run(sentence, graph) for sentence in sentences]

class SomeClass:
    def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]

    def some_method(self):
        return list(pool.map(single, zip(self.sentences, self.graphs)))

SomeClass().some_method()
and I get the following error:
AttributeError: Can't get attribute 'single' on <module '__main__' from '.../test.py'>
Answered by Darkonaut
Error 1:
AttributeError: Can't pickle local object 'SomeClass.some_method.<locals>.single'
You solved this error yourself by moving the nested target function single() out to the top level.
Background:
Pool needs to pickle (serialize) everything it sends to its worker processes (IPC). Pickling a function only saves its qualified name, and unpickling requires re-importing the function by that name. For that to work, the function needs to be defined at the top level of a module. Nested functions are not importable by the child, and merely trying to pickle them already raises an exception.
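The pickling behaviour described above can be checked without starting a pool. The following is a minimal sketch; the helper names make_nested and can_pickle are made up for illustration and are not part of the original answer.

```python
import pickle


def make_nested():
    """Return a function defined inside another function."""
    def nested(x):
        return x + 1
    return nested


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


# A function from an importable module is pickled by reference:
# the payload holds the module and function names, not the code.
payload = pickle.dumps(len)
print(payload)
print(b"builtins" in payload, b"len" in payload)

# A nested function has '<locals>' in its qualified name, so it
# cannot be found again by name and pickling is refused.
nested = make_nested()
print(nested.__qualname__)
print(can_pickle(nested))
```

The exact exception type raised for the nested function differs between Python versions, which is why the sketch catches both AttributeError and pickle.PicklingError.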
Error 2:
AttributeError: Can't get attribute 'single' on <module '__main__' from '.../test.py'>
You are starting the pool before you define your function and classes, so the child processes cannot inherit any of that code. Move the pool start-up to the bottom of the script and protect it with if __name__ == '__main__':
import multiprocessing

class OtherClass:
    def run(self, sentence, graph):
        return False

def single(params):
    other = OtherClass()
    sentences, graph = params
    return [other.run(sentence, graph) for sentence in sentences]

class SomeClass:
    def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]

    def some_method(self):
        return list(pool.map(single, zip(self.sentences, self.graphs)))

if __name__ == '__main__':  # <- prevent RuntimeError for 'spawn'
                            # and 'forkserver' start_methods
    with multiprocessing.Pool(multiprocessing.cpu_count() - 1) as pool:
        print(SomeClass().some_method())
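The guard matters most for the 'spawn' and 'forkserver' start methods, where each child imports the main module again instead of inheriting the parent's memory. As a small sketch, the start methods that apply on a given machine can be inspected like this (the variable names are only illustrative):

```python
import multiprocessing

# Start methods supported on this platform, for example
# 'fork', 'spawn' and 'forkserver'.
available = multiprocessing.get_all_start_methods()

# The method Pool() uses unless another one was selected explicitly.
current = multiprocessing.get_start_method()

print("available:", available)
print("current:", current)
```

Which method is the default depends on the operating system and the Python version, so it is safer to keep the guard in place than to rely on 'fork'.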
Appendix
...I would like to spread the work over all of my cores.
Potentially helpful background on how multiprocessing.Pool chunks work:
Python multiprocessing: understanding logic behind chunksize
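As a rough illustration of that chunking, the sketch below mirrors how CPython appears to choose a chunk size when Pool.map is called without an explicit chunksize. This is an assumption based on the current implementation, which is an internal detail and may change; the function name default_chunksize is hypothetical.

```python
def default_chunksize(n_items, n_workers):
    """Approximate the chunksize Pool.map picks when chunksize is None."""
    if n_items == 0:
        return 0
    # Aim for roughly four chunks per worker, rounding up.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


for n_items in (1, 16, 100, 1000):
    print(n_items, default_chunksize(n_items, n_workers=4))
```

With only one item in the iterable, as in the question's example data, a single worker receives all the work regardless of how many cores are available.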
Answered by Marcell Pigniczki
I accidentally discovered a very nasty solution. It works as long as you use a def statement. If you declare the function that you want to use in Pool.map with the global keyword at the beginning of the enclosing function, that solves it. But I would not rely on this in serious applications.
import multiprocessing

class OtherClass:
    def run(self, sentence, graph):
        return False

class SomeClass:
    def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]

    def some_method(self):
        global single  # This is ugly, but does the trick XD
        other = OtherClass()

        def single(params):
            sentences, graph = params
            return [other.run(sentence, graph) for sentence in sentences]

        # Start the pool only after 'single' exists, so that forked workers
        # inherit it. This relies on the 'fork' start method.
        with multiprocessing.Pool(multiprocessing.cpu_count() - 1) as pool:
            return list(pool.map(single, zip(self.sentences, self.graphs)))

SomeClass().some_method()
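Why the global declaration makes a difference can be seen from the function's qualified name, which is what pickle uses to look the function up again. A minimal sketch, with make_local and make_global as made-up helper names:

```python
def make_local():
    # Ordinary nested function: the name 'single' is local to make_local.
    def single(params):
        return params
    return single


def make_global():
    global single  # binds the name 'single' in the module namespace
    def single(params):
        return params
    return single


local_func = make_local()
global_func = make_global()

print(local_func.__qualname__)   # includes '<locals>'
print(global_func.__qualname__)  # just the bare function name
```

The function still only exists after make_global() has run, which is one reason the trick is fragile: a worker process that imports the module afresh will not find the name.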