Python: concurrent.futures.ProcessPoolExecutor hangs when the function is a lambda or a nested function

Posted by a11xaf1n 12 months ago in Python
Can anyone explain why using a lambda or the nested function (f) makes concurrent.futures.ProcessPoolExecutor hang in the code example below?

import concurrent.futures


def f2(s):
    return len(s)


def main():
    def f(s):
        return len(s)

    data = ["a", "b", "c"]

    with concurrent.futures.ProcessPoolExecutor(max_workers=1) as pool:
        # results = pool.map(f, data)  # hangs
        # results = pool.map(lambda d: len(d), data)  # hangs
        # results = pool.map(len, data)  # works
        results = pool.map(f2, data)  # works

    print(list(results))


if __name__ == "__main__":
    main()
Answer #1, by 5cg8jx4n

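Long story short, Pool and ProcessPoolExecutor both have to serialize everything before sending it to the workers. Serializing (also called pickling) a function does not copy its code. It only records the function's name, so that the worker can import the function again when the pool needs it. For this to work, the function must be defined at the top level of a module: lambdas and nested functions cannot be imported by name in the child process, which is why the following error appears: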

AttributeError: Can't pickle local object 'MyClass.mymethod.<locals>.mymethod'
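
The restriction can be checked without starting any worker processes, by calling the standard pickle module directly. The sketch below is only an illustration: top_level, make_nested and can_pickle are hypothetical names that do not appear in the question, and the exact exception type raised for an unpicklable function may differ between Python versions, so several are caught.

```python
import pickle


def top_level(s):
    # Defined at module level, so pickle can record it by name.
    return len(s)


def make_nested():
    # Returns a function whose qualified name contains "<locals>",
    # which cannot be looked up again by name.
    def nested(s):
        return len(s)

    return nested


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # Expected when this file is run as a script:
    print(can_pickle(top_level))          # True
    print(can_pickle(lambda d: len(d)))   # False
    print(can_pickle(make_nested()))      # False
```

Depending on the Python version, the same failure inside ProcessPoolExecutor may show up as a raised exception rather than a hang.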

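There are some workarounds for this problem, but I have not found any of them to be reliable. If you are free to use other packages, pathos is an alternative that actually works. For example, the following does not hang: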

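import os

import pathos


class SomeClass:
    def __init__(self):
        self.words = ["a", "b", "c"]

    def some_method(self):
        # Nested function: the standard pickle module rejects this, but
        # pathos serializes with dill, which can handle it.
        def run(s):
            return len(s)

        # `pool` is the module-level pathos pool created below.
        return list(pool.map(run, self.words))


if __name__ == "__main__":
    pool = pathos.multiprocessing.Pool(os.cpu_count())
    print(SomeClass().some_method())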

It does indeed print:

[1, 1, 1]
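
If adding a dependency is not an option, a common standard-library workaround is to keep the worker function at module level and bind any extra arguments with functools.partial instead of a lambda or closure. This is a minimal sketch rather than part of the original answer. scaled_len and its factor parameter are hypothetical names chosen for illustration.

```python
import concurrent.futures
import functools


def scaled_len(s, factor):
    # Hypothetical helper, defined at module level so that worker
    # processes can import it by name.
    return len(s) * factor


def main():
    data = ["a", "bb", "ccc"]
    # A partial object is picklable as long as the wrapped function and
    # the bound arguments are picklable.
    worker = functools.partial(scaled_len, factor=2)
    with concurrent.futures.ProcessPoolExecutor(max_workers=1) as pool:
        return list(pool.map(worker, data))


if __name__ == "__main__":
    print(main())  # [2, 4, 6]
```

The partial object plays the role that the lambda or nested function played in the question, but it refers to scaled_len by name, so the worker can reconstruct it.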
