Are Python generators thread-safe?

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1131430/

Time: 2020-11-03 21:32:57  Source: igfitidea

Are Generators Threadsafe?

Tags: python, multithreading, thread-safety, generator

Asked by Corey Goldberg

I have a multithreaded program where I create a generator function and then pass it to new threads. I want it to be shared/global in nature so each thread can get the next value from the generator.

Is it safe to use a generator like this, or will I run into problems/conditions accessing the shared generator from multiple threads?

If not, is there a better way to approach the problem? I need something that will cycle through a list and produce the next value for whichever thread calls it.

Answered by Martin v. Löwis

It's not thread-safe; simultaneous calls may interleave, and mess with the local variables.
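
As a hedged illustration of what an overlapping call can look like: in CPython, as far as I'm aware, a second `next()` that arrives while the generator body is still running is rejected with a `ValueError` rather than quietly interleaved. The sketch below forces that overlap with two events; `slow_values`, `entered` and `release` are made-up names for the demo, not anything from the question.

```python
import threading

entered = threading.Event()   # set once the generator body is running
release = threading.Event()   # lets the generator body continue

def slow_values():
    # Hypothetical generator that pauses mid-step, so another thread has
    # time to call next() while this frame is still executing.
    entered.set()
    release.wait(timeout=5)
    yield 1
    yield 2

gen = slow_values()
results = []

def first_caller():
    results.append(next(gen))

t = threading.Thread(target=first_caller)
t.start()
entered.wait(timeout=5)       # the other thread is now inside the generator

try:
    next(gen)                 # second, overlapping call on the same generator
    error = None
except ValueError as exc:
    error = str(exc)
finally:
    release.set()             # unblock the first caller
    t.join()

print(error)
print(results)
```

Either way, the shared generator cannot be advanced from two threads at once without some coordination around it.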

The common approach is to use the master-slave pattern (now called farmer-worker pattern in PC). Make a third thread which generates data, and add a Queue between the master and the slaves, where slaves will read from the queue, and the master will write to it. The standard queue module provides the necessary thread safety and arranges to block the master until the slaves are ready to read more data.
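
A minimal sketch of that queue-based arrangement, written for Python 3 (where the module is named `queue`). The generator `produce_values`, the worker count, the `DONE` sentinel and the queue size are all assumptions made for illustration. Note that the producer only blocks when the queue is bounded, hence the `maxsize`.

```python
import queue
import threading

NUM_WORKERS = 4          # hypothetical worker count
DONE = object()          # sentinel telling a worker to stop

def produce_values():
    # Stand-in for the real generator; only the producer thread touches it.
    for x in range(1, 101):
        yield x * 2

def producer(q):
    for item in produce_values():
        q.put(item)              # blocks while the bounded queue is full
    for _ in range(NUM_WORKERS):
        q.put(DONE)              # one sentinel per worker

def worker(q, out):
    while True:
        item = q.get()           # blocks until an item is available
        if item is DONE:
            break
        out.append(item)

q = queue.Queue(maxsize=10)      # bounded, so the producer cannot run far ahead
buckets = [[] for _ in range(NUM_WORKERS)]
threads = [threading.Thread(target=producer, args=(q,))]
threads += [threading.Thread(target=worker, args=(q, b)) for b in buckets]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sorted(v for b in buckets for v in b)
print(len(total), total[:3])
```

Because the generator is only ever advanced by the producer thread, it never needs a lock of its own; the queue is the only shared object.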

Answered by Glenn Maynard

Edited to add benchmark below.

You can wrap a generator with a lock. For example,

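import threading

class LockedIterator(object):
    """Wrap an iterator so only one thread at a time can advance it."""
    def __init__(self, it):
        self.lock = threading.Lock()
        self.it = iter(it)

    def __iter__(self):
        return self

    def __next__(self):
        # Hold the lock for the whole call so two threads never advance
        # the underlying iterator at the same time.
        with self.lock:
            return next(self.it)

# A generator expression, so the wrapped object really is a generator.
gen = (x*2 for x in [1,2,3,4])
g2 = LockedIterator(gen)
print(list(g2))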
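
To show the wrapper actually shared between threads, here is a small Python 3 sketch. The `squares` generator, the four workers and the per-thread `buckets` are assumptions for the demo; the wrapper is repeated so the block runs on its own.

```python
import threading

class LockedIterator:
    # Same idea as above: serialize every step of the wrapped iterator.
    def __init__(self, it):
        self.lock = threading.Lock()
        self.it = iter(it)

    def __iter__(self):
        return self

    def __next__(self):
        with self.lock:
            return next(self.it)

def squares(n):
    # Hypothetical generator to share between threads.
    for i in range(n):
        yield i * i

shared = LockedIterator(squares(1000))

def worker(out):
    # Each thread pulls from the same iterator until it is exhausted.
    for value in shared:
        out.append(value)

buckets = [[] for _ in range(4)]   # one private result list per thread
threads = [threading.Thread(target=worker, args=(b,)) for b in buckets]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every value should have been handed to exactly one thread.
combined = sorted(v for b in buckets for v in b)
print(combined == [i * i for i in range(1000)])
```

How the values are split across the threads varies from run to run; only the combined set is predictable.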


Locking takes 50ms on my system, Queue takes 350ms. Queue is useful when you really do have a queue; for example, if you have incoming HTTP requests and you want to queue them for processing by worker threads. (That doesn't fit in the Python iterator model--once an iterator runs out of items, it's done.) If you really do have an iterator, then LockedIterator is a faster and simpler way to make it thread safe.

from datetime import datetime
from queue import Queue
import threading

num_worker_threads = 4

class LockedIterator(object):
    def __init__(self, it):
        self.lock = threading.Lock()
        self.it = iter(it)

    def __iter__(self):
        return self

    def __next__(self):
        # Serialize access to the underlying iterator.
        with self.lock:
            return next(self.it)

def test_locked(it):
    # Workers pull items straight from one shared, locked iterator.
    it = LockedIterator(it)
    def worker():
        try:
            for i in it:
                pass
        except Exception as e:
            print(e)
            raise

    threads = []
    for i in range(num_worker_threads):
        t = threading.Thread(target=worker)
        threads.append(t)
        t.start()

    for t in threads:
        t.join()

def test_queue(it):
    # The main thread feeds a Queue; daemon workers drain it.
    def worker():
        try:
            while True:
                item = q.get()
                q.task_done()
        except Exception as e:
            print(e)
            raise

    q = Queue()
    for i in range(num_worker_threads):
        t = threading.Thread(target=worker)
        t.daemon = True
        t.start()

    for item in it:
        q.put(item)

    # Wait until every queued item has been marked done.
    q.join()

start_time = datetime.now()
it = [x*2 for x in range(1,10000)]

test_locked(it)
#test_queue(it)
end_time = datetime.now()
took = end_time-start_time
print("took %.01f ms" % (took.total_seconds()*1000))

Answered by Mikhail Churbanov

No, they are not thread-safe. You can find interesting info about generators and multi-threading in:

http://www.dabeaz.com/generators/Generators.pdf

Answered by Algorias

It depends on which Python implementation you're using. In CPython, the GIL makes all operations on Python objects thread-safe, as only one thread can be executing code at any given time.

http://en.wikipedia.org/wiki/Global_Interpreter_Lock