Python: Why the Global Interpreter Lock?
Declaration: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/265687/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use and share them, but you must attribute them to the original authors (not me): StackOverFlow
Why the Global Interpreter Lock?
Asked by Federico A. Ramponi
What exactly is the function of Python's Global Interpreter Lock? Do other languages that are compiled to bytecode employ a similar mechanism?
Answered by Brian
In general, for any thread safety problem you will need to protect your internal data structures with locks. This can be done with various levels of granularity.
You can use fine-grained locking, where every separate structure has its own lock.
You can use coarse-grained locking where one lock protects everything (the GIL approach).
There are various pros and cons of each method. Fine-grained locking allows greater parallelism - two threads can execute in parallel when they don't share any resources. However, there is a much larger administrative overhead. For every line of code, you may need to acquire and release several locks.
The coarse-grained approach is the opposite. Two threads can't run at the same time, but an individual thread will run faster because it's not doing so much bookkeeping. Ultimately it comes down to a tradeoff between single-threaded speed and parallelism.
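To make the trade-off concrete, here is a minimal, hypothetical sketch of the two strategies using Python's own threading.Lock (illustration only, not how CPython itself is implemented): a per-object lock versus one shared lock.

```python
import threading

# Fine-grained: every structure carries its own lock, so threads that
# touch different structures can proceed independently.
class FineGrainedCounter:
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()   # one lock per object

    def increment(self):
        with self._lock:                # acquired and released per operation
            self._value += 1

# Coarse-grained: one lock guards everything (the GIL approach), so the
# bookkeeping is cheap but only one thread makes progress at a time.
GLOBAL_LOCK = threading.Lock()

class CoarseGrainedCounter:
    def __init__(self):
        self._value = 0

    def increment(self):
        with GLOBAL_LOCK:               # the same lock for every object
            self._value += 1
```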
There have been a few attempts to remove the GIL in Python, but the extra overhead on single-threaded machines was generally too large. Some cases can actually be slower even on multi-processor machines due to lock contention.
Do other languages that are compiled to bytecode employ a similar mechanism?
It varies, and it probably shouldn't be considered a language property so much as an implementation property. For instance, there are Python implementations such as Jython and IronPython which use the threading approach of their underlying VM, rather than a GIL approach. Additionally, the next version of Ruby looks to be moving towards introducing a GIL.
Answered by Eli Bendersky
The following is from the official Python/C API Reference Manual:
The Python interpreter is not fully thread safe. In order to support multi-threaded Python programs, there's a global lock that must be held by the current thread before it can safely access Python objects. Without the lock, even the simplest operations could cause problems in a multi-threaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.
Therefore, the rule exists that only the thread that has acquired the global interpreter lock may operate on Python objects or call Python/C API functions. In order to support multi-threaded Python programs, the interpreter regularly releases and reacquires the lock -- by default, every 100 bytecode instructions (this can be changed with sys.setcheckinterval()). The lock is also released and reacquired around potentially blocking I/O operations like reading or writing a file, so that other threads can run while the thread that requests the I/O is waiting for the I/O operation to complete.
I think it sums up the issue pretty well.
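As a rough illustration of the lost-update problem the manual describes, the sketch below (an analogy using an ordinary Python integer in place of a C-level reference count, with no locking around the read-modify-write) increments a shared counter from several threads; depending on the CPython version and switch interval, the final total can come out lower than expected.

```python
import threading

counter = 0  # stands in for an object's reference count

def bump(times):
    global counter
    for _ in range(times):
        # Not atomic in general: the read, add, and store are separate
        # bytecode steps, and on many CPython versions a thread switch
        # can land between them, losing an increment.
        counter += 1

threads = [threading.Thread(target=bump, args=(200_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("expected:", 4 * 200_000, "got:", counter)
```

The GIL guarantees this kind of consistency only for the interpreter's own C-level bookkeeping (such as reference counts); compound operations in your Python code still need their own locks.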
Answered by David Nehme
The global interpreter lock is a big mutex-type lock that protects reference counters from getting hosed. If you are writing pure Python code, this all happens behind the scenes, but if you are embedding Python in C, then you might have to explicitly take/release the lock.
This mechanism is not related to Python being compiled to bytecode. It's not needed for Java. In fact, it's not even needed for Jython (Python compiled to the JVM).
See also this question.
Answered by Edward KMETT
Python, like Perl 5, was not designed from the ground up to be thread safe. Threads were grafted on after the fact, so the global interpreter lock is used to maintain mutual exclusion, so that only one thread at a time is executing code in the bowels of the interpreter.
Individual Python threads are cooperatively multitasked by the interpreter itself, which cycles the lock every so often.
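The "every so often" is tunable. The 100-bytecode check interval quoted earlier comes from older CPython releases; since Python 3.2 the interpreter instead asks the running thread to give up the lock after a time-based switch interval, which you can inspect and adjust:

```python
import sys

# CPython 3.2+ releases the GIL roughly every "switch interval" seconds
# (default 0.005 s) so that other runnable threads get a turn.
print(sys.getswitchinterval())   # e.g. 0.005

# A longer interval means fewer forced switches (less overhead, but worse
# latency for other threads); a shorter one means the opposite.
sys.setswitchinterval(0.001)
```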
Grabbing the lock yourself is needed when you are talking to Python from C while other Python threads are active, to 'opt in' to this protocol and make sure that nothing unsafe happens behind your back.
Other systems that have a single-threaded heritage and later evolved into multithreaded systems often have some mechanism of this sort. For instance, the Linux kernel has the "Big Kernel Lock" from its early SMP days. Gradually, over time, as multi-threading performance becomes an issue, there is a tendency to try to break these sorts of locks up into smaller pieces or replace them with lock-free algorithms and data structures where possible to maximize throughput.
Answered by Eli Bendersky
Regarding your second question, not all scripting languages use this, but it only makes them less powerful. For instance, the threads in Ruby are green and not native.
In Python, the threads are native, and the GIL only prevents them from executing Python bytecode in parallel on different cores.
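A quick way to see that effect is the timing sketch below (illustrative only; the exact numbers depend on your machine): CPU-bound work gains nothing from a second native thread, because only one thread can execute bytecode at a time, while threads that sleep or block on I/O overlap freely because the GIL is released while they wait.

```python
import threading
import time

def cpu_bound(n=5_000_000):
    # Pure bytecode work: the threads must take turns holding the GIL.
    total = 0
    for i in range(n):
        total += i

def io_bound(seconds=1.0):
    # time.sleep() releases the GIL, so sleeping threads overlap.
    time.sleep(seconds)

def timed(target, count):
    threads = [threading.Thread(target=target) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

print("2 CPU-bound threads:", timed(cpu_bound, 2))  # about as slow as running both sequentially
print("2 I/O-bound threads:", timed(io_bound, 2))   # about 1 second, not 2
```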
In Perl, the threads are even worse. They just copy the whole interpreter, and are far from being as usable as in Python.