Does Python support multithreading? Can it speed up execution time?

Notice: this page is based on a Chinese-English translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/20939299/

Time: 2020-08-18 21:36:15  Source: igfitidea

Tags: python, multithreading

Asked by Karim Bahgat

I'm slightly confused about whether multithreading works in Python or not.

I know there have been a lot of questions about this and I've read many of them, but I'm still confused. I know from my own experience, and have seen others post their own answers and examples here on StackOverflow, that multithreading is indeed possible in Python. So why is it that everyone keeps saying that Python is locked by the GIL and that only one thread can run at a time? It clearly does work. Or is there some distinction I'm not getting here?

Many posters/respondents also keep mentioning that threading is limited because it does not make use of multiple cores. But I would say threads are still useful because they do work simultaneously and thus get the combined workload done faster. I mean, why would there even be a Python thread module otherwise?

Update:

Thanks for all the answers so far. The way I understand it is that multithreading will only run in parallel for some IO tasks, but can only run one at a time for CPU-bound multiple core tasks.

I'm not entirely sure what this means for me in practical terms, so I'll just give an example of the kind of task I'd like to multithread. For instance, let's say I want to loop through a very long list of strings and do some basic string operations on each list item. If I split up the list, send each sublist to be processed by my loop/string code in a new thread, and send the results back in a queue, will these workloads run at roughly the same time? Most importantly, will this theoretically reduce the time it takes to run the script?

Another example: could I render and save four different pictures using PIL in four different threads, and would this be faster than processing the pictures one after another? I guess this speed component is what I'm really wondering about, rather than what the correct terminology is.

I also know about the multiprocessing module but my main interest right now is for small-to-medium task loads (10-30 secs) and so I think multithreading will be more appropriate because subprocesses can be slow to initiate.

Accepted answer by Martijn Pieters

The GIL does not prevent threading. All the GIL does is make sure only one thread is executing Python code at a time; control still switches between threads.

What the GIL prevents, then, is making use of more than one CPU core or separate CPUs to run threads in parallel.

This only applies to Python code. C extensions can and do release the GIL to allow multiple threads of C code and one Python thread to run across multiple cores. This extends to I/O controlled by the kernel, such as select() calls for socket reads and writes, making Python handle network events reasonably efficiently in a multi-threaded multi-core setup.
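
A minimal sketch of that effect, assuming time.sleep() as a stand-in for a blocking I/O call (CPython releases the GIL while a thread sleeps). The helper names wait_for_io and run_threaded are made up for this illustration. Four waits of 0.2 seconds each should finish in roughly 0.2 seconds in total rather than 0.8, because the threads wait concurrently:

```python
import threading
import time


def wait_for_io(delay, results, index):
    # time.sleep stands in for a blocking read (assumption for this sketch);
    # the GIL is released while the thread waits.
    time.sleep(delay)
    results[index] = delay


def run_threaded(delays):
    """Start one thread per delay and return (results, elapsed seconds)."""
    results = [None] * len(delays)
    threads = [
        threading.Thread(target=wait_for_io, args=(delay, results, i))
        for i, delay in enumerate(delays)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_threaded([0.2, 0.2, 0.2, 0.2])
    print(results, f"{elapsed:.2f}s")
```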

What many server deployments then do is run more than one Python process, letting the OS handle the scheduling between processes to utilize your CPU cores to the max. You can also use the multiprocessing library to handle parallel processing across multiple processes from one codebase and parent process, if that suits your use cases.
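
As a rough sketch of the multiprocessing route, the following splits a list of strings into chunks and hands each chunk to a worker process through multiprocessing.Pool. The function names (count_vowels, split_into_chunks, parallel_count) and the vowel-counting workload are illustrative assumptions, not something from the original answer:

```python
from multiprocessing import Pool


def count_vowels(words):
    """CPU-bound, pure-Python work on one chunk of strings."""
    return sum(ch in "aeiou" for word in words for ch in word.lower())


def split_into_chunks(items, n):
    """Split items into n consecutive chunks of roughly equal size."""
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_count(words, workers=4):
    # Each chunk is pickled and sent to a separate process with its own GIL.
    with Pool(processes=workers) as pool:
        return sum(pool.map(count_vowels, split_into_chunks(words, workers)))


if __name__ == "__main__":
    # The guard is required on platforms that start workers by spawning.
    words = ["threading", "multiprocessing", "python"] * 100_000
    print(parallel_count(words))
```

Starting processes and pickling the data both cost time, so whether this beats a single process depends on how much work each chunk carries.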

Note that the GIL is only applicable to the CPython implementation; Jython and IronPython use a different threading implementation (the native Java VM and .NET common runtime threads respectively).

To address your update directly: any task that tries to get a speed boost from parallel execution using pure Python code will not see a speed-up, as threaded Python code is locked to one thread executing at a time. If you mix in C extensions and I/O, however (such as PIL or numpy operations), any C code can run in parallel with one active Python thread.
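
The string-list scenario from the question could look roughly like the sketch below. The names process_chunk and process_in_threads and the strip/upper operation are assumptions chosen for illustration. The code runs correctly, but because every thread executes pure Python, the GIL lets only one of them make progress at a time, so it should not be expected to run faster than a plain loop:

```python
import queue
import threading


def process_chunk(index, chunk, out):
    # Pure-Python string work: the thread holds the GIL while it runs.
    out.put((index, [s.strip().upper() for s in chunk]))


def process_in_threads(strings, n_threads=4):
    """Process strings in up to n_threads threads, preserving input order."""
    size = max(1, -(-len(strings) // n_threads))  # ceiling division
    chunks = [strings[i:i + size] for i in range(0, len(strings), size)]
    out = queue.Queue()
    threads = [
        threading.Thread(target=process_chunk, args=(i, chunk, out))
        for i, chunk in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Results arrive in completion order, so sort by chunk index.
    parts = sorted(out.get() for _ in threads)
    return [s for _, part in parts for s in part]


if __name__ == "__main__":
    print(process_in_threads([" a ", "b", " c"], n_threads=2))
```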

Python threading is great for creating a responsive GUI, or for handling multiple short web requests where I/O is the bottleneck more than the Python code. It is not suitable for parallelizing computationally intensive Python code; stick to the multiprocessing module for such tasks or delegate to a dedicated external library.

Answer by zord

Yes. :)

You have the low-level thread module (renamed _thread in Python 3) and the higher-level threading module. But if you simply want to use multicore machines, the multiprocessing module is the way to go.

Quote from the docs:

In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.

Answer by r.guerbab

Threading is allowed in Python. The only problem is that the GIL makes sure that just one thread executes Python code at a time (no parallelism).

So basically, if you multi-thread the code to speed up a calculation, it won't get faster, as just one thread is executed at a time. But if you use threads to interact with a database, for example, it will.
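
A hedged illustration of the database case, using concurrent.futures.ThreadPoolExecutor from the standard library. Here fake_query is a hypothetical stand-in that sleeps for 0.1 seconds instead of talking to a real database, and fetch_all is a name made up for this sketch. Eight such lookups overlap their waiting time, so they take about as long as one:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(user_id):
    # Hypothetical stand-in for a database round trip; the sleep releases
    # the GIL the way a blocking network call would.
    time.sleep(0.1)
    return {"id": user_id, "name": f"user{user_id}"}


def fetch_all(user_ids, max_workers=8):
    """Run the lookups in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_query, user_ids))


if __name__ == "__main__":
    start = time.perf_counter()
    rows = fetch_all(range(8))
    print(len(rows), f"{time.perf_counter() - start:.2f}s")
```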
