Multicore and multithreading in IPython Notebook
Notice: this page is a side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/37083116/
Asked by Hyman_The_Ripper
I am currently using the threading module in Python and got the following:
In [1]:
import threading
threading.activeCount()
Out[1]:
4
Now on my terminal, I ran lscpu and learned there are 2 threads per core and I have access to 4 cores:
kitty@FelineFortress:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 60
Stepping: 3
CPU MHz: 800.000
BogoMIPS: 5786.45
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-7
Hence, I should have a lot more than 4 threads available. Is there a Python function I can use to increase the number of cores I am using (with an example) to get more than 4 threads? Or even something to type in the terminal when launching IPython Notebook, like below:
ipython notebook n_cores=3
Accepted answer by Alexander Huszagh
You can use multiprocessing to let Python use multiple cores. One big caveat: all the data you pass between Python processes has to be picklable or passed via inheritance, and on Windows a new Python instance is spawned, while on Unix systems the process can be forked. This has notable performance implications on Windows.
A basic example using multiprocessing, taken from "Python Module of the Week", is as follows:
import multiprocessing

def worker():
    """worker function"""
    print('Worker')
    return

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        # Each Process runs worker() in a separate Python process.
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()
When executed, it outputs:
Worker
Worker
Worker
Worker
Worker
Multiprocessing lets you run independent calculations on different cores, so CPU-bound tasks with little inter-process overhead can finish much faster than they would in a single process.
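As a rough sketch of the same idea for CPU-bound work, a process pool can spread independent calls across cores. The function name sum_of_squares, the input sizes, and the choice of multiprocessing.cpu_count() as the pool size are illustrative assumptions, not part of the original answer:

```python
import multiprocessing


def sum_of_squares(n):
    """Hypothetical CPU-bound toy task: sum of i*i for i in range(n)."""
    return sum(i * i for i in range(n))


def main():
    # Assumption: one worker process per logical CPU reported by the OS.
    n_workers = multiprocessing.cpu_count()
    inputs = [10, 100, 1000, 10000]
    # Arguments and results are pickled to cross the process boundary.
    with multiprocessing.Pool(processes=n_workers) as pool:
        results = pool.map(sum_of_squares, inputs)
    for n, total in zip(inputs, results):
        print(n, total)


if __name__ == '__main__':
    main()
```

When this is run from a notebook on a platform that spawns rather than forks, worker functions defined in a cell may not be importable by the child processes, so keeping them in a separate module is probably the safer choice.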
You should also realize that threading in Python does not improve the performance of CPU-bound code. It exists for convenience (such as keeping a GUI responsive during long calculations). The reason is Python's Global Interpreter Lock, or GIL: it lets only one thread execute Python bytecode at a time, so threads do not run Python code in parallel across cores.
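Note that the number reported by threading.activeCount() is the count of Python Thread objects currently alive in the process, not the number of hardware threads. The sketch below illustrates the difference. The helper name count_after_starting is hypothetical, and the suggestion that the notebook kernel's own background threads explain the 4 seen in the question is an assumption:

```python
import os
import threading


def count_after_starting(n_threads):
    """Start n_threads idle threads; return active_count() before and during."""
    stop = threading.Event()
    before = threading.active_count()
    # Each thread simply blocks until the event is set.
    threads = [threading.Thread(target=stop.wait) for _ in range(n_threads)]
    for t in threads:
        t.start()
    during = threading.active_count()
    stop.set()
    for t in threads:
        t.join()
    return before, during


if __name__ == '__main__':
    before, during = count_after_starting(3)
    print('alive threads before:', before, 'during:', during)
    # os.cpu_count() reports logical CPUs (it may return None if unknown).
    print('logical CPUs:', os.cpu_count())
```

The thread count rises with every thread the program starts, while the logical CPU count stays fixed by the hardware.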
Update, February 2018
This is still very much applicable, and will be for the foreseeable future. The CPython implementation uses the following definition for reference counting:
typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;
Notably, this is not thread-safe, so a global interpreter lock is needed to allow only one thread at a time to work with Python objects, avoiding data races that lead to memory issues.
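The ob_refcnt field can be observed from Python through sys.getrefcount. A minimal sketch, assuming CPython (other interpreters may not use reference counting at all); the helper name refcount_delta is hypothetical:

```python
import sys


def refcount_delta():
    """Return how much the reference count grows when one more name is bound."""
    obj = object()
    before = sys.getrefcount(obj)
    alias = obj  # a second reference to the same object
    after = sys.getrefcount(obj)
    del alias
    return after - before


if __name__ == '__main__':
    print('extra references after aliasing:', refcount_delta())
```

Every such increment and decrement is an unsynchronized update to a plain integer field, which is why concurrent access from several threads would need protection.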
Besides multiprocessing (which requires a complete copy of the interpreter on Windows rather than a fork, making it very slow and poorly suited to improving performance), there are numerous tools that try to side-step the global interpreter lock.
Cython
Your simplest solution is Cython. Simply cdef a function that uses no Python objects internally, and release the GIL with a with nogil block.
A simple example taken from the documentation shows how to release the GIL and temporarily re-acquire it:
from cython.parallel import prange

cdef int func(Py_ssize_t n):
    cdef Py_ssize_t i

    for i in prange(n, nogil=True):
        if i == 8:
            with gil:
                raise Exception()
        elif i == 4:
            break
        elif i == 2:
            return i
Using a Different Interpreter
CPython has a GIL, while Jython and IronPython do not. Be careful, as numerous C libraries for high-performance computing may not work with IronPython or Jython (SciPy flirted with IronPython support but dropped it long ago, and it will not work on a modern Python version).
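To check which interpreter a script or notebook is actually running on, the standard platform module can report it. This is a minimal sketch; the helper name describe_interpreter is hypothetical:

```python
import platform


def describe_interpreter():
    """Return the implementation name (e.g. 'CPython') and its version string."""
    return platform.python_implementation(), platform.python_version()


if __name__ == '__main__':
    impl, version = describe_interpreter()
    print(impl, version)
```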
Using MPI4Py
MPI, or Message Passing Interface, is a high-performance interface for languages like C and C++. It allows efficient parallel computations, and MPI4Py provides Python bindings for MPI. For efficiency, you should only use MPI4Py with NumPy arrays.
An example from their documentation is:
from mpi4py import MPI
import numpy

def matvec(comm, A, x):
    m = A.shape[0]  # local rows
    p = comm.Get_size()
    xg = numpy.zeros(m*p, dtype='d')
    comm.Allgather([x, MPI.DOUBLE],
                   [xg, MPI.DOUBLE])
    y = numpy.dot(A, xg)
    return y