Python: What is vectorization?

Question

Asked by Jairus Patrick Vallon

What does it mean to vectorize for-loops in Python? Is there another way to write nested for-loops?

I am new to Python, and in my research I keep coming across the NumPy library.

Answer 1

Answered by DeepSpace

Python for loops are inherently slower than their C counterparts.

This is why NumPy offers vectorized operations on NumPy arrays. It pushes the for loop you would usually write in Python down to the C level, which is much faster. NumPy offers vectorized ("C-level for loop") alternatives to operations that would otherwise need to be done element by element ("Python-level for loop").

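import numpy as np
from timeit import Timer

# The same 500,000 integers as a Python list and as a NumPy array
li = list(range(500000))
nump_arr = np.array(li)

def python_for():
    # Element-wise addition in a Python-level loop
    return [num + 1 for num in li]

def numpy_add():
    # Vectorized addition; the loop runs in C inside NumPy
    return nump_arr + 1

# Best of 10 repeats, each repeat timing 10 calls (in seconds)
print(min(Timer(python_for).repeat(10, 10)))
print(min(Timer(numpy_add).repeat(10, 10)))

#  0.725692612368003
#  0.010465986942008954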

The NumPy vectorized addition was roughly 70 times faster.
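
As a sanity check, the list comprehension and the vectorized addition should yield the same values. The following is a minimal sketch of that check, assuming a smaller input of 1,000 integers; the names loop_result, vectorized_result, and same_values are illustrative and not part of the original answer.

```python
import numpy as np

# Smaller input than in the timing example, purely for illustration
li = list(range(1000))
nump_arr = np.array(li)

# Python-level loop: one addition per element
loop_result = [num + 1 for num in li]

# Vectorized: a single expression, with the loop running inside NumPy
vectorized_result = nump_arr + 1

# Compare the two results element by element
same_values = np.array_equal(vectorized_result, np.array(loop_result))
print(same_values)
```

Only the place where the loop runs differs; the computed values do not.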

Answer 2

Answered by Brad Solomon

Here's a definition from Wes McKinney:

Arrays are important because they enable you to express batch operations on data without writing any for loops. This is usually called vectorization. Any arithmetic operations between equal-size arrays applies the operation elementwise.

Vectorized version:

>>> import numpy as np
>>> arr = np.array([[1., 2., 3.], [4., 5., 6.]])
>>> arr * arr
array([[  1.,   4.,   9.],
       [ 16.,  25.,  36.]])

The same thing with loops on a native Python (nested) list:

>>> arr = arr.tolist()
>>> res = [[0., 0., 0.], [0., 0., 0.]]
>>> for idx1, row in enumerate(arr):
...     for idx2, val2 in enumerate(row):
...         res[idx1][idx2] = val2 * val2
...
>>> res
[[1.0, 4.0, 9.0], [16.0, 25.0, 36.0]]

How do these two operations compare? The NumPy version takes 436 ns; the Python version takes 3.52 μs (3520 ns). This large difference in "small" times is called microperformance, and it becomes important when you're working with larger data or repeating operations thousands or millions of times.
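
The answer does not show how these timings were measured. One way to run a similar comparison is sketched below with the standard-library timeit module. The helper square_with_loops and the run count of 10,000 are assumptions made for illustration, and absolute numbers will vary by machine and NumPy version.

```python
import numpy as np
from timeit import timeit

def square_with_loops(rows):
    # Hypothetical helper: nested Python loops over a list of lists
    res = [[0.0] * len(row) for row in rows]
    for idx1, row in enumerate(rows):
        for idx2, val in enumerate(row):
            res[idx1][idx2] = val * val
    return res

# The same 2x3 data as in the example above
arr = np.arange(1., 7.).reshape(2, 3)
nested = arr.tolist()

# Total time in seconds for 10,000 runs of each version
loop_time = timeit(lambda: square_with_loops(nested), number=10000)
numpy_time = timeit(lambda: arr * arr, number=10000)
print(f"loops: {loop_time:.4f} s, NumPy: {numpy_time:.4f} s for 10000 runs")
```

Dividing each total by the number of runs gives a per-call figure that can be compared with the ones quoted above.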
