Original source: http://stackoverflow.com/questions/22581763/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Apply a function to each row of a ndarray
Asked by Vahid Mirjalili
I have this function to calculate squared Mahalanobis distance of vector x to mean:
def mahalanobis_sqdist(x, mean, Sigma):
   '''
    Calculates squared Mahalanobis distance of vector x
    to the distribution's mean
   '''
   Sigma_inv = np.linalg.inv(Sigma)
   xdiff = x - mean
   sqmdist = np.dot(np.dot(xdiff, Sigma_inv), xdiff)
   return sqmdist
I have a NumPy array with shape (25, 4), and I want to apply that function to all 25 rows of the array without a for loop. Basically, how can I write the vectorized form of this loop:
for r in d1:
    mahalanobis_sqdist(r[0:4], mean1, Sig1)
where mean1 and Sig1 are:
>>> mean1
array([ 5.028,  3.48 ,  1.46 ,  0.248])
>>> Sig1 = np.cov(d1[0:25, 0:4].T)
>>> Sig1
array([[ 0.16043333,  0.11808333,  0.02408333,  0.01943333],
       [ 0.11808333,  0.13583333,  0.00625   ,  0.02225   ],
       [ 0.02408333,  0.00625   ,  0.03916667,  0.00658333],
       [ 0.01943333,  0.02225   ,  0.00658333,  0.01093333]])
I have tried the following but it didn't work:
>>> vecdist = np.vectorize(mahalanobis_sqdist)
>>> vecdist(d1, mean1, Sig1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1862, in __call__
    theout = self.thefunc(*newargs)
  File "<stdin>", line 6, in mahalanobis_sqdist
  File "/usr/lib/python2.7/dist-packages/numpy/linalg/linalg.py", line 445, in inv
    return wrap(solve(a, identity(a.shape[0], dtype=a.dtype)))
IndexError: tuple index out of range
Accepted answer by unutbu
To apply a function to each row of an array, you could use:
np.apply_along_axis(mahalanobis_sqdist, 1, d1, mean1, Sig1)    
In this case, however, there is a better way. You don't have to apply a function to each row. Instead, you can apply NumPy operations to the entire d1 array to calculate the same result. np.einsum can replace the for-loop and the two calls to np.dot:
def mahalanobis_sqdist2(d, mean, Sigma):
   Sigma_inv = np.linalg.inv(Sigma)
   xdiff = d - mean
   return np.einsum('ij,im,mj->i', xdiff, xdiff, Sigma_inv)
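To see why the subscripts 'ij,im,mj->i' reproduce the row-by-row result, it may help to compare them against the diagonal of the full matrix product. The following is only a minimal sketch on made-up data; the array names and sizes are illustrative and not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
xdiff = rng.random((5, 4))          # illustrative data: 5 rows, 4 features
A = rng.random((4, 4))
Sigma_inv = A @ A.T + np.eye(4)     # any symmetric matrix works for this check

# out[i] = sum over j and m of xdiff[i, j] * xdiff[i, m] * Sigma_inv[m, j]
via_einsum = np.einsum('ij,im,mj->i', xdiff, xdiff, Sigma_inv)

# The same numbers sit on the diagonal of xdiff @ Sigma_inv @ xdiff.T,
# but that route computes a full 5x5 matrix and discards most of it.
via_diag = np.diag(xdiff @ Sigma_inv @ xdiff.T)

assert np.allclose(via_einsum, via_diag)
```

The output index i is the only one kept, so j and m are summed away, which is exactly the quadratic form for each row.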
Here are some benchmarks:
import numpy as np
np.random.seed(1)
def mahalanobis_sqdist(x, mean, Sigma):
   '''
   Calculates squared Mahalanobis distance of vector x
   to the distribution's mean
   '''
   Sigma_inv = np.linalg.inv(Sigma)
   xdiff = x - mean
   sqmdist = np.dot(np.dot(xdiff, Sigma_inv), xdiff)
   return sqmdist
def mahalanobis_sqdist2(d, mean, Sigma):
   Sigma_inv = np.linalg.inv(Sigma)
   xdiff = d - mean
   return np.einsum('ij,im,mj->i', xdiff, xdiff, Sigma_inv)
def using_loop(d1, mean, Sigma):
    expected = []
    for r in d1:
        # Use the function's own arguments rather than the globals mean1/Sig1.
        expected.append(mahalanobis_sqdist(r[0:4], mean, Sigma))
    return np.array(expected)
d1 = np.random.random((25,4))
mean1 = np.array([ 5.028,  3.48 ,  1.46 ,  0.248])
Sig1 = np.cov(d1[0:25, 0:4].T)
expected = using_loop(d1, mean1, Sig1)
result = np.apply_along_axis(mahalanobis_sqdist, 1, d1, mean1, Sig1)
result2 = mahalanobis_sqdist2(d1, mean1, Sig1)
assert np.allclose(expected, result)
assert np.allclose(expected, result2)
In [92]: %timeit mahalanobis_sqdist2(d1, mean1, Sig1)
10000 loops, best of 3: 31.1 μs per loop
In [94]: %timeit using_loop(d1, mean1, Sig1)
1000 loops, best of 3: 569 μs per loop
In [91]: %timeit np.apply_along_axis(mahalanobis_sqdist, 1, d1, mean1, Sig1)
1000 loops, best of 3: 806 μs per loop
Thus mahalanobis_sqdist2 is about 18x faster than a for-loop, and 26x faster than using np.apply_along_axis.
Note that np.apply_along_axis, np.vectorize, and np.frompyfunc are Python utility functions. Under the hood they use for- or while-loops. There is no real "vectorization" going on here. They can provide syntactic assistance, but don't expect them to make your code perform any better than a for-loop you write yourself.
Answer by abacabadabacaba
The problem is that np.vectorize vectorizes over all arguments, but you need to vectorize only over the first one. You need to pass the excluded keyword argument to np.vectorize:
np.vectorize(mahalanobis_sqdist, excluded=[1, 2])
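One caveat worth checking: by default np.vectorize calls the wrapped function once per scalar element, so excluding mean and Sigma on its own appears to yield one value per element of d1 rather than one per row. In NumPy 1.12 and later, the signature argument lets the function consume whole rows. The sketch below uses random stand-in data (not the question's d1) and is still a Python-level loop, so it is a convenience rather than a speed-up.

```python
import numpy as np

def mahalanobis_sqdist(x, mean, Sigma):
    Sigma_inv = np.linalg.inv(Sigma)
    xdiff = x - mean
    return np.dot(np.dot(xdiff, Sigma_inv), xdiff)

rng = np.random.default_rng(0)
d1 = rng.random((25, 4))            # illustrative stand-in for the question's data
mean1 = d1.mean(axis=0)
Sig1 = np.cov(d1.T)

# '(n),(n),(n,n)->()': each call receives one length-n row, one length-n
# mean and one n-by-n matrix, and returns a scalar.
vecdist = np.vectorize(mahalanobis_sqdist, signature='(n),(n),(n,n)->()')
result = vecdist(d1, mean1, Sig1)   # one distance per row

expected = np.array([mahalanobis_sqdist(r, mean1, Sig1) for r in d1])
assert np.allclose(result, expected)
```

Because mean1 and Sig1 already match the core dimensions, they are broadcast unchanged across all 25 rows.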
Answer by IanH
The answer by @unutbu works very nicely for applying any function to the rows of an array. In this particular case, there are some mathematical symmetries you can use that will speed things up considerably if you are working with large arrays.
Here is a modified version of your function:
def mahalanobis_sqdist3(x, mean, Sigma):
    Sigma_inv = np.linalg.inv(Sigma)
    xdiff = x - mean
    return (xdiff.dot(Sigma_inv)*xdiff).sum(axis=-1)
If you end up using any sort of large Sigma, I would recommend that you cache Sigma_inv and pass that in as an argument to your function instead.
Since it is 4x4 in this example, this doesn't matter.
I'll show how to deal with large Sigma anyway for anyone else who comes across this.
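The caching suggestion above could look roughly like the following. This is only a sketch: the function name mahalanobis_sqdist_cached and the batch split are hypothetical, introduced here for illustration.

```python
import numpy as np

def mahalanobis_sqdist_cached(x, mean, Sigma_inv):
    # Hypothetical variant: expects the already-inverted covariance matrix.
    xdiff = x - mean
    return (xdiff.dot(Sigma_inv) * xdiff).sum(axis=-1)

rng = np.random.default_rng(0)
data = rng.random((100, 4))                 # illustrative data
mean = data.mean(axis=0)
Sigma_inv = np.linalg.inv(np.cov(data.T))   # invert once, up front

# Reuse the same inverse for several batches of rows.
batches = np.array_split(data, 4)
dists = np.concatenate(
    [mahalanobis_sqdist_cached(b, mean, Sigma_inv) for b in batches]
)
```

The inversion cost is then paid once, however many times the distance function is called.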
If you aren't going to be using the same Sigma repeatedly, you won't be able to cache it, so, instead of inverting the matrix, you could use a different method to solve the linear system.
Here I'll use the LU decomposition built in to SciPy.
This only improves the time if the number of columns of x is large relative to its number of rows.
Here is a function that shows that approach:
from scipy.linalg import lu_factor, lu_solve
def mahalanobis_sqdist4(x, mean, Sigma):
    xdiff = x - mean
    Sigma_inv = lu_factor(Sigma)
    return (xdiff.T*lu_solve(Sigma_inv, xdiff.T)).sum(axis=0)
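As a sanity check, the LU route should agree with the explicit inverse up to floating-point error. A small sketch on illustrative random data (not the arrays from the question):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
x = rng.random((25, 4))             # illustrative data
mean = x.mean(axis=0)
Sigma = np.cov(x.T)
xdiff = x - mean

# Explicit inverse, as in mahalanobis_sqdist3.
via_inv = (xdiff.dot(np.linalg.inv(Sigma)) * xdiff).sum(axis=-1)

# LU factorization followed by a solve, as in mahalanobis_sqdist4.
lu_piv = lu_factor(Sigma)           # (LU matrix, pivot indices)
via_lu = (xdiff.T * lu_solve(lu_piv, xdiff.T)).sum(axis=0)

assert np.allclose(via_inv, via_lu)
```

Note that lu_factor returns the factorization, not an inverse, so a name such as lu_piv may be less misleading than Sigma_inv for that variable.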
Here are some timings.
I'll include the version with einsum as mentioned in the other answer.
import numpy as np
Sig1 = np.array([[ 0.16043333,  0.11808333,  0.02408333,  0.01943333],
                 [ 0.11808333,  0.13583333,  0.00625   ,  0.02225   ],
                 [ 0.02408333,  0.00625   ,  0.03916667,  0.00658333],
                 [ 0.01943333,  0.02225   ,  0.00658333,  0.01093333]])
mean1 = np.array([ 5.028,  3.48 ,  1.46 ,  0.248])
x = np.random.rand(25, 4)
%timeit np.apply_along_axis(mahalanobis_sqdist, 1, x, mean1, Sig1)
%timeit mahalanobis_sqdist2(x, mean1, Sig1)
%timeit mahalanobis_sqdist3(x, mean1, Sig1)
%timeit mahalanobis_sqdist4(x, mean1, Sig1)
giving:
1000 loops, best of 3: 973 μs per loop
10000 loops, best of 3: 36.2 μs per loop
10000 loops, best of 3: 40.8 μs per loop
10000 loops, best of 3: 83.2 μs per loop
However, changing the sizes of the arrays involved changes the timing results.
For example, letting x = np.random.rand(2500, 4), the timings are:
10 loops, best of 3: 95 ms per loop
1000 loops, best of 3: 355 μs per loop
10000 loops, best of 3: 131 μs per loop
1000 loops, best of 3: 337 μs per loop
And letting x = np.random.rand(1000, 1000), Sigma1 = np.random.rand(1000, 1000), and mean1 = np.random.rand(1000), the timings are:
1 loops, best of 3: 1min 24s per loop
1 loops, best of 3: 2.39 s per loop
10 loops, best of 3: 155 ms per loop
10 loops, best of 3: 99.9 ms per loop
Edit: I noticed that one of the other answers used the Cholesky decomposition.
Given that Sigma is symmetric and positive definite, we can actually do better than my above results.
There are some good routines from BLAS and LAPACK available through SciPy that can work with symmetric positive-definite matrices.
Here are two faster versions.
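from scipy.linalg.blas import dsymm  # was scipy.linalg.fblas in old SciPy releases
def mahalanobis_sqdist5(x, mean, Sigma):
    xdiff = x - mean
    # This version still inverts Sigma; pre-compute the inverse if it is reused.
    Sigma_inv = np.linalg.inv(Sigma)
    return np.einsum('...i,...i->...',dsymm(1., Sigma_inv, xdiff.T).T, xdiff)
from scipy.linalg.lapack import dposv  # was scipy.linalg.flapack in old SciPy releases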
def mahalanobis_sqdist6(x, mean, Sigma):
    xdiff = x - mean
    return np.einsum('...i,...i->...', xdiff, dposv(Sigma, xdiff.T)[1].T)
The first one still inverts Sigma.
If you pre-compute the inverse and reuse it, it is much faster (the 1000x1000 case takes 35.6ms on my machine with the pre-computed inverse).
I also used einsum to take the product then sum along the last axis.
This ended up being marginally faster than doing something like (A * B).sum(axis=-1).
These two functions give the following timings:
First test case:
10000 loops, best of 3: 55.3 μs per loop
100000 loops, best of 3: 14.2 μs per loop
Second test case:
10000 loops, best of 3: 121 μs per loop
10000 loops, best of 3: 79 μs per loop
Third test case:
10 loops, best of 3: 92.5 ms per loop
10 loops, best of 3: 48.2 ms per loop
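For readers who would rather not call the LAPACK wrappers directly, SciPy also exposes a Cholesky factor-and-solve pair through scipy.linalg.cho_factor and cho_solve. The sketch below is an alternative to the dposv version, not the answer's own code; the function name is hypothetical, it has not been benchmarked here, and it assumes Sigma is symmetric positive definite.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def mahalanobis_sqdist_cho(x, mean, Sigma):
    # Hypothetical name; assumes Sigma is symmetric positive definite.
    xdiff = x - mean
    factor = cho_factor(Sigma)               # Cholesky factorization of Sigma
    solved = cho_solve(factor, xdiff.T).T    # row i holds Sigma^{-1} (x_i - mean)
    # Multiply elementwise and sum along the last axis.
    return np.einsum('...i,...i->...', xdiff, solved)

rng = np.random.default_rng(0)
x = rng.random((25, 4))                      # illustrative data
mean = x.mean(axis=0)
Sigma = np.cov(x.T)

dists = mahalanobis_sqdist_cho(x, mean, Sigma)
# Same reduction written as (A * B).sum(axis=-1), using the explicit inverse.
expected = ((x - mean).dot(np.linalg.inv(Sigma)) * (x - mean)).sum(axis=-1)
assert np.allclose(dists, expected)
```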
Answer by IanH
Just saw a really nice comment on reddit that might speed things up even a little more:
This is not surprising to anyone who uses numpy regularly. For loops in python are horribly slow. Actually, einsum is pretty slow too. Here's a version that is faster if you have lots of vectors (500 vectors in 4 dimensions is enough to make this version faster than einsum on my machine):
def no_einsum(d, mean, Sigma):
    L_inv = np.linalg.inv(np.linalg.cholesky(Sigma))
    xdiff = d - mean
    return np.sum(np.dot(xdiff, L_inv.T)**2, axis=1)
If your points are also high dimensional then computing the inverse is slow (and generally a bad idea anyway) and you can save time by solving the system directly (500 vectors in 250 dimensions is enough to make this version the fastest on my machine):
def no_einsum_solve(d, mean, Sigma):
    L = np.linalg.cholesky(Sigma)
    xdiff = d - mean
    return np.sum(np.linalg.solve(L, xdiff.T)**2, axis=0)
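Both versions rely on the identity that, with Sigma = L L^T, the squared distance equals the squared norm of L^{-1}(x - mean). Since L is lower triangular, one possible refinement is scipy.linalg.solve_triangular, which uses that structure instead of a general solve. This is a sketch under the assumption that Sigma is symmetric positive definite; the function name is hypothetical and no timings are claimed.

```python
import numpy as np
from scipy.linalg import solve_triangular

def no_einsum_solve_tri(d, mean, Sigma):
    # Hypothetical variant of no_einsum_solve; assumes Sigma is SPD.
    L = np.linalg.cholesky(Sigma)                  # Sigma = L @ L.T, L lower triangular
    xdiff = d - mean
    z = solve_triangular(L, xdiff.T, lower=True)   # solves L @ z = xdiff.T
    return np.sum(z**2, axis=0)                    # squared norm of each column

rng = np.random.default_rng(0)
d = rng.random((50, 6))                            # illustrative data
mean = d.mean(axis=0)
Sigma = np.cov(d.T)

dists = no_einsum_solve_tri(d, mean, Sigma)
Sigma_inv = np.linalg.inv(Sigma)
expected = np.array([(r - mean) @ Sigma_inv @ (r - mean) for r in d])
assert np.allclose(dists, expected)
```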

