Difference between numpy dot() and Python 3.5+ matrix multiplication @

Question

Asked by blaz

I recently moved to Python 3.5 and noticed that the new matrix multiplication operator (@) sometimes behaves differently from the numpy dot operator. For example, for 3D arrays:

import numpy as np

a = np.random.rand(8,13,13)
b = np.random.rand(8,13,13)
c = a @ b  # Python 3.5+
d = np.dot(a, b)

The @ operator returns an array of shape:

c.shape
(8, 13, 13)

while the np.dot() function returns:

d.shape
(8, 13, 8, 13)

How can I reproduce the same result with numpy dot? Are there any other significant differences?


Answer 1

Accepted answer by Alex Riley

The @ operator calls the array's __matmul__ method, not dot. This method is also present in the API as the function np.matmul.

>>> a = np.random.rand(8,13,13)
>>> b = np.random.rand(8,13,13)
>>> np.matmul(a, b).shape
(8, 13, 13)

From the documentation:


matmul differs from dot in two important ways.
Multiplication by scalars is not allowed.
Stacks of matrices are broadcast together as if the matrices were elements.

The last point makes it clear that dot and matmul behave differently when passed 3D (or higher-dimensional) arrays. Quoting some more from the documentation:

For matmul:


If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.

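As a minimal sketch of the two differences quoted above, the snippet below multiplies two stacks whose leading axes broadcast, and then tries a scalar operand. The array shapes and variable names are arbitrary choices for illustration and do not come from the documentation.

```python
import numpy as np

# Shapes chosen only for illustration.
a = np.ones((8, 1, 3, 4))   # an 8x1 stack of 3x4 matrices
b = np.ones((5, 4, 2))      # a stack of five 4x2 matrices

# The leading "stack" axes (8, 1) and (5,) broadcast to (8, 5);
# the last two axes are multiplied as ordinary matrices.
stacked = np.matmul(a, b)
print(stacked.shape)        # (8, 5, 3, 2)

# Scalars are rejected by matmul but accepted by dot.
try:
    np.matmul(a, 2.0)
    scalar_allowed = True
except (ValueError, TypeError):  # TypeError is caught only as a precaution
    scalar_allowed = False
print(scalar_allowed)       # False

scaled = np.dot(a, 2.0)     # behaves like a * 2.0
print(scaled.shape)         # (8, 1, 3, 4)
```

For scaling by a scalar, plain `*` is the usual choice with either function.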

For np.dot:


For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors (without complex conjugation). For N dimensions it is a sum product over the last axis of a and the second-to-last of b

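To connect the two quotes with the shapes in the question, here is a hedged sketch of one way to recover the @ result from np.dot. The seed and the helper names (idx, from_dot, from_einsum) are illustrative assumptions, and np.einsum is offered as an alternative, not as what @ does internally.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily
a = rng.random((8, 13, 13))
b = rng.random((8, 13, 13))

c = a @ b            # shape (8, 13, 13)
d = np.dot(a, b)     # shape (8, 13, 8, 13): d[i, j, k, l] = sum_m a[i, j, m] * b[k, m, l]

# The matmul result is the "diagonal" i == k of the dot result.
idx = np.arange(8)
from_dot = d[idx, :, idx, :]     # shape (8, 13, 13)

# einsum gives the same values without building the large intermediate array.
from_einsum = np.einsum('ijm,iml->ijl', a, b)

print(np.allclose(c, from_dot), np.allclose(c, from_einsum))
```

Extracting the diagonal from np.dot computes many products that are then thrown away, so for stacked matrices np.matmul (or @) is the more direct tool.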

Answer 2

Answer by Nathan

The answer by @ajcr explains how dot and matmul (invoked by the @ symbol) differ. A simple example shows clearly how the two behave differently when operating on 'stacks of matrices' or tensors.

To clarify the differences, take a 4x4 array and compute its dot product and its matmul product with a 3x4x2 'stack of matrices' or tensor.

import numpy as np
fourbyfour = np.array([
                       [1,2,3,4],
                       [3,2,1,4],
                       [5,4,6,7],
                       [11,12,13,14]
                      ])


threebyfourbytwo = np.array([
                             [[2,3],[11,9],[32,21],[28,17]],
                             [[2,3],[1,9],[3,21],[28,7]],
                             [[2,3],[1,9],[3,21],[28,7]],
                            ])

print('4x4*3x4x2 dot:\n {}\n'.format(np.dot(fourbyfour, threebyfourbytwo)))
print('4x4*3x4x2 matmul:\n {}\n'.format(np.matmul(fourbyfour, threebyfourbytwo)))

The result of each operation appears below. Notice how the dot product is

...a sum product over the last axis of a and the second-to-last of b

and how the matrix product is formed by broadcasting the 4x4 matrix against each matrix in the stack.

4x4*3x4x2 dot:
 [[[232 152]
  [125 112]
  [125 112]]

 [[172 116]
  [123  76]
  [123  76]]

 [[442 296]
  [228 226]
  [228 226]]

 [[962 652]
  [465 512]
  [465 512]]]

4x4*3x4x2 matmul:
 [[[232 152]
  [172 116]
  [442 296]
  [962 652]]

 [[125 112]
  [123  76]
  [228 226]
  [465 512]]

 [[125 112]
  [123  76]
  [228 226]
  [465 512]]]
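The two printed results above contain the same numbers arranged along different axes. A short sketch, reusing the arrays from this answer (the names dot_result, matmul_result and same are introduced here for illustration), makes that relationship explicit:

```python
import numpy as np

fourbyfour = np.array([[1, 2, 3, 4],
                       [3, 2, 1, 4],
                       [5, 4, 6, 7],
                       [11, 12, 13, 14]])
threebyfourbytwo = np.array([[[2, 3], [11, 9], [32, 21], [28, 17]],
                             [[2, 3], [1, 9], [3, 21], [28, 7]],
                             [[2, 3], [1, 9], [3, 21], [28, 7]]])

dot_result = np.dot(fourbyfour, threebyfourbytwo)        # shape (4, 3, 2)
matmul_result = np.matmul(fourbyfour, threebyfourbytwo)  # shape (3, 4, 2)

# Swapping the first two axes of the dot result gives the matmul result.
same = np.array_equal(dot_result.transpose(1, 0, 2), matmul_result)
print(dot_result.shape, matmul_result.shape, same)       # (4, 3, 2) (3, 4, 2) True
```

In this case dot puts the rows of the 4x4 matrix on the first axis, while matmul keeps the stack axis first.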

Answer 3

Answer by Yong Yang

Mathematically, I think dot in numpy makes more sense:

dot(a,b)_{i,j,k,a,b,c} = \sum_m a_{i,j,k,m} b_{a,b,m,c}

Here the subscripts a, b, c on the left are index names, not the arrays. This definition gives the dot product when a and b are vectors, and the matrix multiplication when a and b are matrices.

As for the matmul operation in numpy, its result consists of part of the dot result, and it can be defined as

matmul(a,b)_{i,j,k,c} = \sum_m a_{i,j,k,m} b_{i,j,m,c}

So you can see that matmul(a,b) returns an array with a smaller shape, which consumes less memory and makes more sense in applications. In particular, combined with broadcasting, you can get, for example,

matmul(a,b)_{i,j,k,l} = \sum_m a_{i,j,k,m} b_{j,m,l}

From the above two definitions, you can see the requirements for using the two operations. Assume a.shape=(s1,s2,s3,s4) and b.shape=(t1,t2,t3,t4).

To use dot(a,b) you need
1. t3=s4
To use matmul(a,b) you need
1. t3=s4
2. t2=s2, or one of t2 and s2 is 1
3. t1=s1, or one of t1 and s1 is 1
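The shape requirements listed above can be probed directly. The sketch below uses a hypothetical helper, try_shape, written for this illustration (it is not part of NumPy), and arbitrary shapes that satisfy or violate the conditions.

```python
import numpy as np

def try_shape(func, a_shape, b_shape):
    """Return the result shape, or None if func rejects the operands."""
    try:
        return func(np.ones(a_shape), np.ones(b_shape)).shape
    except ValueError:
        return None

# Only the contracted axes match (s4 == t3 == 5); the leading axes do not broadcast.
print(try_shape(np.dot, (2, 3, 4, 5), (6, 7, 5, 8)))     # (2, 3, 4, 6, 7, 8)
print(try_shape(np.matmul, (2, 3, 4, 5), (6, 7, 5, 8)))  # None

# Leading axes are equal or 1, so matmul broadcasts them.
print(try_shape(np.matmul, (2, 1, 4, 5), (1, 7, 5, 8)))  # (2, 7, 4, 8)
```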

Use the following piece of code to convince yourself.


import numpy as np

for it in range(10000):
    a = np.random.rand(5, 6, 2, 4)
    b = np.random.rand(6, 4, 3)
    c = np.matmul(a, b)   # shape (5, 6, 2, 3)
    d = np.dot(a, b)      # shape (5, 6, 2, 6, 3)
    # print('c shape:', c.shape, 'd shape:', d.shape)

    for i in range(5):
        for j in range(6):
            for k in range(2):
                for l in range(3):
                    # Compare with a small tolerance, since the two routines may round differently.
                    if abs(c[i, j, k, l] - d[i, j, k, j, l]) > 1e-12:
                        print(it, i, j, k, l, c[i, j, k, l], d[i, j, k, j, l])  # you will not see any output
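The same check can be written without the nested loops. This is a sketch under the same shapes as above; the seed and the name d_diag are assumptions made for the example, and np.einsum is used here only to pick out the entries d[i, j, k, j, l].

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
a = rng.random((5, 6, 2, 4))
b = rng.random((6, 4, 3))

c = np.matmul(a, b)   # shape (5, 6, 2, 3)
d = np.dot(a, b)      # shape (5, 6, 2, 6, 3)

# Select d[i, j, k, j, l], i.e. the diagonal over the two axes indexed by j.
d_diag = np.einsum('ijkjl->ijkl', d)

print(c.shape, d.shape, np.allclose(c, d_diag))
```

np.allclose is used instead of exact equality because the two routines are not guaranteed to round identically.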

Answer 4

Answer by Nico Schlömer

Just FYI, @ and its numpy equivalents dot and matmul are all roughly equally fast. (Plot created with perfplot, a project of mine.)

Code to reproduce the plot:


import perfplot
import numpy


def setup(n):
    A = numpy.random.rand(n, n)
    x = numpy.random.rand(n)
    return A, x


def at(data):
    A, x = data
    return A @ x


def numpy_dot(data):
    A, x = data
    return numpy.dot(A, x)


def numpy_matmul(data):
    A, x = data
    return numpy.matmul(A, x)


perfplot.show(
    setup=setup,
    kernels=[at, numpy_dot, numpy_matmul],
    n_range=[2 ** k for k in range(12)],
    logx=True,
    logy=True,
)
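If perfplot is not installed, a rough comparison can be sketched with the standard-library timeit module instead. This is not the code behind the plot; the problem size, repeat counts and the names kernels and timings are arbitrary assumptions, and the numbers will vary by machine and BLAS build.

```python
import timeit
import numpy as np

n = 256                          # problem size, chosen arbitrarily
rng = np.random.default_rng(0)
A = rng.random((n, n))
x = rng.random(n)

kernels = {
    "A @ x": lambda: A @ x,
    "np.dot": lambda: np.dot(A, x),
    "np.matmul": lambda: np.matmul(A, x),
}

timings = {}
for name, kernel in kernels.items():
    # Best of 5 repeats of 1000 calls each, converted to seconds per call.
    timings[name] = min(timeit.repeat(kernel, number=1000, repeat=5)) / 1000
    print(f"{name:>10}: {timings[name] * 1e6:.2f} us per call")
```

A single size only gives one point of the curve; looping over several values of n would approximate what the perfplot script measures.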

Answer 5

Answer by Sambath Parthasarathy

My experience with np.matmul and np.dot

I kept getting "ValueError: Shape of passed values is (200, 1), indices imply (200, 3)" when trying to use np.matmul. I wanted a quick workaround and found that np.dot delivers the same functionality. I don't get any error using np.dot, and I get the correct answer.

With np.matmul:

X.shape
>>>(200, 3)

type(X)

>>>pandas.core.frame.DataFrame

w

>>>array([0.37454012, 0.95071431, 0.73199394])

YY = np.matmul(X,w)

>>>  ValueError: Shape of passed values is (200, 1), indices imply (200, 3)

With np.dot:

YY = np.dot(X,w)
# no error message
YY
>>>array([ 2.59206877,  1.06842193,  2.18533396,  2.11366346,  0.28505879, …

YY.shape

>>> (200, )
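The error message above looks like it comes from pandas rather than from NumPy itself, since X is a DataFrame. That is an assumption: the answer does not show how X was built, and the behavior may differ between pandas and NumPy versions. Under that assumption, one possible workaround is to convert the DataFrame to a plain array before calling np.matmul. The data and column names below are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)   # arbitrary seed; stand-in data, not the answer's X
X = pd.DataFrame(rng.random((200, 3)), columns=["x1", "x2", "x3"])  # hypothetical columns
w = np.array([0.37454012, 0.95071431, 0.73199394])

X_values = X.to_numpy()          # plain ndarray of shape (200, 3)
YY = np.matmul(X_values, w)      # 2-D times 1-D gives a 1-D result
print(type(X_values).__name__, YY.shape)   # ndarray (200,)
```

With plain arrays, np.matmul and np.dot agree for this 2-D by 1-D case.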
