Python: tf.multiply vs tf.matmul to calculate the dot product

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/47583501/

Date: 2020-08-19 18:16:15  Source: igfitidea

tf.multiply vs tf.matmul to calculate the dot product

python, tensorflow

Asked by Abrar

I have a matrix (of vectors) X with shape [3,4], and I want to calculate the dot product between each pair of vectors: (X[1].X[1]), (X[1].X[2]), etc.

I saw some cosine similarity code where they use

tf.reduce_sum(tf.multiply(X, X), axis=1)

to calculate the dot product between the vectors in a matrix of vectors. However, this only calculates the dot product between (X[i], X[i]).

I used tf.matmul(X, X, transpose_b=True), which calculates the dot product between every two vectors, but I am still confused about why tf.multiply didn't do this; I think the problem is with my code.

The code is:

import tensorflow as tf  # TensorFlow 1.x API (tf.Session)

data = [[1.0, 2.0, 4.0, 5.0], [0.0, 6.0, 7.0, 8.0], [8.0, 1.0, 1.0, 1.0]]
X = tf.constant(data)
matResult = tf.matmul(X, X, transpose_b=True)              # all pairwise dot products
multiplyResult = tf.reduce_sum(tf.multiply(X, X), axis=1)  # only X[i].X[i]

with tf.Session() as sess:
    print('matResult')
    print(sess.run([matResult]))
    print()
    print('multiplyResult')
    print(sess.run([multiplyResult]))

The output is:


matResult
[array([[  46.,   80.,   19.],
       [  80.,  149.,   21.],
       [  19.,   21.,   67.]], dtype=float32)]

multiplyResult
 [array([  46.,  149.,   67.], dtype=float32)]

I would appreciate any advice.

Answered by patapouf_ai

tf.multiply(X, Y) does element-wise multiplication, so that

[[1 2]    [[1 3]      [[1 6]
 [3 4]] .  [2 1]]  =   [6 4]]

whereas tf.matmul does matrix multiplication, so that

[[1 0]    [[1 3]      [[1 3]
 [0 1]] .  [2 1]]  =   [2 1]]

Using tf.matmul(X, X, transpose_b=True) means that you are calculating X . X^T, where ^T indicates the transpose of the matrix and . is matrix multiplication.

tf.reduce_sum(_, axis=1) takes the sum along the 1st axis (counting starts at 0), which means you are summing the rows:

tf.reduce_sum([[a, b], [c, d]], axis=1) = [a+b, c+d]

This means that:


tf.reduce_sum(tf.multiply(X, X), axis=1) = [X[1].X[1], ..., X[n].X[n]]

so that is the one you want if you only want the (squared) norm of each row. On the other hand,

 tf.matmul(X, X, transpose_b=True) = [[X[1].X[1], X[1].X[2], ..., X[1].X[n]],
                                       [X[2].X[1], ...,        X[2].X[n]],
                                       ...
                                       [X[n].X[1], ...,        X[n].X[n]]]

so that is what you need if you want the similarity between all pairs of rows.

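Both results can be checked concretely on the asker's data. A minimal sketch, using NumPy for illustration (its multiply/matmul semantics match TensorFlow's for 2-D tensors, and it avoids the TF1 Session boilerplate):

```python
import numpy as np

# The asker's data
X = np.array([[1., 2., 4., 5.],
              [0., 6., 7., 8.],
              [8., 1., 1., 1.]], dtype=np.float32)

# Element-wise square, then sum each row: only the self products X[i].X[i]
row_dots = np.sum(np.multiply(X, X), axis=1)
print(row_dots)               # [ 46. 149.  67.]

# Matrix product with the transpose: all pairwise dot products X[i].X[j]
gram = np.matmul(X, X.T)
print(gram)
# [[ 46.  80.  19.]
#  [ 80. 149.  21.]
#  [ 19.  21.  67.]]

# The reduce_sum result is exactly the diagonal of the matmul result
assert np.allclose(row_dots, np.diag(gram))
```

This reproduces both outputs from the question and makes the relationship explicit: the multiply-then-sum values sit on the diagonal of the matmul result.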

Answered by Ben Usman

What tf.multiply(X, X) does is essentially multiply each element of the matrix by itself, so

[[1 2]
 [3 4]]

would turn into


[[1 4]
 [9 16]]

whereas tf.reduce_sum(_, axis=1) takes the sum of each row, so the result for the previous example will be

[5 25]

which is exactly (by definition) equal to [X[0, :] @ X[0, :], X[1, :] @ X[1, :]].


Just write it down with variable names [[a, b], [c, d]] instead of actual numbers and look at what tf.matmul(X, X) and tf.multiply(X, X) do.
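The numeric version of that walk-through can be verified with a small sketch (NumPy again, purely for illustration; the semantics are the same as the TensorFlow ops):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

squared = np.multiply(A, A)          # each element times itself
print(squared)
# [[ 1  4]
#  [ 9 16]]

row_sums = np.sum(squared, axis=1)   # sum along each row
print(row_sums)                      # [ 5 25]

# Identical to the row-by-row dot products A[0] @ A[0] and A[1] @ A[1]
dots = np.array([A[0] @ A[0], A[1] @ A[1]])
assert np.array_equal(row_sums, dots)
```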

Answered by Seenivasan

In short, tf.multiply() does an element-wise (Hadamard) product, whereas tf.matmul() does actual matrix multiplication. So tf.multiply() needs arguments of the same shape, so that an element-wise product is possible, i.e. shapes (n, m) and (n, m). But tf.matmul() needs arguments of shapes (n, m) and (m, p), so that the resulting matrix is (n, p) [the usual math].

Once understood, this can easily be applied to multi-dimensional matrices.
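Those shape rules are easy to verify directly. A sketch in NumPy, whose 2-D shape rules for multiply/matmul match TensorFlow's:

```python
import numpy as np

A = np.ones((2, 3))   # (n, m) = (2, 3)
B = np.ones((3, 4))   # (m, p) = (3, 4)

# matmul: (n, m) x (m, p) -> (n, p)
print(np.matmul(A, B).shape)    # (2, 4)

# multiply: element-wise, so shapes must match (or be broadcastable)
print(np.multiply(A, A).shape)  # (2, 3)

try:
    np.multiply(A, B)           # (2, 3) vs (3, 4): no element-wise match
except ValueError as err:
    print('multiply failed:', err)
```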