Python: Difference between np.dot and np.multiply with np.sum in binary cross-entropy loss calculation

Notice: This page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/48201729/

Time: 2020-08-19 18:35:54  Source: igfitidea

Tags: python, numpy, neural-network, sum, difference

Asked by Asad Shakeel

I tried the following code but could not work out the difference between np.dot and np.multiply combined with np.sum.

Here is the np.dot code:

logprobs = np.dot(Y, (np.log(A2)).T) + np.dot((1.0-Y),(np.log(1 - A2)).T)
print(logprobs.shape)
print(logprobs)
cost = (-1/m) * logprobs
print(cost.shape)
print(type(cost))
print(cost)

Its output is:

(1, 1)
[[-2.07917628]]
(1, 1)
<class 'numpy.ndarray'>
[[ 0.693058761039 ]]

Here is the code for np.multiply with np.sum:

logprobs = np.sum(np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2)))
print(logprobs.shape)         
print(logprobs)
cost = - logprobs / m
print(cost.shape)
print(type(cost))
print(cost)

Its output is:

()
-2.07917628312
()
<class 'numpy.float64'>
0.693058761039

I can't understand why the type and shape differ when the resulting value is the same in both cases.

Even after squeezing the cost from the first version, its value prints the same as in the second version, but its type remains numpy.ndarray:

cost = np.squeeze(cost)
print(type(cost))
print(cost)

The output is:

<class 'numpy.ndarray'>
0.6930587610394646

Accepted answer by kmario23

What you're doing is calculating the binary cross-entropy loss, which measures how bad the predictions of the model (here: A2) are when compared to the true outputs (here: Y).

Here is a reproducible example for your case, which should explain why you get a scalar in the second case using np.sum:

In [88]: Y = np.array([[1, 0, 1, 1, 0, 1, 0, 0]])

In [89]: A2 = np.array([[0.8, 0.2, 0.95, 0.92, 0.01, 0.93, 0.1, 0.02]])

In [90]: logprobs = np.dot(Y, (np.log(A2)).T) + np.dot((1.0-Y),(np.log(1 - A2)).T)

# `np.dot` returns 2D array since its arguments are 2D arrays
In [91]: logprobs
Out[91]: array([[-0.78914626]])

In [92]: cost = (-1/m) * logprobs  # m is the number of examples, here Y.shape[1] = 8

In [93]: cost
Out[93]: array([[ 0.09864328]])

In [94]: logprobs = np.sum(np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2)))

# np.sum returns scalar since it sums everything in the 2D array
In [95]: logprobs
Out[95]: -0.78914625761870361

Note that np.dot sums only along the inner dimensions, which match here: (1x8) and (8x1). The 8s therefore disappear during the dot product (matrix multiplication), yielding a (1x1) result, which is just a scalar but is returned as a 2D array of shape (1,1).
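
The shape behaviour described here can be checked with a short script. This is a minimal sketch that reuses the Y and A2 values from the example; m is assumed to be the number of examples (Y.shape[1]), since the snippets use it without defining it.

```python
import numpy as np

Y = np.array([[1, 0, 1, 1, 0, 1, 0, 0]])
A2 = np.array([[0.8, 0.2, 0.95, 0.92, 0.01, 0.93, 0.1, 0.02]])
m = Y.shape[1]  # assumption: m is the number of examples (8 here)

# Matrix product of (1, 8) with (8, 1): the result keeps two dimensions.
logprobs_dot = np.dot(Y, np.log(A2).T) + np.dot(1.0 - Y, np.log(1 - A2).T)

# Element-wise product, then a sum over every element: no dimensions remain.
logprobs_sum = np.sum(np.multiply(np.log(A2), Y) + np.multiply(1 - Y, np.log(1 - A2)))

cost_dot = -logprobs_dot / m
cost_sum = -logprobs_sum / m

print(logprobs_dot.shape, type(cost_dot))       # (1, 1) <class 'numpy.ndarray'>
print(np.shape(logprobs_sum), type(cost_sum))   # () <class 'numpy.float64'>
print(np.isclose(cost_dot[0, 0], cost_sum))     # True
```

Both versions compute the same number; only the container differs, because np.dot preserves the outer dimensions of its 2D inputs while np.sum collapses all of them.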



Also, most importantly, note that np.dot here is exactly the same as np.matmul, since the inputs are 2D arrays (i.e. matrices):

In [107]: logprobs = np.matmul(Y, (np.log(A2)).T) + np.matmul((1.0-Y),(np.log(1 - A2)).T)

In [108]: logprobs
Out[108]: array([[-0.78914626]])

In [109]: logprobs.shape
Out[109]: (1, 1)


Returning the result as a scalar value

np.dot or np.matmul returns whatever array shape results from the input arrays. Even with the out= argument, it's not possible to return a scalar if the inputs are 2D arrays. However, we can use np.asscalar() on the result to convert it to a scalar if the result array has shape (1,1) (or, more generally, is a scalar value wrapped in an nD array). Note that np.asscalar() was deprecated in NumPy 1.16; ndarray.item() does the same job.

In [123]: np.asscalar(logprobs)
Out[123]: -0.7891462576187036

In [124]: type(np.asscalar(logprobs))
Out[124]: float


Converting an ndarray of size 1 to a scalar value

In [127]: np.asscalar(np.array([[[23.2]]]))
Out[127]: 23.2

In [128]: np.asscalar(np.array([[[[23.2]]]]))
Out[128]: 23.2
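
On NumPy versions where np.asscalar is unavailable, ndarray.item() performs the same conversion. A minimal sketch, with the array values copied from the session above for illustration:

```python
import numpy as np

logprobs = np.array([[-0.7891462576187036]])  # shape (1, 1), as returned by np.dot

value = logprobs.item()   # extracts the single element as a Python scalar
print(type(value))        # <class 'float'>
print(value)              # -0.7891462576187036

# Works regardless of how many size-1 dimensions wrap the value.
nested = np.array([[[[23.2]]]])
print(nested.item())      # 23.2
```

item() raises an error if the array holds more than one element, so it also acts as a check that the result really is a single value.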

Answer by Anuj Gautam

np.dot is the dot product of two matrices.

|A B| . |E F| = |A*E+B*G A*F+B*H|
|C D|   |G H|   |C*E+D*G C*F+D*H|

Whereas np.multiply does an element-wise multiplication of two matrices.

|A B| ⊙ |E F| = |A*E B*F|
|C D|   |G H|   |C*G D*H|

When used with np.sum, the results being equal is merely a coincidence:

>>> np.dot([[1,2], [3,4]], [[1,2], [2,3]])
array([[ 5,  8],
       [11, 18]])
>>> np.multiply([[1,2], [3,4]], [[1,2], [2,3]])
array([[ 1,  4],
       [ 6, 12]])

>>> np.sum(np.dot([[1,2], [3,4]], [[1,2], [2,3]]))
42
>>> np.sum(np.multiply([[1,2], [3,4]], [[1,2], [2,3]]))
23

Answer by hpaulj

If Y and A2 are (1,N) arrays, then np.dot(Y,A2.T) will produce a (1,1) result. It is doing a matrix multiplication of a (1,N) with an (N,1). The N's are summed, leaving the (1,1).

With multiply, the result is (1,N). Sum all the values, and the result is a scalar.

If Y and A2 were (N,) shaped (same number of elements, but 1d), then np.dot(Y,A2) (no .T) would also produce a scalar. From the np.dot documentation:

For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors

Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned.
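
A brief sketch of the 1-D case, using illustrative values that are not from the question: when both arguments are 1-D, np.dot returns a scalar rather than an array, so no transpose or squeeze is needed.

```python
import numpy as np

Y = np.array([1, 0, 1, 1])                       # shape (4,)
logA = np.log(np.array([0.8, 0.2, 0.9, 0.7]))    # shape (4,)

inner = np.dot(Y, logA)                # inner product of two 1-D arrays
print(np.shape(inner))                 # ()
print(isinstance(inner, np.ndarray))   # False: a NumPy scalar, not an array

# The same data as (1, 4) rows needs a transpose and yields a (1, 1) array.
shape_2d = np.dot(Y[np.newaxis, :], logA[np.newaxis, :].T).shape
print(shape_2d)                        # (1, 1)
```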

squeeze removes all size 1 dimensions, but still returns an array. In numpy an array can have any number of dimensions (from 0 up to a fixed maximum, which was 32 when this was written). So a 0d array is possible. Compare the shapes of np.array(3), np.array([3]) and np.array([[3]]).
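
The comparison suggested here can be run directly. This is a small sketch; the cost value is copied from the question's output for illustration.

```python
import numpy as np

for a in (np.array(3), np.array([3]), np.array([[3]])):
    print(a.shape, a.ndim)   # prints "() 0", then "(1,) 1", then "(1, 1) 2"

cost = np.array([[0.693058761039]])   # shape (1, 1)
squeezed = np.squeeze(cost)

# squeeze drops the size-1 axes but the result is still an ndarray (0-d).
print(type(squeezed), squeezed.shape)  # <class 'numpy.ndarray'> ()

# item() converts the 0-d array to a plain Python float.
print(type(squeezed.item()))           # <class 'float'>
```

This is why the squeezed cost in the question prints like a plain number while type() still reports numpy.ndarray.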

Answer by Ashish S

In this example it is not just a coincidence. Take two (1,3) matrices as an example.

import numpy as np

x1 = np.array([[1, 2, 3]])  # first array, shape (1, 3)
x2 = np.array([[3, 4, 3]])  # second array, shape (1, 3)

# Element-wise multiplication followed by a sum:
# (1*3) + (2*4) + (3*3) = 20
X_Res = np.sum(np.multiply(x1, x2))

# To get a (1,1) matrix from the dot product of two (1,3) matrices,
# the second one must be transposed:
# |1 2 3| . |3|
#           |4| = |1*3 + 2*4 + 3*3| = |20|
#           |3|
Y_Res = np.dot(x1, x2.T)

print(X_Res)  # 20

print(Y_Res)  # [[20]]
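
One way to see why the two computations agree for row vectors but differ for general matrices: np.sum(np.multiply(A, B)) always equals the trace of np.dot(A, B.T), and for (1,N) inputs that product is (1,1), so its single entry is the trace. A short sketch with illustrative values:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise_sum = np.sum(np.multiply(A, B))   # 1*5 + 2*6 + 3*7 + 4*8 = 70
product = np.dot(A, B.T)                      # [[17, 23], [39, 53]]

print(elementwise_sum)       # 70
print(np.trace(product))     # 70: the diagonal of the product sums to the same value
print(np.sum(product))       # 132: summing the whole product gives something else

# For (1, N) rows the product is (1, 1), so its only entry is the diagonal.
x1 = np.array([[1, 2, 3]])
x2 = np.array([[3, 4, 3]])
print(np.dot(x1, x2.T))      # [[20]]
print(np.sum(x1 * x2))       # 20
```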