np.mean() vs np.average() in Python NumPy?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/20054243/
Asked by Sibbs Gambling
I notice that
In [30]: np.mean([1, 2, 3])
Out[30]: 2.0
In [31]: np.average([1, 2, 3])
Out[31]: 2.0
However, there should be some differences, since after all they are two different functions.
What are the differences between them?
Accepted answer by Hammer
np.average takes an optional weights parameter. If it is not supplied, the two functions are equivalent. Take a look at the source code: Mean, Average
np.mean:
try:
    mean = a.mean
except AttributeError:
    return _wrapit(a, 'mean', axis, dtype, out)
return mean(axis, dtype, out)
np.average:
...
if weights is None:
    avg = a.mean(axis)
    scl = avg.dtype.type(a.size/avg.size)
else:
    # code that does weighted mean here

if returned:  # returned is another optional argument
    scl = np.multiply(avg, 0) + scl
    return avg, scl
else:
    return avg
...
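As a quick check of the behaviour described in this answer, here is a minimal sketch comparing the two functions when no weights are given, and showing what the returned flag adds. The sample array and variable names are made up for illustration.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Without weights, np.average falls back to the plain mean.
unweighted_mean = np.mean(data, axis=0)
unweighted_avg = np.average(data, axis=0)
print(np.array_equal(unweighted_mean, unweighted_avg))

# With returned=True, np.average also returns the sum of weights.
# With no weights given, that is the number of elements averaged per result.
avg, weight_sum = np.average(data, axis=0, returned=True)
print(avg, weight_sum)
```

Here each column is averaged over two rows, so the second return value is 2 for every column.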
Answer by Prashant Kumar
Answer by Amber
np.mean always computes an arithmetic mean, and has some additional options for input and output (e.g. what datatypes to use, where to place the result).
np.average can compute a weighted average if the weights parameter is supplied.
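To make the two points above concrete, here is a small sketch, assuming a toy integer array: np.mean is called with the dtype and out options, and np.average is called with per-column weights. The array contents and weights are invented for the example.

```python
import numpy as np

scores = np.array([[0, 1, 2],
                   [3, 4, 5]])

# np.mean: pick the computation dtype and write into a preallocated array.
column_means = np.empty(3, dtype=np.float32)
np.mean(scores, axis=0, dtype=np.float32, out=column_means)

# np.average: weight each column differently when averaging along axis 1.
column_weights = [1, 2, 3]
weighted_rows = np.average(scores, axis=1, weights=column_weights)

print(column_means)
print(weighted_rows)
```

The weights list must have one entry per element along the averaged axis, which is why it has length 3 here.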
Answer by G M
In some versions of numpy there is another important difference that you must be aware of:
average does not take masks into account, so it computes the average over the whole set of data.
mean takes masks into account, so it computes the mean only over unmasked values.
g = [1,2,3,55,66,77]
f = np.ma.masked_greater(g,5)
np.average(f)
Out: 34.0
np.mean(f)
Out: 2.0
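Since the result of np.average on a masked array depends on the numpy version, one way to avoid the ambiguity is to say explicitly whether the mask should count. The sketch below reuses the same data and relies on np.ma.average and the MaskedArray mean method, both of which respect the mask, and on the .data attribute to ignore it deliberately.

```python
import numpy as np

g = [1, 2, 3, 55, 66, 77]
f = np.ma.masked_greater(g, 5)   # masks 55, 66 and 77

masked_avg = np.ma.average(f)    # mask-aware average
masked_mean = f.mean()           # mask-aware mean (MaskedArray method)
raw_mean = np.mean(f.data)       # underlying data, mask ignored on purpose

print(masked_avg, masked_mean, raw_mean)
```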
Answer by Grant Petty
In addition to the differences already noted, there's another extremely important difference that I just now discovered the hard way: unlike np.mean, np.average doesn't allow the dtype keyword, which is essential for getting correct results in some cases. I have a very large single-precision array that is accessed from an h5 file. If I take the mean along axes 0 and 1, I get wildly incorrect results unless I specify dtype='float64':
>T.shape
(4096, 4096, 720)
>T.dtype
dtype('<f4')
m1 = np.average(T, axis=(0,1)) # garbage
m2 = np.mean(T, axis=(0,1)) # the same garbage
m3 = np.mean(T, axis=(0,1), dtype='float64') # correct results
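The effect can be explored on a much smaller array. This is only a sketch: T_small is a made-up stand-in for the real data, and how large the float32 error turns out depends on the numpy version and on the memory layout of the axes being reduced, so no specific numbers are claimed here.

```python
import numpy as np

# Stand-in for the large single-precision array; every element is float32(0.1).
T_small = np.full((1_000_000, 2), 0.1, dtype=np.float32)
exact = float(np.float32(0.1))  # the true mean of the stored values

m32 = np.mean(T_small, axis=0)                    # accumulates in float32
m64 = np.mean(T_small, axis=0, dtype=np.float64)  # accumulates in float64

print("float32 accumulator error:", np.abs(m32.astype(np.float64) - exact).max())
print("float64 accumulator error:", np.abs(m64 - exact).max())
```

Comparing the two printed errors on your own data is a cheap way to see whether the default accumulator is precise enough.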
Unfortunately, unless you know what to look for, you can't necessarily tell your results are wrong. I will never use np.average again for this reason but will always use np.mean(.., dtype='float64') on any large array. If I want a weighted average, I'll compute it explicitly using the product of the weight vector and the target array and then either np.sum or np.mean, as appropriate (with appropriate precision as well).
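One possible way to write the explicit weighted average described above is sketched here. The helper name weighted_mean_f64 is hypothetical (it is not a numpy function), and the sketch assumes the weights can be broadcast to the shape of the values.

```python
import numpy as np

def weighted_mean_f64(values, weights, axis=None):
    """Weighted mean with products and sums carried out in float64."""
    values = np.asarray(values)
    weights = np.broadcast_to(np.asarray(weights), values.shape)
    # dtype=np.float64 makes the ufunc compute the products in double precision.
    weighted_sum = np.multiply(values, weights, dtype=np.float64).sum(axis=axis)
    weight_total = weights.sum(axis=axis, dtype=np.float64)
    return weighted_sum / weight_total

values = np.array([1, 2, 3], dtype=np.float32)
weights = np.array([3, 1, 0], dtype=np.float32)
result = weighted_mean_f64(values, weights)
print(result)
```

On small inputs like this one the result should agree with np.average(values, weights=weights); the difference only matters when single-precision accumulation over many elements loses accuracy.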

