What is the difference between numpy var() and statistics variance() in python?
Disclaimer: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must license it the same way and attribute it to the original authors (not me): StackOverFlow
Original link: http://stackoverflow.com/questions/41204400/
Asked by Michail Michailidis
I was trying a Dataquest exercise and figured out that the variance I am getting is different between the two packages.
e.g. for [1,2,3,4]
from statistics import variance
import numpy as np
print(np.var([1,2,3,4]))
print(variance([1,2,3,4]))
# 1.25
# 1.6666666666666667
The expected answer of the exercise is calculated with np.var()
Edit: I guess it has to do with the fact that the latter is the sample variance rather than the population variance. Could anyone explain the difference?
Answered by FallAndLearn
Use this:
print(np.var([1,2,3,4], ddof=1))
# 1.66666666667
Delta Degrees of Freedom: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default, ddof is zero.
The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead.
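As a quick sanity check, here is a minimal sketch (the variable names are my own) that reproduces both results directly from the divisor formula:

import numpy as np

x = np.array([1, 2, 3, 4])
sq_dev = ((x - x.mean()) ** 2).sum()  # sum of squared deviations from the mean: 5.0
print(sq_dev / len(x))        # divisor N (ddof=0): 1.25, matches np.var(x)
print(sq_dev / (len(x) - 1))  # divisor N - 1 (ddof=1): 1.666..., matches np.var(x, ddof=1)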
In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.
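To illustrate the bias, here is a simulation sketch I am adding (the seed, sample size, and normal population are arbitrary choices). Averaging each estimator over many small samples shows ddof=1 centering on the true variance, while ddof=0 underestimates it by a factor of (N-1)/N:

import numpy as np

rng = np.random.default_rng(0)
samples = rng.standard_normal((100_000, 5))  # many samples of size N=5, true variance 1.0
print(samples.var(axis=1, ddof=1).mean())  # ~1.0  (unbiased)
print(samples.var(axis=1, ddof=0).mean())  # ~0.8  (biased low by (N-1)/N = 4/5)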
Statistical libraries like numpy use the population formula (divisor N) by default for what they call var or variance, and likewise for the standard deviation.
For more information, refer to this documentation: numpy doc
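For completeness, the statistics module also offers pvariance() for the population (divisor N) convention, so the two libraries can be made to agree in either direction:

import numpy as np
from statistics import pvariance, variance

data = [1, 2, 3, 4]
print(pvariance(data), np.var(data))         # 1.25 1.25  (population variance, divisor N)
print(variance(data), np.var(data, ddof=1))  # 1.6666666666666667 1.6666666666666667  (sample variance, divisor N - 1)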
Answered by Andrew Cameron Morris
It is correct that dividing by N-1 gives an unbiased estimate of the variance, which can give the impression that dividing by N-1 is therefore slightly more accurate, albeit a little more complex. What is too often not stated is that dividing by N gives the maximum likelihood estimate of the variance, which is likely to be closer to the true variance than the unbiased estimate (it has lower mean squared error), as well as being somewhat simpler.
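This claim can be checked numerically. Here is a simulation sketch I am adding (normal data, arbitrary seed and sample size) comparing the mean squared error of the two divisors:

import numpy as np

rng = np.random.default_rng(1)
samples = rng.standard_normal((200_000, 5))  # samples of size N=5, true variance 1.0
for ddof in (0, 1):
    est = samples.var(axis=1, ddof=ddof)
    print(ddof, ((est - 1.0) ** 2).mean())  # MSE: ~0.36 for ddof=0, ~0.5 for ddof=1

For small samples, the divisor-N estimator trades a little bias for a larger reduction in variance, so its mean squared error comes out lower.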