Calculating Covariance with Python and Numpy

Question

Asked by Dave

I am trying to figure out how to calculate covariance with the Python Numpy function cov. When I pass it two one-dimensional arrays, I get back a 2x2 matrix of results. I don't know what to do with that. I'm not great at statistics, but I believe covariance in such a situation should be a single number. This is what I am looking for. I wrote my own:

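import numpy as np

def cov(a, b):
    # Covariance is only defined for sequences of equal length.
    if len(a) != len(b):
        return None

    a_mean = np.mean(a)
    b_mean = np.mean(b)

    # Accumulate the products of the paired deviations from each mean.
    # (Named total rather than sum to avoid shadowing the built-in.)
    total = 0

    for i in range(len(a)):
        total += (a[i] - a_mean) * (b[i] - b_mean)

    # Dividing by N - 1 gives the sample covariance.
    return total / (len(a) - 1)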

That works, but I expect the Numpy version is much more efficient, if I could figure out how to use it.

Does anybody know how to make the Numpy cov function behave like the one I wrote?

Thanks,

Dave

Answer 1

Accepted answer by unutbu

When a and b are 1-dimensional sequences, numpy.cov(a, b)[0][1] is equivalent to your cov(a, b).

The 2x2 array returned by np.cov(a, b) has elements equal to

cov(a,a)  cov(a,b)
cov(a,b)  cov(b,b)

(where, again, cov is the function you defined above.)
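
As a quick check of the layout described above, here is a minimal sketch. The arrays are made-up illustrative data, and the helper cov is a vectorised stand-in for the loop version in the question (an assumption, not the asker's exact code).

```python
import numpy as np

# Illustrative data, made up for this sketch.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 4.0, 6.0, 9.0])

def cov(x, y):
    """Sample covariance: same formula as the loop in the question, vectorised."""
    x_mean, y_mean = np.mean(x), np.mean(y)
    return np.sum((x - x_mean) * (y - y_mean)) / (len(x) - 1)

m = np.cov(a, b)  # 2x2 covariance matrix

# Diagonal entries are the variances; off-diagonal entries are the covariance.
print(m)
print(np.isclose(m[0, 0], cov(a, a)))  # variance of a
print(np.isclose(m[0, 1], cov(a, b)))  # covariance of a and b
print(np.isclose(m[1, 0], cov(a, b)))  # the matrix is symmetric
print(np.isclose(m[1, 1], cov(b, b)))  # variance of b
```

Either off-diagonal element can be used, so np.cov(a, b)[0][1] and np.cov(a, b)[1][0] should give the same number.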

Answer 2

Answered by Osian

Thanks to unutbu for the explanation. By default, numpy.cov calculates the sample covariance (normalised by N - 1). To obtain the population covariance, you can specify normalisation by the total number of samples N like this:

Covariance = numpy.cov(a, b, bias=True)[0][1]
print(Covariance)

or like this:

Covariance = numpy.cov(a, b, ddof=0)[0][1]
print(Covariance)
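
To see how the two normalisations relate, here is a small sketch on made-up data (the arrays and variable names are illustrative assumptions). It compares the default result with the bias=True and ddof=0 variants and with a direct mean of the products of deviations.

```python
import numpy as np

# Illustrative data, made up for this sketch.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 4.0, 6.0, 9.0])
n = len(a)

sample_cov = np.cov(a, b)[0][1]                 # default: divides by N - 1
pop_cov_bias = np.cov(a, b, bias=True)[0][1]    # divides by N
pop_cov_ddof = np.cov(a, b, ddof=0)[0][1]       # also divides by N

# Population covariance computed directly, for comparison.
pop_cov_direct = np.mean((a - a.mean()) * (b - b.mean()))

print(sample_cov, pop_cov_bias, pop_cov_ddof, pop_cov_direct)
print(np.isclose(pop_cov_ddof, sample_cov * (n - 1) / n))
```

The two results differ only by the factor (N - 1) / N, so the gap matters mainly for small samples.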
