Python: Why does corrcoef return a matrix?

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original question on StackOverflow: http://stackoverflow.com/questions/3425439/

Date: 2020-08-18 11:00:48  Source: igfitidea

Tags: python, math, numpy

Asked by Dan

It seems strange to me that np.corrcoef returns a matrix.

correlation1 = corrcoef(Strategy1Returns, Strategy2Returns)

[[ 1.         -0.99598935]
 [-0.99598935  1.        ]]

Does anyone know why this is the case and whether it is possible to return just one value in the classical sense?

Answered by Katriel

corrcoef returns the normalised covariance matrix.

The covariance matrix is the matrix

Cov( X, X )    Cov( X, Y )
Cov( Y, X )    Cov( Y, Y )

Normalised, this will yield the matrix:

Corr( X, X )    Corr( X, Y )
Corr( Y, X )    Corr( Y, Y )

correlation1[0, 0] is the correlation between Strategy1Returns and itself, which must be 1. You just want correlation1[0, 1].
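
To make "normalised covariance matrix" concrete, here is a minimal sketch, assuming NumPy is available. The sample data and the helper name corr_from_cov are made up for illustration; they are not from the original question. It divides each covariance entry by the product of the two standard deviations and compares the result with np.corrcoef.

```python
import numpy as np

def corr_from_cov(x, y):
    """Normalise the 2x2 covariance matrix of x and y (hypothetical helper)."""
    cov = np.cov(x, y)                  # [[Cov(X,X), Cov(X,Y)], [Cov(Y,X), Cov(Y,Y)]]
    std = np.sqrt(np.diag(cov))         # standard deviations of x and y
    return cov / np.outer(std, std)     # entry (i, j) divided by std_i * std_j

# Made-up sample data standing in for the two return series.
x = np.array([0.010, -0.020, 0.030, 0.015, -0.005])
y = np.array([-0.012, 0.021, -0.028, -0.016, 0.004])

corr = corr_from_cov(x, y)
print(corr)
print(np.allclose(corr, np.corrcoef(x, y)))  # the two matrices should agree
print(corr[0, 1])                            # the single value the question asks for
```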

Answered by Philipp

The correlation matrix is the standard way to express correlations between an arbitrary finite number of variables. The correlation matrix of N data vectors is a symmetric N × N matrix with unity diagonal. Only in the case N = 2 does this matrix have one free parameter.
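
These properties can be checked numerically. The sketch below assumes NumPy and uses randomly generated data with an arbitrary seed, purely for illustration. It verifies symmetry and the unit diagonal for N = 4, then pulls out the entries strictly above the diagonal, which are the N(N-1)/2 free parameters.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed, illustration only
data = rng.normal(size=(4, 50))       # N = 4 made-up variables, 50 samples each

corr = np.corrcoef(data)              # each row is treated as one variable
n = corr.shape[0]

is_symmetric = np.allclose(corr, corr.T)
has_unit_diagonal = np.allclose(np.diag(corr), 1.0)

# The free parameters are the entries strictly above the diagonal.
rows, cols = np.triu_indices(n, k=1)
free_values = corr[rows, cols]

print(corr.shape, is_symmetric, has_unit_diagonal)
print(free_values.size, n * (n - 1) // 2)   # both counts should match
```

For N = 2 the same indexing leaves exactly one value, corr[0, 1].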

Answered by kennytm

It allows you to compute correlation coefficients of more than two data sets, e.g.

>>> from numpy import *
>>> a = array([1,2,3,4,6,7,8,9])
>>> b = array([2,4,6,8,10,12,13,15])
>>> c = array([-1,-2,-2,-3,-4,-6,-7,-8])
>>> corrcoef([a,b,c])
array([[ 1.        ,  0.99535001, -0.9805214 ],
       [ 0.99535001,  1.        , -0.97172394],
       [-0.9805214 , -0.97172394,  1.        ]])

Here we get the correlation coefficients of a,b (0.995), a,c (-0.981) and b,c (-0.972) all at once. The two-data-set case is just a special case of the N-data-set case, and it is probably better to keep the same return type. Since the "one value" can be obtained simply with

>>> corrcoef(a,b)[1,0]
0.99535001355530017

there's no big reason to create the special case.

Answered by schwater

Consider using matplotlib.cbook.pieces, for example:

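import numpy as np
import matplotlib.cbook as cbook

# Split the range 0..19 into pieces of 3 elements each
# (assumes an older matplotlib release that still provides cbook.pieces)
segments = cbook.pieces(np.arange(20), 3)
for s in segments:
    print(s)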

Answered by Sergio

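NumPy's correlate function (np.correlate) takes the two 1-D arrays you want to correlate and, for equal-length inputs in its default mode, returns a single value. Note that this value is the unnormalised cross-correlation (a sum of products), not the Pearson correlation coefficient that corrcoef reports.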
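
A minimal sketch of how np.correlate relates to the coefficient from np.corrcoef, assuming NumPy and reusing the arrays a and b from kennytm's answer. For equal-length inputs in its default mode, np.correlate gives a one-element array holding the plain sum of products, so the inputs have to be centred and the result scaled before it matches corrcoef.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 6, 7, 8, 9], dtype=float)
b = np.array([2, 4, 6, 8, 10, 12, 13, 15], dtype=float)

# Default mode is 'valid': for equal lengths this is a 1-element array, sum(a * b).
raw = np.correlate(a, b)

# Centre both arrays and scale by their norms to get the Pearson coefficient.
a0 = a - a.mean()
b0 = b - b.mean()
r = np.correlate(a0, b0)[0] / (np.linalg.norm(a0) * np.linalg.norm(b0))

print(raw)                              # unnormalised sum of products
print(r, np.corrcoef(a, b)[0, 1])       # these two should agree
```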

Answered by Arman Aynaszyan

You can use the following function to return only the correlation coefficient:

import numpy as np

def pearson_r(x, y):
    """Compute Pearson correlation coefficient between two arrays."""
    # Compute correlation matrix
    corr_mat = np.corrcoef(x, y)

    # Return entry [0, 1]
    return corr_mat[0, 1]
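
For comparison, here is a minimal sketch of the same quantity computed straight from the definition, without NumPy. The name pearson_r_manual is hypothetical, and the data are the arrays a and b from kennytm's answer.

```python
import math

def pearson_r_manual(x, y):
    """Pearson r computed directly from the definition (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]      # deviations from the mean of x
    dy = [yi - mean_y for yi in y]      # deviations from the mean of y
    cov_xy = sum(p * q for p, q in zip(dx, dy))
    return cov_xy / math.sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))

a = [1, 2, 3, 4, 6, 7, 8, 9]
b = [2, 4, 6, 8, 10, 12, 13, 15]
print(pearson_r_manual(a, b))           # should agree with corrcoef(a, b)[0, 1]
```

The result should match the 0.995 reported for a,b in the answer above, up to floating-point rounding.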