Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/35095249/

Time: 2020-09-14 00:35:30  Source: igfitidea

Run a basic correlation between two columns of a dataframe

python, python-2.7, pandas

Asked by Tiberius

I am trying to produce a correlation matrix from a pandas dataframe using data from specified columns.

Here is my csv data:

col0,col1,col2,col3,col4
122468.9071,1417464.203,3546600,151804924,10839476
14691.1139,170036.0407,103847,19208604,2365065

Here are the two dataframes I created:

import pandas as pd

df1 = pd.read_csv('c:/temp/test_1.csv', usecols=[0])
df2 = pd.read_csv('c:/temp/test_1.csv', usecols=[1])

I tried the corr and corrwith functions and get the following errors:

Corr Function:

print(df1.corr(df2))

Result: 

Error: Could not compare ['pearson'] with block values

Corrwith:

print(df1.corrwith(df2))

Result:    

col0   NaN
col1   NaN
dtype: float64

As you can see, there are no null values in the data set and the float64 should be able to handle decimals.

Any help with a solution would be greatly appreciated.

Tiberius

Answered by Josh Baker

If you are trying to create a correlation matrix between the two columns, I would suggest bringing them into the same dataframe, like so:

import pandas as pd

df = pd.read_csv('c:/temp/test_1.csv', usecols=[0,1])
df.corr()

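I loaded your data into a CSV myself and got a 2x2 correlation matrix of all 1s, which is expected: the sample has only two rows, and two points always lie on a straight line, so each pair of columns is perfectly correlated (here positively, since both values fall from the first row to the second).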
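
For anyone who wants to reproduce this without the file on disk, here is a minimal sketch of the same approach. It assumes the two sample rows from the question are inlined through io.StringIO instead of being read from c:/temp/test_1.csv; the names CSV_TEXT and corr_matrix are only illustrative.

```python
import io

import pandas as pd

# The two sample rows from the question, inlined so no file path is needed
# (an assumption for illustration; the original reads c:/temp/test_1.csv).
CSV_TEXT = """col0,col1,col2,col3,col4
122468.9071,1417464.203,3546600,151804924,10839476
14691.1139,170036.0407,103847,19208604,2365065
"""

# Read only the first two columns into ONE dataframe.
df = pd.read_csv(io.StringIO(CSV_TEXT), usecols=[0, 1])

# DataFrame.corr() returns the pairwise correlation matrix
# (Pearson by default) for the columns of that single dataframe.
corr_matrix = df.corr()
print(corr_matrix)
```

With real data you would want far more than two rows; with only two, every entry of the matrix comes out as (approximately) 1 or -1 regardless of the values.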

You can find documentation on the pandas correlation here: http://pandas.pydata.org/pandas-docs/stable/computation.html#correlation
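
As a side note that is not part of the original answer: if you only want the single coefficient between two columns, Series.corr may be a more direct fit than a full matrix. The sketch below also illustrates why the attempts in the question likely behaved as they did. The first positional parameter of DataFrame.corr is the method name, so df1.corr(df2) passes a dataframe where 'pearson' is expected. DataFrame.corrwith aligns on column labels, so two frames with no label in common give NaN. The inlined CSV_TEXT and the variable names are assumptions for illustration.

```python
import io

import pandas as pd

# Sample rows from the question, inlined (illustrative assumption).
CSV_TEXT = """col0,col1,col2,col3,col4
122468.9071,1417464.203,3546600,151804924,10839476
14691.1139,170036.0407,103847,19208604,2365065
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Series.corr returns one number: the correlation between two columns.
r = df["col0"].corr(df["col1"])
print(r)

# corrwith matches columns by label. These single-column frames share
# no label, so every entry of the result is NaN.
left = df[["col0"]]
right = df[["col1"]]
mismatched = left.corrwith(right)
print(mismatched)

# Renaming so the labels match lets corrwith compare the intended columns.
aligned = left.corrwith(right.rename(columns={"col1": "col0"}))
print(aligned)
```

Which form to use is a matter of taste: df.corr() on one combined dataframe scales to many columns, while Series.corr keeps the intent obvious for a single pair.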
