Original source: http://stackoverflow.com/questions/38894488/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Dropping 'nan' with Pearson's r in scipy/pandas
Asked by Lodore66
Quick question: Is there a way to use 'dropna' with the Pearson's r function in scipy? I'm using it in conjunction with pandas, and some of my data has holes in it. I know you used to be able to suppress 'nan' with Spearman's r in older versions of scipy, but that functionality is now missing.
To my mind, this seems like a disimprovement, so I wonder if I'm missing something obvious.
My code:
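    import scipy.stats as sp  # assumed alias, inferred from the sp.pearsonr call
    correlation = []  # one (r, p-value) result per column of frame3
    for i in range(len(frame3.columns)):
        correlation.append(sp.pearsonr(frame3.iloc[:, i], control['CONTROL']))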
Answered by Ami Tavory
Answered by Daniel Gibson
You can also create a temporary DataFrame and use pandas' built-in method for computing the Pearson correlation, which excludes missing values, or call the .dropna method on the temporary DataFrame to drop null values before using sp.pearsonr.
    for col in frame3.columns:
        correlation.append(frame3[col].to_frame(name='3').join(control['CONTROL']).corr()['3']['CONTROL'])
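The answer's second option, dropping nulls before calling pearsonr, is not shown above. Here is a minimal sketch of it, assuming frame3 and control share the same index. The sample data is made up for illustration, and the results are stored in a dict keyed by column name rather than in the list used in the question.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical stand-ins for the question's frame3 and control.
frame3 = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0, 5.0],
    "b": [2.0, np.nan, 6.0, 8.0, 11.0],
})
control = pd.DataFrame({"CONTROL": [1.0, 2.0, 3.0, np.nan, 5.0]})

correlation = {}
for col in frame3.columns:
    # Put the column next to the control series, then drop every row in
    # which either value is missing so the two inputs stay aligned.
    pair = pd.concat([frame3[col], control["CONTROL"]], axis=1).dropna()
    r, p_value = stats.pearsonr(pair[col], pair["CONTROL"])
    correlation[col] = r
```

Dropping rows from the paired frame, rather than from each series separately, keeps the two inputs the same length, which pearsonr requires.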