Dropping 'nan' with Pearson's r in scipy/pandas

Question

Asked by Lodore66

Quick question: Is there a way to use 'dropna' with the Pearson's r function in scipy? I'm using it in conjunction with pandas, and some of my data has holes in it. I know you used to be able to suppress 'nan' with Spearman's r in older versions of scipy, but that functionality is now missing.

To my mind, this seems like a disimprovement, so I wonder if I'm missing something obvious.

My code:

for i in range(len(frame3.columns)):    
    correlation.append(sp.pearsonr(frame3.iloc[ :,i], control['CONTROL']))

Answer 1

Answered by Ami Tavory

You can use np.isnan like this:

for i in range(len(frame3.columns)):
    x, y = frame3.iloc[:, i].values, control['CONTROL'].values
    # NumPy arrays have no .isnan() method; use the np.isnan function instead.
    nas = np.logical_or(np.isnan(x), np.isnan(y))
    corr = sp.pearsonr(x[~nas], y[~nas])
    correlation.append(corr)
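
For anyone who wants to try the masking idea end to end, here is a minimal, self-contained sketch. The helper name pearsonr_dropna and the small frame3/control tables are made up for illustration (the question does not show its data), and it assumes both inputs are numeric and equally long.

```python
import numpy as np
import pandas as pd
from scipy import stats


def pearsonr_dropna(x, y):
    """Pearson's r over the positions where both x and y are non-NaN.

    Hypothetical helper, not part of scipy.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Keep a position only if neither value is NaN.
    keep = ~(np.isnan(x) | np.isnan(y))
    return stats.pearsonr(x[keep], y[keep])


# Made-up stand-ins for frame3 and control from the question.
frame3 = pd.DataFrame({"a": [1.0, 2.0, np.nan, 4.0, 5.0],
                       "b": [2.0, np.nan, 6.0, 8.0, 11.0]})
control = pd.DataFrame({"CONTROL": [1.0, 2.0, 3.0, np.nan, 5.0]})

# One (r, p-value) result per column of frame3.
correlation = [pearsonr_dropna(frame3[col], control["CONTROL"])
               for col in frame3.columns]
```

Note that the mask is positional: it pairs values by row order, not by index label, so this only matches the pandas result when both objects are in the same row order.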

Answer 2

Answered by Daniel Gibson

You can also create a temporary DataFrame and use the pandas built-in method for computing the Pearson correlation, or call the .dropna method on the temporary DataFrame to drop null values before using sp.pearsonr.

for col in frame3.columns:
    correlation.append(frame3[col].to_frame(name='3').join(control['CONTROL']).corr()['3']['CONTROL'])
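
The second option in this answer (dropping nulls from a temporary DataFrame before calling sp.pearsonr) is not shown above, so here is a rough sketch of how it might look. The frame3/control data are made up, and the column labels "x" and "y" are arbitrary names chosen for the temporary frame.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Made-up stand-ins for frame3 and control from the question.
frame3 = pd.DataFrame({"a": [1.0, 2.0, np.nan, 4.0, 5.0],
                       "b": [2.0, np.nan, 6.0, 8.0, 11.0]})
control = pd.DataFrame({"CONTROL": [1.0, 2.0, 3.0, np.nan, 5.0]})

correlation = []
for col in frame3.columns:
    # Temporary two-column frame, aligned on the index;
    # dropna() removes every row where either value is missing.
    pair = pd.concat([frame3[col], control["CONTROL"]],
                     axis=1, keys=["x", "y"]).dropna()
    r, p = stats.pearsonr(pair["x"], pair["y"])
    correlation.append((r, p))

# pandas-only alternative: Series.corr excludes missing values itself,
# but it returns only r, not the p-value.
pandas_r = [frame3[col].corr(control["CONTROL"]) for col in frame3.columns]
```

The scipy route is the one to pick if you need the p-value as well; if r alone is enough, Series.corr is shorter.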
