pandas Python error: cannot do a non-empty take from an empty axes

Question

Asked by ELI

I have a pandas DataFrame with more than 400 thousand rows and I want to calculate the interquartile range for each row, but my code produced the following error:

cannot do a non empty take from an empty axes

My code:

import numpy as np

def calIQR(x):
    x = x.dropna()
    return (np.percentile(x, 75), np.percentile(x, 25))

df["count"] = df.iloc[:, 2:64].apply(calIQR, axis=1)

I am running Python 2.7.13

I searched online but still had no idea why this error occurred.

Columns 2 to 64 of the dataset basically look like this:

Each row contains some NaN values, but I am sure that no row is entirely NaN.

Answer 1

Accepted answer by jezrael

I think the problem is that some row has all NaN values in columns 2 to 63, so x = x.dropna() returns an empty Series and np.percentile has nothing to work with.
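
Before changing anything, it may be worth checking whether such a row really exists. The sketch below uses a small made-up frame in place of the real data (the shape, values, and the 1:4 slice are assumptions for illustration; with the real data the slice would be 2:64). It relies on isna(), which on older pandas releases is spelled isnull().

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: row 2 is entirely NaN.
df = pd.DataFrame(np.arange(20, dtype=float).reshape(5, 4))
df.loc[2] = np.nan
df.loc[3, [2, 3]] = np.nan

cols = df.iloc[:, 1:4]  # with the real data: df.iloc[:, 2:64]

# True for rows where every selected column is NaN
all_nan = cols.isna().all(axis=1)
empty_rows = all_nan[all_nan].index.tolist()
print(empty_rows)

# dropna() on such a row leaves an empty Series
remaining = len(cols.loc[2].dropna())
print(remaining)
```

If empty_rows comes back non-empty for the real data, those are the rows for which calIQR receives an empty Series.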

So you need to add dropna(how='all') after iloc:

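import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.random((5,5)))
df.loc[3, [3,4]] = np.nan  # row 3: NaN in only some columns
df.loc[2] = np.nan         # row 2: every column is NaN
print(df)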
         0         1         2         3         4
0  0.543405  0.278369  0.424518  0.844776  0.004719
1  0.121569  0.670749  0.825853  0.136707  0.575093
2       NaN       NaN       NaN       NaN       NaN
3  0.978624  0.811683  0.171941       NaN       NaN
4  0.431704  0.940030  0.817649  0.336112  0.175410

def calIQR(x):
    x = x.dropna()
    return (np.percentile(x, 75), np.percentile(x, 25))

# drop all-NaN rows before apply; with real data change 2:4 to 2:64
df["count"] = df.iloc[:, 2:4].dropna(how='all').apply(calIQR, axis=1)
print(df)
          0         1         2         3         4  \
0  0.543405  0.278369  0.424518  0.844776  0.004719   
1  0.121569  0.670749  0.825853  0.136707  0.575093   
2       NaN       NaN       NaN       NaN       NaN   
3  0.978624  0.811683  0.171941       NaN       NaN   
4  0.431704  0.940030  0.817649  0.336112  0.175410   

                              count  
0  (0.739711496927, 0.529582226142)  
1    (0.65356621375, 0.30899313104)  
2                               NaN  
3  (0.171941012733, 0.171941012733)  
4  (0.697265021613, 0.456496307285)
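
Row 2 ends up with NaN in count because dropna(how='all') removes it before apply runs, and assigning the result back to the frame aligns on the index. A minimal sketch of that alignment, using a made-up two-column frame and max in place of calIQR (both are assumptions chosen only to keep the example short):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [2.0, np.nan, 5.0]})

# dropna(how='all') removes row 1, so the result only has index labels 0 and 2
row_max = df.dropna(how="all").max(axis=1)

# Column assignment aligns on the index: row 1 has no match and becomes NaN
df["row_max"] = row_max
print(df)
```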

Or use Series.quantile:

def calIQR(x):
    # Series.quantile skips NaN, so an all-NaN row gives (nan, nan) instead of an error
    return (x.quantile(.75), x.quantile(.25))

# with real data change 2:4 to 2:64
df["count"] = df.iloc[:, 2:4].apply(calIQR, axis=1)
print(df)
          0         1         2         3         4  \
0  0.543405  0.278369  0.424518  0.844776  0.004719   
1  0.121569  0.670749  0.825853  0.136707  0.575093   
2       NaN       NaN       NaN       NaN       NaN   
3  0.978624  0.811683  0.171941       NaN       NaN   
4  0.431704  0.940030  0.817649  0.336112  0.175410   

                                       count  
0   (0.7397114969272109, 0.5295822261418257)  
1    (0.653566213750024, 0.3089931310399766)  
2                                 (nan, nan)  
3   (0.1719410127325942, 0.1719410127325942)  
4  (0.6972650216127702, 0.45649630728485585)
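
With 400 thousand rows, a row-wise apply can be slow. As a possible alternative that is not part of the answer above, DataFrame.quantile with axis=1 computes both quartiles for every row in one call, and their difference is the interquartile range itself rather than a tuple. This is a sketch on the same sample data; the names q and iqr are only illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.random((5, 5)))
df.loc[3, [3, 4]] = np.nan
df.loc[2] = np.nan

# One call for all rows: the result has one row per quantile
# and one column per original row label. NaN values are skipped.
q = df.iloc[:, 2:4].quantile([0.25, 0.75], axis=1)

# Interquartile range per row; an all-NaN row yields NaN
iqr = q.loc[0.75] - q.loc[0.25]
print(iqr)
```

The result is a Series indexed like df, so it can be assigned directly with df["iqr"] = iqr.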
