pandas: Python error "cannot do a non empty take from an empty axes"
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors: StackOverflow
Original: http://stackoverflow.com/questions/45138917/
Python error cannot do a non empty take from an empty axes
Asked by ELI
I have a pandas DataFrame with more than 400,000 rows, and I want to calculate the interquartile range for each row, but my code produces the following error:
cannot do a non empty take from an empty axes
My code:
def calIQR(x):
    x = x.dropna()
    return (np.percentile(x, 75), np.percentile(x, 25))

df["count"] = df.iloc[:, 2:64].apply(calIQR, axis=1)
I am running Python 2.7.13
I searched online but still had no idea why this error occurred.
Columns 2 to 64 of the dataset basically look like this:
Each row contains some NaN values, but I am sure that no row is entirely NaN.
Accepted answer by jezrael
I think the problem is a row that has all NaN values in columns 2 to 63. For such a row, x = x.dropna() returns an empty Series, and np.percentile then has nothing to take from.
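To check whether such rows exist, the selected columns can be tested directly. The following is a minimal sketch on a hypothetical toy frame (the column names `a`, `b`, `c` and the slice `1:3` are assumptions for illustration; with the real data the slice would be `2:64`). It assumes a pandas version that provides `isna`; on older versions `isnull` plays the same role.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the real data
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0],
        "b": [np.nan, 5.0, np.nan],
        "c": [np.nan, 6.0, 7.0],
    }
)

# Same kind of positional slice that is later passed to apply
sub = df.iloc[:, 1:3]

# True for rows where every selected column is NaN
all_nan = sub.isna().all(axis=1)

# Index labels of the rows that would become an empty Series after dropna
bad_rows = all_nan[all_nan].index.tolist()
print(bad_rows)
```

If `bad_rows` is not empty, those are the rows for which `dropna` leaves nothing to compute a percentile from.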
So you need to add dropna(how='all') after iloc, so that all-NaN rows are removed before apply is called:
import numpy as np
import pandas as pd

# Sample data: row 2 is all NaN, row 3 is partly NaN
np.random.seed(100)
df = pd.DataFrame(np.random.random((5,5)))
df.loc[3, [3,4]] = np.nan
df.loc[2] = np.nan
print(df)
          0         1         2         3         4
0  0.543405  0.278369  0.424518  0.844776  0.004719
1  0.121569  0.670749  0.825853  0.136707  0.575093
2       NaN       NaN       NaN       NaN       NaN
3  0.978624  0.811683  0.171941       NaN       NaN
4  0.431704  0.940030  0.817649  0.336112  0.175410
def calIQR(x):
    x = x.dropna()
    return (np.percentile(x, 75), np.percentile(x, 25))

df["count"] = df.iloc[:, 2:4].dropna(how='all').apply(calIQR, axis=1)
print(df)
          0         1         2         3         4  \
0  0.543405  0.278369  0.424518  0.844776  0.004719
1  0.121569  0.670749  0.825853  0.136707  0.575093
2       NaN       NaN       NaN       NaN       NaN
3  0.978624  0.811683  0.171941       NaN       NaN
4  0.431704  0.940030  0.817649  0.336112  0.175410

                              count
0  (0.739711496927, 0.529582226142)
1    (0.65356621375, 0.30899313104)
2                               NaN
3  (0.171941012733, 0.171941012733)
4  (0.697265021613, 0.456496307285)
Or use Series.quantile, which skips NaN values and returns NaN for an all-NaN row:
def calIQR(x):
    # Series.quantile ignores NaN, so no dropna is needed
    return (x.quantile(.75), x.quantile(.25))

# with real data, change 2:4 to 2:64
df["count"] = df.iloc[:, 2:4].apply(calIQR, axis=1)
print(df)
          0         1         2         3         4  \
0  0.543405  0.278369  0.424518  0.844776  0.004719
1  0.121569  0.670749  0.825853  0.136707  0.575093
2       NaN       NaN       NaN       NaN       NaN
3  0.978624  0.811683  0.171941       NaN       NaN
4  0.431704  0.940030  0.817649  0.336112  0.175410

                                       count
0   (0.7397114969272109, 0.5295822261418257)
1    (0.653566213750024, 0.3089931310399766)
2                                 (nan, nan)
3   (0.1719410127325942, 0.1719410127325942)
4  (0.6972650216127702, 0.45649630728485585)
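Both versions above store the (75th, 25th) percentile pair rather than the interquartile range itself. If the single IQR value per row is what is wanted, one possible alternative (not part of the original answer) is to skip apply and call DataFrame.quantile with axis=1, which ignores NaN and can be considerably faster on several hundred thousand rows. The sketch below uses a small hypothetical frame; the name `iqr` is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: row 1 is entirely NaN, row 2 is partly NaN
df = pd.DataFrame(
    [
        [1.0, 2.0, 3.0, 4.0],
        [np.nan, np.nan, np.nan, np.nan],
        [10.0, np.nan, np.nan, 20.0],
    ]
)

# With the real data this would be df.iloc[:, 2:64]
sub = df.iloc[:, 0:4]

# Row-wise quantiles; NaN values are skipped, all-NaN rows give NaN
q75 = sub.quantile(0.75, axis=1)
q25 = sub.quantile(0.25, axis=1)

# Interquartile range per row
iqr = q75 - q25
print(iqr)
```

Rows that are entirely NaN come out as NaN instead of raising an error, so no separate dropna step is needed here.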