Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/20016360/

Time: 2020-09-13 21:20:58  Source: igfitidea

Trouble with using iloc in pandas dataframe with hierarchical index

Tags: python, python-2.7, pandas

Asked by user2205

I'm getting this ValueError whenever I try to give a list to iloc on a dataframe with a hierarchical index. I'm not sure if I'm doing something wrong or if this is a bug. I haven't had any issues using iloc the same way with a non-hierarchical index. This is using Pandas 0.12.0.

In [25]: df
Out[25]: 
            D         E         F
a x -1.050681 -0.084306 -1.635852 
  y  1.544577  1.594976 -0.084866
b x  0.462529 -1.873250  1.252685
  y -0.468074  0.673112 -0.900547
c x  0.901710 -0.432554  0.260157
  y  0.101522 -0.550223  1.389497

In [26]: df.iloc[[1,3]]
..... snip .....
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'

In [27]: df.iloc[range(2)]
...... snip .....
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'

Accepted answer by Andy Hayden

This was a bug and has been fixed in master (0.13). A temporary workaround is to use ix (!):

In [11]: df1.ix[[1, 3]]
Out[11]: 
            D         E         F
a y  1.544577  1.594976 -0.084866
b y -0.468074  0.673112 -0.900547

In master, 0.13:

In [12]: df1.iloc[[1, 3]]
Out[12]: 
            D         E         F
a y  1.544577  1.594976 -0.084866
b y -0.468074  0.673112 -0.900547
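
For readers on a current pandas release: `.ix` was deprecated in pandas 0.20 and removed in 1.0, so the workaround above only applies to old versions. The following is a minimal sketch, assuming a recent pandas in which the fix is present. The frame is a made-up stand-in for `df1` with deterministic values, not the original random data, and the `.loc` call is shown only as a label-based alternative.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df1: same index layout, deterministic values.
index = pd.MultiIndex.from_product([["a", "b", "c"], ["x", "y"]])
df1 = pd.DataFrame(np.arange(18).reshape(6, 3), index=index, columns=list("DEF"))

# Positional selection with a list of integers (the call that failed in 0.12).
by_position = df1.iloc[[1, 3]]

# Label-based alternative: pass a list of index tuples to .loc.
by_label = df1.loc[[("a", "y"), ("b", "y")]]

print(by_position)
```

Both selections should return the rows labelled `('a', 'y')` and `('b', 'y')`.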

Answer by alko

It seems that pandas can't convert [[1,3]] to a proper MultiIndex. You might want to file a bug in the pandas issue tracker. The only workaround I found is to construct a boolean mask manually; this way it is passed as is.

>>> import numpy as np
>>> import pandas as pd
>>> tup = list(zip(['a','a','b','b'], ['x','y','x','y']))
>>> index = pd.MultiIndex.from_tuples(tup, names=['f','s'])
>>> df = pd.DataFrame(np.random.randn(4, 4), index=index)
>>> df
            0         1         2         3
f s
a x -0.334280  0.479317 -0.358416 -0.245807
  y  1.279348 -0.096336  0.100285  0.037231
b x -0.368452  0.219868 -0.103722 -0.575399
  y -0.813583 -0.042694  0.897361  1.636304
>>> idx = [i in [1,3] for i in range(len(df.index))]
>>> idx
[False, True, False, True]
>>> df.iloc[idx]
            0         1         2         3
f s
a y  1.279348 -0.096336  0.100285  0.037231
b y -0.813583 -0.042694  0.897361  1.636304
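
The list comprehension above builds the boolean mask by hand. The same idea can be sketched with NumPy, assuming `np.isin` is available (NumPy 1.13 or later, which postdates the pandas version in the question). The frame uses made-up deterministic values, and `positions` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Made-up frame with the same two-level index as above.
index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")], names=["f", "s"]
)
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=index)

# Row positions to keep (hypothetical name).
positions = [1, 3]

# True where the row's position is in `positions`, False elsewhere.
mask = np.isin(np.arange(len(df)), positions)

# iloc accepts a boolean array with one entry per row.
selected = df.iloc[mask]
print(selected)
```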

Another way is to use get_level_values to access the MultiIndex by level:

>>> df.iloc[df.index.get_level_values('f') == 'a']
            0         1         2         3
f s
a x -0.334280  0.479317 -0.358416 -0.245807
  y  1.279348 -0.096336  0.100285  0.037231
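
The comparison on `get_level_values` produces a boolean array, so it can be passed to `.loc` as well as `.iloc`. As a rough sketch with a made-up deterministic frame, `xs` with `drop_level=False` is another way to pick one label from a named level while keeping the full index:

```python
import numpy as np
import pandas as pd

# Made-up frame with the same two-level index as above.
index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")], names=["f", "s"]
)
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=index)

# Boolean array: True for rows whose first-level label is 'a'.
mask = df.index.get_level_values("f") == "a"

via_iloc = df.iloc[mask]   # positional indexer with a boolean mask
via_loc = df.loc[mask]     # label indexer accepts the same mask
via_xs = df.xs("a", level="f", drop_level=False)  # cross-section, index levels kept

print(via_xs)
```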

In contrast, a slice is correctly converted to a MultiIndex:

>>> df.iloc[0:2,:]
            0         1         2         3
f s
a x -0.334280  0.479317 -0.358416 -0.245807
  y  1.279348 -0.096336  0.100285  0.037231