Trouble with using iloc in pandas dataframe with hierarchical index
Source: http://stackoverflow.com/questions/20016360/ (Stack Overflow, licensed under CC BY-SA 4.0; attribution belongs to the original authors)
Asked by user2205
I'm getting this ValueError whenever I try to give a list to iloc on a dataframe with a hierarchical index. I'm not sure if I'm doing something wrong or if this is a bug. I haven't had any issues using iloc the same way with a non-hierarchical index. This is using Pandas 0.12.0.
In [25]: df
Out[25]:
            D         E         F
a x -1.050681 -0.084306 -1.635852
  y  1.544577  1.594976 -0.084866
b x  0.462529 -1.873250  1.252685
  y -0.468074  0.673112 -0.900547
c x  0.901710 -0.432554  0.260157
  y  0.101522 -0.550223  1.389497
In [26]: df.iloc[[1,3]]
..... snip .....
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'
In [27]: df.iloc[range(2)]
...... snip .....
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'
Accepted answer by Andy Hayden
This was a bug and has been fixed in master (0.13). A temporary workaround is to use ix (!):
In [11]: df1.ix[[1, 3]]
Out[11]:
            D         E         F
a y  1.544577  1.594976 -0.084866
b y -0.468074  0.673112 -0.900547
In master, 0.13:
In [12]: df1.iloc[[1, 3]]
Out[12]:
            D         E         F
a y  1.544577  1.594976 -0.084866
b y -0.468074  0.673112 -0.900547
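To check the behaviour on your own install, here is a minimal sketch, assuming a pandas release at or after 0.13 where the fix is present. The frame is rebuilt with random values, so the data and the names `index` and `picked` are illustrative and not taken from the original post.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (values are random).
index = pd.MultiIndex.from_product([["a", "b", "c"], ["x", "y"]])
df = pd.DataFrame(np.random.randn(6, 3), index=index, columns=["D", "E", "F"])

# Positional selection with a list of integers: rows 1 and 3.
picked = df.iloc[[1, 3]]
print(picked.index.tolist())  # [('a', 'y'), ('b', 'y')]
```

Note that `ix` was later deprecated and then removed from pandas, so on current releases `iloc` is the supported way to select rows by position.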
Answer by alko
It seems that pandas can't convert [[1,3]] to a proper MultiIndex. You might want to file a bug in the pandas issue tracker. The only workaround I found is to construct a boolean mask manually, so that it is passed through as is.
>>> import numpy as np
>>> import pandas as pd
>>> tup = list(zip(['a','a','b','b'], ['x','y','x','y']))
>>> index = pd.MultiIndex.from_tuples(tup, names=['f','s'])
>>> df = pd.DataFrame(np.random.randn(4, 4), index=index)
>>> df
            0         1         2         3
f s
a x -0.334280  0.479317 -0.358416 -0.245807
  y  1.279348 -0.096336  0.100285  0.037231
b x -0.368452  0.219868 -0.103722 -0.575399
  y -0.813583 -0.042694  0.897361  1.636304
>>> idx = [i in [1,3] for i in range(len(df.index))]
>>> idx
[False, True, False, True]
>>> df.iloc[idx]
            0         1         2         3
f s
a y  1.279348 -0.096336  0.100285  0.037231
b y -0.813583 -0.042694  0.897361  1.636304
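The same positional mask can also be built with NumPy instead of a list comprehension. This is only a sketch, assuming NumPy 1.13 or later (for `np.isin`); the names `positions`, `mask` and `selected` are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")], names=["f", "s"]
)
df = pd.DataFrame(np.random.randn(4, 4), index=index)

positions = [1, 3]
# True at the row positions we want, False everywhere else.
mask = np.isin(np.arange(len(df)), positions)
print(mask.tolist())  # [False, True, False, True]

# iloc accepts a boolean array with one entry per row.
selected = df.iloc[mask]
```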
Another way is to use get_level_values to access the MultiIndex by level:
>>> df.iloc[df.index.get_level_values('f') == 'a']
            0         1         2         3
f s
a x -0.334280  0.479317 -0.358416 -0.245807
  y  1.279348 -0.096336  0.100285  0.037231
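If the goal is to select by a level's label and not by position, a label-based cross-section may read more directly. The sketch below compares the mask approach with `DataFrame.xs`; it assumes a reasonably recent pandas, and the names `by_mask` and `by_xs` are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")], names=["f", "s"]
)
df = pd.DataFrame(np.random.randn(4, 4), index=index)

# Boolean mask over the first level, as in the answer above.
by_mask = df.iloc[df.index.get_level_values("f") == "a"]

# Label-based cross-section; drop_level=False keeps the 'f' level in the result.
by_xs = df.xs("a", level="f", drop_level=False)

print(by_mask.index.tolist())
print(by_xs.index.tolist())
```

Both selections should contain the same two rows; `xs` simply avoids building the mask by hand.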
In contrast, a slice is handled correctly with a MultiIndex:
>>> df.iloc[0:2,:]
            0         1         2         3
f s
a x  -0.33428  0.479317 -0.358416 -0.245807
  y  1.279348 -0.096336  0.100285  0.037231

