pandas: Select data at a particular level from a MultiIndex
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/10175068/
Asked by elyase
I have the following pandas DataFrame with a MultiIndex (Z, A):
           H1       H2
   Z    A
0  100  200   0.3112  -0.4197
1  100  201   0.2967   0.4893
2  100  202   0.3084  -0.4873
3  100  203   0.3069      NaN
4  101  203  -0.4956      NaN
Question: How can I select all items with A=203?
I tried df[:,'A'] but it doesn't work. Then I found this in the online documentation, so I tried:
df.xs(203,level='A')
but I get:
"TypeError: xs() got an unexpected keyword argument 'level'"
Also, I don't see this parameter in the installed docs (df.xs?):
"Parameters
----------
key : object
    Some label contained in the index, or partially in a MultiIndex
axis : int, default 0
    Axis to retrieve cross-section on
copy : boolean, default True
    Whether to make a copy of the data"
Note: I have the development version.
Edit: I found this thread, where they recommend something like:
df.select(lambda x: x[1]==200, axis=0)
I would still like to know what happened to df.xs with the level parameter, or what the recommended way is in the current version.
Accepted answer by elyase
The problem lay in my (incorrect) assumption that I was on the dev version, while in reality I had 1.6.1. One can check the currently installed version with:
import pandas
print(pandas.__version__)  # print() call form works on Python 2 and 3
In the current version, df.xs() with the level parameter works fine.
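To make this concrete, here is a minimal sketch that rebuilds the DataFrame from the question and selects the rows with A=203. The variable names (by_xs, by_xs_keep, by_mask) are made up for illustration, and the boolean-mask variant is shown only as one possible alternative to the lambda-based approach mentioned in the question, not as the recommended way.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question (MultiIndex levels Z and A).
index = pd.MultiIndex.from_tuples(
    [(100, 200), (100, 201), (100, 202), (100, 203), (101, 203)],
    names=["Z", "A"],
)
df = pd.DataFrame(
    {
        "H1": [0.3112, 0.2967, 0.3084, 0.3069, -0.4956],
        "H2": [-0.4197, 0.4893, -0.4873, np.nan, np.nan],
    },
    index=index,
)

# Cross-section on level A; the selected level is dropped from the result.
by_xs = df.xs(203, level="A")

# Same selection, but keeping level A in the resulting index.
by_xs_keep = df.xs(203, level="A", drop_level=False)

# Alternative: a boolean mask built from the values of level A.
by_mask = df[df.index.get_level_values("A") == 203]

print(by_xs)
print(by_xs_keep)
```

With drop_level=False the result keeps both index levels, which is convenient when the selection is later combined with other rows of the original frame.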
Answer by rogueleaderr
Not a direct answer to the question, but if you want to select more than one value you can use the "slice()" notation:
import numpy
from pandas import MultiIndex, Series
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = Series(numpy.random.randn(8), index=index)
In [10]: s
Out[10]:
first  second
bar    one       0.181621
       two       1.016225
baz    one       0.716589
       two      -0.353731
foo    one      -0.326301
       two       1.009143
qux    one       0.098225
       two      -1.087523
dtype: float64
In [11]: s.loc[slice(None)]
Out[11]:
first  second
bar    one       0.181621
       two       1.016225
baz    one       0.716589
       two      -0.353731
foo    one      -0.326301
       two       1.009143
qux    one       0.098225
       two      -1.087523
dtype: float64
In [12]: s.loc[slice(None), "one"]
Out[12]:
first
bar    0.181621
baz    0.716589
foo   -0.326301
qux    0.098225
dtype: float64
In [13]: s.loc["bar", slice(None)]
Out[13]:
first  second
bar    one       0.181621
       two       1.016225
dtype: float64
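The examples above each pick a single label per level. As a rough sketch of the "more than one value" case, a list of labels can be passed in place of a single label, either with slice(None) or with the pd.IndexSlice helper. This sketch uses deterministic values (numpy.arange) instead of random ones so the result is easy to check; the names idx, first_two and ones_only are made up for illustration.

```python
import numpy as np
import pandas as pd

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]
index = pd.MultiIndex.from_arrays(arrays, names=["first", "second"])

# Deterministic values 0.0 .. 7.0 stand in for the random data above.
s = pd.Series(np.arange(8.0), index=index)

# Several labels on the first level, everything on the second level.
first_two = s.loc[(["bar", "qux"], slice(None))]

# pd.IndexSlice lets the same kind of selection be written with ':' syntax.
idx = pd.IndexSlice
ones_only = s.loc[idx[["bar", "qux"], ["one"]]]

print(first_two)
print(ones_only)
```

Label-based slicing of this kind generally expects the MultiIndex to be sorted; the index here already is, otherwise s.sort_index() can be applied first.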

