Notice: This page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/10175068/

Time: 2020-09-13 15:42:34  Source: igfitidea

Select data at a particular level from a MultiIndex

python pandas

Asked by elyase

I have the following pandas DataFrame with a MultiIndex (Z, A):

             H1       H2  
   Z    A 
0  100  200  0.3112   -0.4197   
1  100  201  0.2967   0.4893    
2  100  202  0.3084   -0.4873   
3  100  203  0.3069   NaN        
4  101  203  -0.4956  NaN       

Question: How can I select all items with A=203? I tried df[:,'A'], but it doesn't work. Then I found this in the online documentation, so I tried:
df.xs(203,level='A')
but I get:
"TypeError: xs() got an unexpected keyword argument 'level'"
I also don't see this parameter in the installed docs (df.xs?):
Parameters
----------
key : object
    Some label contained in the index, or partially in a MultiIndex
axis : int, default 0
    Axis to retrieve cross-section on
copy : boolean, default True
    Whether to make a copy of the data
Note: I have the development version.

Edit: I found this thread. They recommend something like:

df.select(lambda x: x[1]==200, axis=0)  

I would still like to know what happened to the level parameter of df.xs, or what the recommended way is in the current version.
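
DataFrame.select was deprecated in pandas 0.21 and removed in 1.0, so the snippet quoted in the edit will not run on recent versions. A boolean mask built from the index level gives the same kind of filter. The following is a minimal sketch, assuming Z and A are the two index levels of the frame shown in the question; it filters on 203, as the question asks, rather than on the 200 used in the quoted snippet.

```python
import numpy as np
import pandas as pd

# Rebuild the question's table; treating Z and A as the index levels is an assumption.
df = pd.DataFrame(
    {
        "Z": [100, 100, 100, 100, 101],
        "A": [200, 201, 202, 203, 203],
        "H1": [0.3112, 0.2967, 0.3084, 0.3069, -0.4956],
        "H2": [-0.4197, 0.4893, -0.4873, np.nan, np.nan],
    }
).set_index(["Z", "A"])

# Boolean mask over level "A", standing in for select(lambda x: x[1] == 203).
mask = df.index.get_level_values("A") == 203
selected = df[mask]
print(selected)
```

The mask keeps both index levels in the result, which matches what the lambda-based filter would have returned.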

Accepted answer by elyase

The problem was my (incorrect) assumption that I was running the dev version, when in reality I had 1.6.1 installed. You can check the currently installed version with:

import pandas
# Print the installed pandas version
print(pandas.__version__)

In the current version, df.xs() with the level parameter works fine.
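
For illustration, the df.xs call from the question might look like the sketch below on a pandas version that supports the level argument. The DataFrame construction is an assumption (Z and A are taken to be the index levels of the table in the question), and drop_level is an optional xs parameter shown only to contrast the two result shapes.

```python
import numpy as np
import pandas as pd

# Rebuild the question's table; treating Z and A as the index levels is an assumption.
df = pd.DataFrame(
    {
        "Z": [100, 100, 100, 100, 101],
        "A": [200, 201, 202, 203, 203],
        "H1": [0.3112, 0.2967, 0.3084, 0.3069, -0.4956],
        "H2": [-0.4197, 0.4893, -0.4873, np.nan, np.nan],
    }
).set_index(["Z", "A"])

# Cross-section on level "A"; by default the selected level is dropped,
# so the result is indexed by Z only.
rows_203 = df.xs(203, level="A")
print(rows_203)

# drop_level=False keeps the full (Z, A) index on the result.
rows_203_full = df.xs(203, level="A", drop_level=False)
print(rows_203_full)
```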

Answer by rogueleaderr

Not a direct answer to the question, but if you want to select more than one value you can use the "slice()" notation:

import numpy
from pandas import MultiIndex, Series

# Build a two-level index ('first', 'second') from parallel label lists
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = MultiIndex.from_tuples(tuples, names=['first', 'second'])
# Series of 8 random values on that index
s = Series(numpy.random.randn(8), index=index)

In [10]: s
Out[10]:
first  second
bar    one       0.181621
       two       1.016225
baz    one       0.716589
       two      -0.353731
foo    one      -0.326301
       two       1.009143
qux    one       0.098225
       two      -1.087523
dtype: float64

In [11]: s.loc[slice(None)]
Out[11]:
first  second
bar    one       0.181621
       two       1.016225
baz    one       0.716589
       two      -0.353731
foo    one      -0.326301
       two       1.009143
qux    one       0.098225
       two      -1.087523
dtype: float64

In [12]: s.loc[slice(None), "one"]
Out[12]:
first
bar      0.181621
baz      0.716589
foo     -0.326301
qux      0.098225
dtype: float64

In [13]: s.loc["bar", slice(None)]
Out[13]:
first  second
bar    one       0.181621
       two       1.016225
dtype: float64