pandas: selecting data from a specific level of a MultiIndex

Question

Asked by elyase

I have the following pandas DataFrame with a MultiIndex (Z, A):

                H1       H2
   Z    A
0  100  200   0.3112  -0.4197
1  100  201   0.2967   0.4893
2  100  202   0.3084  -0.4873
3  100  203   0.3069      NaN
4  101  203  -0.4956      NaN

Question: How can I select all items with A=203? I tried df[:,'A'] but it doesn't work. Then I found this in the online documentation, so I tried:
df.xs(203,level='A')
but I get:
"TypeError: xs() got an unexpected keyword argument 'level'"
Also, I don't see this parameter in the installed docs (df.xs?):
Parameters
----------
key : object
    Some label contained in the index, or partially in a MultiIndex
axis : int, default 0
    Axis to retrieve cross-section on
copy : boolean, default True
    Whether to make a copy of the data
Note: I have the development version.

Edit: I found this thread. They recommend something like:

df.select(lambda x: x[1]==200, axis=0)

I would still like to know what happened to the level parameter of df.xs, or what the recommended approach is in the current version.

Answer 1

Accepted answer by elyase

The problem was my (incorrect) assumption that I was running the development version, when in reality I had 1.6.1 installed. You can check the currently installed version with:

import pandas
print(pandas.__version__)

In the current version, df.xs() with the level parameter works fine.
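
For illustration, here is a minimal sketch of that call on a DataFrame rebuilt from the table in the question. It assumes Z and A are the two index levels and H1 and H2 are the columns; the variable names are made up for this example. The boolean-mask line is one possible alternative that does not rely on xs().

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question (assumption: Z and A form the MultiIndex).
index = pd.MultiIndex.from_tuples(
    [(100, 200), (100, 201), (100, 202), (100, 203), (101, 203)],
    names=["Z", "A"],
)
df = pd.DataFrame(
    {
        "H1": [0.3112, 0.2967, 0.3084, 0.3069, -0.4956],
        "H2": [-0.4197, 0.4893, -0.4873, np.nan, np.nan],
    },
    index=index,
)

# Cross-section on level "A"; the selected level is dropped from the result.
by_xs = df.xs(203, level="A")

# Same rows, but keeping both index levels in the result.
by_xs_keep = df.xs(203, level="A", drop_level=False)

# Boolean mask built from the values of level "A".
by_mask = df[df.index.get_level_values("A") == 203]

print(by_xs)
```

With drop_level left at its default, the result is indexed by Z only, which is usually what you want when filtering on a single value of A.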

Answer 2

Answer by rogueleaderr

Not a direct answer to the question, but if you want to select more than one value you can use the "slice()" notation:

import numpy
from pandas import  MultiIndex, Series

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
              ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = Series(numpy.random.randn(8), index=index)

In [10]: s
Out[10]:
first  second
bar    one       0.181621
       two       1.016225
baz    one       0.716589
       two      -0.353731
foo    one      -0.326301
       two       1.009143
qux    one       0.098225
       two      -1.087523
dtype: float64

In [11]: s.loc[slice(None)]
Out[11]:
first  second
bar    one       0.181621
       two       1.016225
baz    one       0.716589
       two      -0.353731
foo    one      -0.326301
       two       1.009143
qux    one       0.098225
       two      -1.087523
dtype: float64

In [12]: s.loc[slice(None), "one"]
Out[12]:
first
bar      0.181621
baz      0.716589
foo     -0.326301
qux      0.098225
dtype: float64

In [13]: s.loc["bar", slice(None)]
Out[13]:
first  second
bar    one       0.181621
       two       1.016225
dtype: float64
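
The same idea carries over to a DataFrame. The sketch below is an illustration rather than part of the original answer: it reuses the index from the Series example, swaps the random numbers for fixed values so the selections are easy to follow, and uses pd.IndexSlice as a shorthand for writing slice(None) by hand. Slicing like this expects the MultiIndex to be sorted, which it is here.

```python
import pandas as pd

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]
index = pd.MultiIndex.from_arrays(arrays, names=["first", "second"])

# Fixed values (0..7) instead of random ones, purely for readability.
df = pd.DataFrame({"value": list(range(8))}, index=index)

idx = pd.IndexSlice

# Several labels on the first level, everything on the second level.
sel_first = df.loc[idx[["bar", "qux"], :], :]

# Everything on the first level, one label on the second level.
sel_second = df.loc[idx[:, "two"], :]

# The same selection written with an explicit slice(None).
sel_second_explicit = df.loc[(slice(None), "two"), :]

# A label range on the first level combined with a label on the second.
sel_range = df.loc[idx["baz":"foo", "one"], :]

print(sel_first)
```

Note the trailing ", :" in each call: on a DataFrame it selects all columns and makes it unambiguous that the tuple refers to index levels.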
