Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/28140771/

Select only one index of multiindex DataFrame

python pandas select dataframe indexing

Asked by Skorpeo

I am trying to create a new DataFrame using only one index from a multi-indexed DataFrame.

                   A         B         C
first second                              
bar   one     0.895717  0.410835 -1.413681
      two     0.805244  0.813850  1.607920
baz   one    -1.206412  0.132003  1.024180
      two     2.565646 -0.827317  0.569605
foo   one     1.431256 -0.076467  0.875906
      two     1.340309 -1.187678 -2.211372
qux   one    -1.170299  1.130127  0.974466
      two    -0.226169 -1.436737 -2.006747

Ideally, I would like something like this:

In: df.ix[level="first"]

and:

Out:

               A         B         C
first                               
bar        0.895717  0.410835 -1.413681
           0.805244  0.813850  1.607920
baz       -1.206412  0.132003  1.024180
           2.565646 -0.827317  0.569605
foo        1.431256 -0.076467  0.875906
           1.340309 -1.187678 -2.211372
qux       -1.170299  1.130127  0.974466
          -0.226169 -1.436737 -2.006747

Essentially, I want to drop every level of the MultiIndex except the level named first. Is there an easy way to do this?

Accepted answer by Alex Riley

One way is to simply rebind df.index to the desired level of the MultiIndex. You can do this by specifying the name of the level you want to keep:

df.index = df.index.get_level_values('first')

or by using the level's integer position:

df.index = df.index.get_level_values(0)

All other levels of the MultiIndex would disappear here.
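
As a minimal sketch of the approach above, the following rebuilds a frame shaped like the one in the question (the values are random, so they will not match the numbers shown) and rebinds its index to the first level. The droplevel call is not part of the original answer; it is shown as an assumed alternative for pandas versions that provide DataFrame.droplevel, and it returns a new frame instead of modifying the index in place.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (random values).
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]],
    names=["first", "second"],
)
df = pd.DataFrame(np.random.randn(8, 3), index=index, columns=["A", "B", "C"])

# Approach from the answer: rebind the index to a single level.
by_name = df.copy()
by_name.index = by_name.index.get_level_values("first")

# Alternative (not from the answer): drop the unwanted level, leaving df untouched.
dropped = df.droplevel("second")

print(by_name.index.equals(dropped.index))
```

Both results keep all eight rows; only the second level is gone from the index.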

Answer by Alexander McFarlane

This solution is fairly new and uses the df.xs function:

In [88]: df.xs('bar', level='First')
Out[88]:
Second  Third
one     A       -2.315312
        B        0.497769
        C        0.108523
two     A       -0.778303
        B       -1.555389
        C       -2.625022
dtype: float64

You can also select on several levels at once:

In [89]: df.xs(('bar', 'A'), level=('First', 'Third'))
Out[89]:
Second
one   -2.315312
two   -0.778303
dtype: float64
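
The same call can be sketched on a DataFrame shaped like the one in the question, with first and second as row levels. The frame and its integer values below are made up for illustration. Note that xs takes a cross-section: it keeps only the rows matching one label, so it filters the data rather than keeping every row under a single level. The drop_level parameter controls whether the matched level stays in the result.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the question's row levels and deterministic values.
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]],
    names=["first", "second"],
)
df = pd.DataFrame(np.arange(24).reshape(8, 3), index=index, columns=["A", "B", "C"])

# Select the rows labelled 'bar'; the matched level is dropped by default.
bar = df.xs("bar", level="first")

# Keep the matched level in the result instead of dropping it.
bar_kept = df.xs("bar", level="first", drop_level=False)

print(bar)
print(bar_kept)
```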

The setup for the examples is below:

import numpy as np
import pandas as pd

arrays = [
    np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
    np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']),
]
index = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
# Unstack first: this gives a Series indexed by three levels
# (the two column levels plus the original row labels).
df = df.unstack()
# Only now does the index have three levels to rename.
df.index.names = ['First', 'Second', 'Third']