pandas: Set level values in MultiIndex

Note: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/32892751/

Set level values in MultiIndex

Tags: python, pandas

Asked by LondonRob

How can I set the values of one level of a Series's MultiIndex, either by using a dictionary to replace the values, or with a list of values as long as the Series?

Here's a sample DataFrame:

     sector from_country to_country           0
0  Textiles          FRA        AUS   47.502096
1  Textiles          FRA        USA  431.890710
2  Textiles          GBR        AUS   83.500590
3  Textiles          GBR        USA  324.836158
4      Wood          FRA        AUS   27.515607
5      Wood          FRA        USA  276.501148
6      Wood          GBR        AUS    1.406096
7      Wood          GBR        USA    8.996177

Now set the index:

df = df.set_index(['sector', 'from_country', 'to_country']).squeeze()
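
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample data and sets the index. The variable name `s` is my own choice (the question reuses `df` for the resulting Series), and the values are simply copied from the table above.

```python
import pandas as pd

# Sample data copied from the table in the question
data = {
    "sector": ["Textiles"] * 4 + ["Wood"] * 4,
    "from_country": ["FRA", "FRA", "GBR", "GBR"] * 2,
    "to_country": ["AUS", "USA"] * 4,
    0: [47.502096, 431.890710, 83.500590, 324.836158,
        27.515607, 276.501148, 1.406096, 8.996177],
}
df = pd.DataFrame(data)

# With a single remaining column, squeeze() turns the frame into a Series
s = df.set_index(["sector", "from_country", "to_country"]).squeeze()
print(s)
```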

For example, if I wanted to change based on the following key/value pairs:

In [69]: replace_dict = {'FRA':'France', 'GBR':'UK'}
In [70]: new_vals = [replace_dict[x] for x in df.index.get_level_values('from_country')]

I would like a call along these lines to give the output below (`set_level_values` is not an existing pandas method, just what I have in mind):

In [68]: df.index.set_level_values(new_vals, level='from_country')
Out[68]: 
sector    from_country  to_country
Textiles  France        AUS            47.502096
                        USA           431.890710
          UK            AUS            83.500590
                        USA           324.836158
Wood      France        AUS            27.515607
                        USA           276.501148
          UK            AUS             1.406096
                        USA             8.996177

I currently do this, but it seems pretty dumb to me:

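def set_index_values(df_or_series, new_values, level):
    """
    Replace the MultiIndex level `level` with `new_values`

    `new_values` must be the same length as `df_or_series`
    """
    levels = df_or_series.index.names
    # Move the target level out of the index into an ordinary column
    retval = df_or_series.reset_index(level)
    retval[level] = new_values
    # Put the level back, restore the original level order, and sort
    # (sort_index replaces the old sortlevel, which current pandas no longer provides)
    retval = retval.set_index(level, append=True).reorder_levels(levels).sort_index().squeeze()
    return retval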
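
For the "list of values as long as the series" case, a possibly shorter route is to rebuild the index with `pd.MultiIndex.from_arrays`, swapping in the new values for one level. This is only a sketch: `replace_level` is a hypothetical helper name, not a pandas function, and the sample Series is reconstructed from the table above.

```python
import pandas as pd

def replace_level(obj, new_values, level):
    """Return a copy of `obj` whose MultiIndex level `level` holds `new_values`.

    Hypothetical helper; `new_values` needs one entry per row of `obj`.
    """
    idx = obj.index
    # Keep every level as-is except the one being replaced
    arrays = [
        list(new_values) if name == level else idx.get_level_values(name)
        for name in idx.names
    ]
    result = obj.copy()
    result.index = pd.MultiIndex.from_arrays(arrays, names=idx.names)
    return result

# Sample Series rebuilt from the table in the question
index = pd.MultiIndex.from_product(
    [["Textiles", "Wood"], ["FRA", "GBR"], ["AUS", "USA"]],
    names=["sector", "from_country", "to_country"],
)
s = pd.Series(
    [47.502096, 431.890710, 83.500590, 324.836158,
     27.515607, 276.501148, 1.406096, 8.996177],
    index=index, name=0,
)

replace_dict = {"FRA": "France", "GBR": "UK"}
new_vals = [replace_dict[x] for x in s.index.get_level_values("from_country")]
out = replace_level(s, new_vals, "from_country")
print(out)
```

Unlike the function above, this keeps the original row order and does not go through `reset_index`.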

Answered by Andy Hayden

Slightly hacky, but you can do this with .index.set_levels:

In [11]: df1.index.levels[1]
Out[11]: Index(['FRA', 'GBR'], dtype='object', name='from_country')

In [12]: df1.index.levels[1].map(replace_dict.get)
Out[12]: array(['France', 'UK'], dtype=object)

In [13]: df1.index = df1.index.set_levels(df1.index.levels[1].map(replace_dict.get), level="from_country")

In [14]: df1
Out[14]:
sector    from_country  to_country
Textiles  France        AUS            47.502096
                        USA           431.890710
          UK            AUS            83.500590
                        USA           324.836158
Wood      France        AUS            27.515607
                        USA           276.501148
          UK            AUS             1.406096
                        USA             8.996177
Name: 0, dtype: float64
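
The session above can be condensed into a self-contained script, sketched here under the assumption that the sample Series looks like the one in the question. The names `s` and `out` are my own. Note that `level` is passed by keyword, which recent pandas versions require for `set_levels`.

```python
import pandas as pd

# Sample Series rebuilt from the table in the question
index = pd.MultiIndex.from_product(
    [["Textiles", "Wood"], ["FRA", "GBR"], ["AUS", "USA"]],
    names=["sector", "from_country", "to_country"],
)
s = pd.Series(
    [47.502096, 431.890710, 83.500590, 324.836158,
     27.515607, 276.501148, 1.406096, 8.996177],
    index=index, name=0,
)
replace_dict = {"FRA": "France", "GBR": "UK"}

# Map only the unique labels stored in the level, not every row
new_level = s.index.levels[1].map(replace_dict.get)

out = s.copy()
out.index = out.index.set_levels(new_level, level="from_country")
print(out)
```

Because only the level's unique labels are mapped, the dictionary has to cover every label in that level; a missing key would leave a missing value in the level.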

Note: There is a way to get the level number from the name, but I don't recall it.
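
One way to look up the level number, offered as a sketch rather than as the method the answer had in mind: `MultiIndex.names` behaves like a list, so its `.index()` method returns the position of a name.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Textiles", "Wood"], ["FRA", "GBR"], ["AUS", "USA"]],
    names=["sector", "from_country", "to_country"],
)

# names is list-like, so list.index() gives the level's position
level_number = index.names.index("from_country")

# The position can then be used wherever levels[...] needs an integer
level_values = index.levels[level_number]
print(level_number, list(level_values))
```

Many MultiIndex methods, such as `get_level_values` and `set_levels`, also accept the level name directly, so the number is often not needed.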