How to rename all columns in a multi-level groupby in Pandas 0.20.1+

Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/43895292/

Time: 2020-09-14 03:35:01  Source: igfitidea

How do you rename all columns in a multi-level groupby in pandas 0.20.1+?

python pandas

Asked by Mark Doom

With the release of Pandas 0.20.1, passing a dictionary to groupby.agg() in order to rename the output columns has been deprecated.

Deprecation documentation

I'm trying to find the best way to update my code to account for this, but I'm struggling because of how heavily I currently rely on this rename functionality.

When I am doing an aggregate, I often have multiple functions for each source column, and I have been using this rename functionality to get to a single level index with these new column names.

Example:

import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})

In [30]: df
Out[30]: 
   A  B  C
0  1  0  0
1  1  1  1
2  1  2  2
3  2  3  3
4  2  4  4

frame = df.groupby('A').agg({'B' : {'foo':'sum'}, 'C': {'bar' : 'min', 'bar2': 'max'}})

Which results in:

Out[33]: 
    B   C     
  foo bar bar2
A             
1   3   0    2
2   7   3    4

I then typically flatten the result like this:

frame = pd.DataFrame(frame).reset_index(col_level=1)

frame.columns = frame.columns.get_level_values(1)

frame
Out[42]: 
   A  foo  bar  bar2
0  1    3    0     2
1  2    7    3     4

So I'm looking for good ways to get a result dataframe that has a single-level column index but new, unique column names, even where multiple columns come from aggregating a single source column. Any recommendations on the best approach are greatly appreciated.

Accepted answer by jezrael

This works perfectly in version 0.20.1:

d = {'sum':'foo','min':'bar','max':'bar2'}
frame = df.groupby('A').agg({'B' : ['sum'], 'C': ['min', 'max']}).rename(columns=d)
frame.columns = frame.columns.droplevel(0)
frame = frame.reset_index()
print(frame)
   A  foo  bar  bar2
0  1    3    0     2
1  2    7    3     4

If the same function (for example min) is applied to more than one column, join the two levels first so the names stay unique:

d = {'B_sum':'foo','C_min':'bar','C_max':'bar2'}
frame = df.groupby('A').agg({'B' : ['sum'], 'C': ['min', 'max']})
frame.columns = frame.columns.map('_'.join)
frame = frame.reset_index().rename(columns=d)
print(frame)
   A  foo  bar  bar2
0  1    3    0     2
1  2    7    3     4
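
The example above applies min only once, so here is a minimal self-contained sketch in which min is applied to both B and C. In that case dropping the top level alone would leave two columns named min, which is why the levels are joined first. The target name b_low is hypothetical and chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})

# 'min' is applied to two source columns, so the function name alone is ambiguous
frame = df.groupby('A').agg({'B': ['min', 'sum'], 'C': ['min', 'max']})

# Flatten the two-level columns to 'B_min', 'B_sum', 'C_min', 'C_max'
frame.columns = frame.columns.map('_'.join)

# Hypothetical target names, keyed by the flattened names
d = {'B_min': 'b_low', 'B_sum': 'foo', 'C_min': 'bar', 'C_max': 'bar2'}
frame = frame.reset_index().rename(columns=d)
print(frame)
```

Since rename(columns=d) leaves any name that is missing from d unchanged, a partial mapping still yields unique column names.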

Answer by MaxU

Here is a slightly shorter alternative:

In [78]: d={'C_min':'min_C', 'C_sum':'sum_C','B_min':'min_B','B_sum':'sum_B'}

In [79]: frame
Out[79]:
    C       B
  min sum min sum
A
1   0   3   0   3
2   3   7   3   7

In [80]: frame.columns = frame.columns.map('_'.join).to_series().map(d)

In [81]: frame
Out[81]:
   min_C  sum_C  min_B  sum_B
A
1      0      3      0      3
2      3      7      3      7
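
The session above starts from an existing frame and does not show how it was built. The following sketch assumes it came from a list-based agg call on the question's df. That construction is a guess; the rest follows the answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})

# Assumed construction of `frame`; the answer itself does not show this step
frame = df.groupby('A').agg({'C': ['min', 'sum'], 'B': ['min', 'sum']})

d = {'C_min': 'min_C', 'C_sum': 'sum_C', 'B_min': 'min_B', 'B_sum': 'sum_B'}

# Join each (column, function) pair into one string, then look it up in d
flat = frame.columns.map('_'.join)
frame.columns = flat.to_series().map(d)
print(frame)
```

One caveat: Series.map returns NaN for any flattened name that is missing from d, so the dictionary should cover every column, whereas rename(columns=d) would leave unmapped names as they are.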

Answer by EdChum

You could just call droplevel on the columns and then reset_index:

In [46]:
frame.columns = frame.columns.droplevel(0)
frame = frame.reset_index()
frame

Out[46]:
   A  bar  bar2  foo
0  1    0     2    3
1  2    3     4    7
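
Because the nested-dictionary call that produced frame in the question is deprecated, the sketch below builds an equivalent frame by hand, purely so the example runs on its own, and then applies droplevel and reset_index. It relies on the second-level names (foo, bar, bar2) already being unique.

```python
import pandas as pd

# Hand-built stand-in for the question's aggregated frame (an assumption)
columns = pd.MultiIndex.from_tuples([('B', 'foo'), ('C', 'bar'), ('C', 'bar2')])
frame = pd.DataFrame([[3, 0, 2], [7, 3, 4]],
                     index=pd.Index([1, 2], name='A'),
                     columns=columns)

# Drop the top level (the source column), keeping only the custom names
frame.columns = frame.columns.droplevel(0)

# Move the group key 'A' from the index back into a regular column
frame = frame.reset_index()
print(frame)
```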