Python pandas pivot table column names

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/33290374/

Date: 2020-08-19 13:09:29  Source: igfitidea

pandas pivot_table column names

Tags: python, pandas, pivot-table, reshape

Asked by muon

For a dataframe like this:


import numpy as np
import pandas as pd

d = {'id': [1, 1, 1, 2, 2], 'Month': [1, 2, 3, 1, 3], 'Value': [12, 23, 15, 45, 34], 'Cost': [124, 214, 1234, 1324, 234]}
df = pd.DataFrame(d)

     Cost  Month  Value  id  
0    124       1     12   1  
1    214       2     23   1  
2    1234      3     15   1  
3    1324      1     45   2  
4    234       3     34   2  

to which I apply pivot_table:

df2 = pd.pivot_table(df,
                     values=['Value', 'Cost'],
                     index=['id'],
                     columns=['Month'],
                     aggfunc=np.sum,
                     fill_value=0)

to get df2:


       Cost            Value          
Month     1    2     3     1   2   3   
id                                  
1       124  214  1234    12  23  15
2      1324    0   234    45   0  34

Is there an easy way to format the resulting dataframe's column names like this?

id  Cost1  Cost2  Cost3  Value1  Value2  Value3
1     124    214   1234      12      23      15
2    1324      0    234      45       0      34

If I do:


df2.columns = [s1 + str(s2) for (s1, s2) in df2.columns.tolist()]

I get:


    Cost1  Cost2  Cost3  Value1  Value2  Value3
id                                             
1     124    214   1234      12      23      15
2    1324      0    234      45       0      34

How do I get rid of the extra level?

Thanks!

Accepted answer by muon

Using clues from @chrisb's answer, this gave me exactly what I was after:


df2.reset_index(inplace=True)

which gives:


id  Cost1  Cost2  Cost3  Value1  Value2  Value3
1     124    214   1234      12      23      15
2    1324      0    234      45       0      34
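
For readers who want to reproduce the whole flow in one place, here is a minimal sketch that strings together the question's pivot, the column flattening, and the reset_index call. It assumes a reasonably recent pandas; the string 'sum' is swapped in for np.sum as the aggregation, and the name flat is only illustrative.

```python
import pandas as pd

d = {'id': [1, 1, 1, 2, 2], 'Month': [1, 2, 3, 1, 3],
     'Value': [12, 23, 15, 45, 34], 'Cost': [124, 214, 1234, 1324, 234]}
df = pd.DataFrame(d)

# Pivot: one row per id, one (measure, Month) column pair per combination.
df2 = pd.pivot_table(df, values=['Value', 'Cost'], index=['id'],
                     columns=['Month'], aggfunc='sum', fill_value=0)

# Flatten the two-level column index into plain strings such as 'Cost1'.
df2.columns = [name + str(month) for name, month in df2.columns]

# Move 'id' out of the index so it becomes an ordinary column.
# 'flat' is a hypothetical name; reset_index returns a new frame here.
flat = df2.reset_index()
print(flat)
```

Using the non-inplace form keeps df2 indexed by id, which can be handy if later steps still need to look rows up by id.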

For the case of multiple index columns, this post explains it well. Just to be complete, here is how:

df2.columns = [' '.join(col).strip() for col in df2.columns.values]
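
One caveat with the join approach: str.join only accepts strings, so it fails when a column level holds integers, as the Month level does in this question. A hedged variant that converts each level to a string first might look like the sketch below. The frame wide and the '_' separator are illustrative choices, not part of the original answer.

```python
import pandas as pd

# A small stand-in for the pivoted frame, with an integer second level.
cols = pd.MultiIndex.from_product([['Cost', 'Value'], [1, 2, 3]],
                                  names=[None, 'Month'])
wide = pd.DataFrame([[124, 214, 1234, 12, 23, 15],
                     [1324, 0, 234, 45, 0, 34]],
                    index=pd.Index([1, 2], name='id'), columns=cols)

# Convert every level value to str before joining; strip a stray
# separator in case one of the levels is an empty string.
wide.columns = ['_'.join(map(str, col)).strip('_')
                for col in wide.columns.to_flat_index()]
print(wide.columns.tolist())
```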

Answer by chrisb

'id' is the index name, which you can set to None to remove it.

In [35]: df2.index.name = None

In [36]: df2
Out[36]: 
   Cost1  Cost2  Cost3  Value1  Value2  Value3
1    124    214   1234      12      23      15
2   1324      0    234      45       0      34
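
If mutating df2 in place is undesirable, rename_axis offers a non-mutating way to get the same effect. The following is a small sketch on a cut-down stand-in for df2; the name unnamed is hypothetical.

```python
import pandas as pd

# Cut-down stand-in for the flattened df2, indexed by 'id'.
df2 = pd.DataFrame({'Cost1': [124, 1324], 'Value1': [12, 45]},
                   index=pd.Index([1, 2], name='id'))

# rename_axis(None) returns a copy whose index has no name;
# df2 itself keeps 'id' as its index name.
unnamed = df2.rename_axis(None)
print(unnamed)
```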