Python: resetting a column's MultiIndex levels

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/14189695/

Time: 2020-08-18 10:42:54  Source: igfitidea

Reset a column's MultiIndex levels

python, pandas, dataframe

Asked by dmvianna

Is there a shorter way of dropping a column MultiIndex level (in my case, basic_amt) other than transposing it twice?

In [704]: test
Out[704]: 
           basic_amt               
Faculty          NSW  QLD  VIC  All
All                1    1    2    4
Full Time          0    1    0    1
Part Time          1    0    2    3

In [705]: test.reset_index(level=0, drop=True)
Out[705]: 
         basic_amt               
Faculty        NSW  QLD  VIC  All
0                1    1    2    4
1                0    1    0    1
2                1    0    2    3

In [711]: test.transpose().reset_index(level=0, drop=True).transpose()
Out[711]: 
Faculty    NSW  QLD  VIC  All
All          1    1    2    4
Full Time    0    1    0    1
Part Time    1    0    2    3

Accepted answer by jezrael

Another solution is to use MultiIndex.droplevel with rename_axis (new in pandas 0.18.0):

import pandas as pd

cols = pd.MultiIndex.from_arrays([['basic_amt']*4,
                                  ['NSW','QLD','VIC','All']],
                                 names = [None, 'Faculty'])
idx = pd.Index(['All', 'Full Time', 'Part Time'])

df = pd.DataFrame([(1,1,2,4),
                   (0,1,0,1),
                   (1,0,2,3)], index = idx, columns=cols)

print (df)
          basic_amt            
Faculty         NSW QLD VIC All
All               1   1   2   4
Full Time         0   1   0   1
Part Time         1   0   2   3

df.columns = df.columns.droplevel(0)
#pandas 0.18.0 and higher
df = df.rename_axis(None, axis=1)
#pandas below 0.18.0
#df.columns.name = None

print (df)
           NSW  QLD  VIC  All
All          1    1    2    4
Full Time    0    1    0    1
Part Time    1    0    2    3

print (df.columns)
Index(['NSW', 'QLD', 'VIC', 'All'], dtype='object')
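
On pandas 0.24 and later, the same result can probably be reached in one chained expression with DataFrame.droplevel, which takes an axis argument. This is a minimal sketch that rebuilds the frame from this answer; the variable name flat is only for illustration and the version requirement is an assumption about your environment:

```python
import pandas as pd

cols = pd.MultiIndex.from_arrays([['basic_amt'] * 4,
                                  ['NSW', 'QLD', 'VIC', 'All']],
                                 names=[None, 'Faculty'])
idx = pd.Index(['All', 'Full Time', 'Part Time'])
df = pd.DataFrame([(1, 1, 2, 4),
                   (0, 1, 0, 1),
                   (1, 0, 2, 3)], index=idx, columns=cols)

# Drop the outer column level; this returns a new frame and leaves df unchanged
flat = df.droplevel(0, axis=1)
# Remove the leftover 'Faculty' name from the columns axis
flat = flat.rename_axis(None, axis=1)

print(flat.columns.tolist())  # ['NSW', 'QLD', 'VIC', 'All']
```

Unlike assigning to df.columns, this does not modify the original frame, which can be convenient inside a method chain.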

If you need both levels in the column names, skip droplevel and join the levels with a list comprehension instead:

df.columns = ['_'.join(col) for col in df.columns]
print (df)
           basic_amt_NSW  basic_amt_QLD  basic_amt_VIC  basic_amt_All
All                    1              1              2              4
Full Time              0              1              0              1
Part Time              1              0              2              3

print (df.columns)
Index(['basic_amt_NSW', 'basic_amt_QLD', 'basic_amt_VIC', 'basic_amt_All'], dtype='object')
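
An equivalent spelling, sketched here under the assumption that every level label is a string, maps '_'.join over the MultiIndex directly. The second part shows one way to cope with non-string labels; the names mixed and joined are made up for the example:

```python
import pandas as pd

cols = pd.MultiIndex.from_arrays([['basic_amt'] * 4,
                                  ['NSW', 'QLD', 'VIC', 'All']],
                                 names=[None, 'Faculty'])
df = pd.DataFrame([(1, 1, 2, 4), (0, 1, 0, 1), (1, 0, 2, 3)],
                  index=['All', 'Full Time', 'Part Time'], columns=cols)

# Each element of a MultiIndex is a tuple, so '_'.join can be mapped over it
df.columns = df.columns.map('_'.join)
print(df.columns.tolist())

# If a level holds non-string labels, convert each part to str first
mixed = pd.MultiIndex.from_tuples([('amt', 2019), ('amt', 2020)])
joined = ['_'.join(map(str, col)) for col in mixed]
print(joined)  # ['amt_2019', 'amt_2020']
```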

Answer by unutbu

How about simply reassigning df.columns:

levels = df.columns.levels
codes = df.columns.codes  # named .labels before pandas 0.24
df.columns = levels[1][codes[1]]

For example:

import pandas as pd

columns = pd.MultiIndex.from_arrays([['basic_amt']*4,
                                     ['NSW','QLD','VIC','All']])
index = pd.Index(['All', 'Full Time', 'Part Time'], name = 'Faculty')
df = pd.DataFrame([(1,1,2,4),
                   (0,1,0,1),
                   (1,0,2,3)])
df.columns = columns
df.index = index

Before:

print(df)

           basic_amt               
                 NSW  QLD  VIC  All
Faculty                            
All                1    1    2    4
Full Time          0    1    0    1
Part Time          1    0    2    3

After:

levels = df.columns.levels
codes = df.columns.codes  # named .labels before pandas 0.24
df.columns = levels[1][codes[1]]
print(df)

           NSW  QLD  VIC  All
Faculty                      
All          1    1    2    4
Full Time    0    1    0    1
Part Time    1    0    2    3
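
The level lookup above can also be written with MultiIndex.get_level_values, which should return the same labels without touching the internal integer positions. A minimal sketch, assuming pandas 0.24 or later for the codes attribute; the names manual and direct are illustrative only:

```python
import pandas as pd

columns = pd.MultiIndex.from_arrays([['basic_amt'] * 4,
                                     ['NSW', 'QLD', 'VIC', 'All']])
index = pd.Index(['All', 'Full Time', 'Part Time'], name='Faculty')
df = pd.DataFrame([(1, 1, 2, 4), (0, 1, 0, 1), (1, 0, 2, 3)],
                  index=index, columns=columns)

# levels[1] holds the unique labels; codes[1] holds each column's position in it
manual = df.columns.levels[1][df.columns.codes[1]]
# get_level_values performs the same lookup in one call
direct = df.columns.get_level_values(1)

df.columns = direct
print(df.columns.tolist())  # ['NSW', 'QLD', 'VIC', 'All']
```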

Answer by firelynx

Zip levels together

Here is an alternative solution which zips the levels together and joins them with underscore.

It is derived from the answer above, and it is what I wanted to do when I found this question. I thought I would share it even though it does not answer the exact question asked.

["_".join(pair) for pair in df.columns]

gives

['basic_amt_NSW', 'basic_amt_QLD', 'basic_amt_VIC', 'basic_amt_All']

Just set this as the columns:

df.columns = ["_".join(pair) for pair in df.columns]

           basic_amt_NSW  basic_amt_QLD  basic_amt_VIC  basic_amt_All
Faculty                                                              
All                    1              1              2              4
Full Time              0              1              0              1
Part Time              1              0              2              3
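
When some columns have an empty second level (for example after moving the row index into the columns), a plain join leaves a trailing underscore. The sketch below is one possible way around that. The helper flatten_columns is hypothetical, not part of pandas, and the sample frame is invented for illustration:

```python
import pandas as pd

def flatten_columns(columns, sep="_"):
    """Join each column tuple into one string, skipping empty parts."""
    flat = []
    for col in columns:
        # Wrap plain (non-tuple) labels so both cases share one code path
        parts = col if isinstance(col, tuple) else (col,)
        flat.append(sep.join(str(p) for p in parts if p not in ("", None)))
    return flat

# Illustrative frame: the first column has an empty second level
cols = pd.MultiIndex.from_tuples([("Faculty", ""),
                                  ("basic_amt", "NSW"),
                                  ("basic_amt", "QLD")])
df = pd.DataFrame([("All", 1, 1), ("Full Time", 0, 1)], columns=cols)

df.columns = flatten_columns(df.columns)
print(df.columns.tolist())  # ['Faculty', 'basic_amt_NSW', 'basic_amt_QLD']
```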