How can I reorder multi-indexed dataframe columns at a specific level
Source: http://stackoverflow.com/questions/11194610/
This page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Tim Whitcomb
I have a multi-indexed DataFrame with names attached to the column levels. I'd like to be able to easily shuffle the columns around so that they match the order specified by the user. Since this happens further down the pipeline, I'm not able to use this recommended solution and order them properly at creation time.
I have a data table that looks something like
Experiment           BASE          IWWGCW        IWWGDW
Lead Time            24     48     24     48     24     48
2010-11-27 12:00:00  0.997  0.991  0.998  0.990  0.998  0.990
2010-11-28 12:00:00  0.998  0.987  0.997  0.990  0.997  0.990
2010-11-29 12:00:00  0.997  0.992  0.997  0.992  0.997  0.992
2010-11-30 12:00:00  0.997  0.987  0.997  0.987  0.997  0.987
2010-12-01 12:00:00  0.996  0.986  0.996  0.986  0.996  0.986
I want to take in a list like ['IWWGCW', 'IWWGDW', 'BASE'] and reorder this to be:
Experiment           IWWGCW        IWWGDW        BASE
Lead Time            24     48     24     48     24     48
2010-11-27 12:00:00  0.998  0.990  0.998  0.990  0.997  0.991
2010-11-28 12:00:00  0.997  0.990  0.997  0.990  0.998  0.987
2010-11-29 12:00:00  0.997  0.992  0.997  0.992  0.997  0.992
2010-11-30 12:00:00  0.997  0.987  0.997  0.987  0.997  0.987
2010-12-01 12:00:00  0.996  0.986  0.996  0.986  0.996  0.986
with the caveat that I don't always know at what level "Experiment" will be. I tried (where df is the multi-indexed frame shown above)
df2 = df.reindex_axis(['IWWGCW', 'IWWGDW', 'BASE'], axis=1, level='Experiment')
but that didn't seem to work: it completed successfully, but the DataFrame that was returned had its column order unchanged.
My workaround is to have a function like:
def reorder_columns(frame, column_name, new_order):
    """Shuffle the specified columns of the frame to match new_order."""
    # Position of the named level within each column tuple.
    index_level = frame.columns.names.index(column_name)
    # Sort key: where a column's label at that level falls in new_order.
    new_position = lambda t: new_order.index(t[index_level])
    new_index = sorted(frame.columns, key=new_position)
    # Note: reindex_axis was removed in pandas 1.0.
    new_frame = frame.reindex_axis(new_index, axis=1)
    return new_frame
where reorder_columns(df, 'Experiment', ['IWWGCW', 'IWWGDW', 'BASE']) does what I expect, but it feels like I'm doing extra work. Is there an easier way to do this?
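For readers on current pandas, where reindex_axis is no longer available, the same idea can be sketched with positional selection. This is only an illustration: the helper name reorder_columns_by_level and the small integer-valued frame are made up for the example and are not part of the question. Because the level is looked up by name, it should not matter where "Experiment" sits among the column levels.

```python
import numpy as np
import pandas as pd


def reorder_columns_by_level(frame, level_name, new_order):
    """Return frame with its columns grouped in new_order at the named level."""
    labels = frame.columns.get_level_values(level_name)
    rank = {label: i for i, label in enumerate(new_order)}
    # sorted() is stable, so columns keep their relative order within each label.
    positions = sorted(range(len(labels)), key=lambda i: rank[labels[i]])
    return frame.iloc[:, positions]


# Made-up stand-in for the frame in the question.
columns = pd.MultiIndex.from_product(
    [["BASE", "IWWGCW", "IWWGDW"], [24, 48]],
    names=["Experiment", "Lead Time"],
)
df = pd.DataFrame(np.arange(12).reshape(2, 6), columns=columns)

reordered = reorder_columns_by_level(df, "Experiment", ["IWWGCW", "IWWGDW", "BASE"])
print(reordered.columns.tolist())
```

Selecting by position with iloc leaves the column level names untouched, which is why this sketch avoids rebuilding the MultiIndex.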
Accepted answer by Wes McKinney
I don't know of anything off-hand. Created an enhancement ticket about it:
Answer by ragesz
There is a very simple way: just create a new dataframe based on the original, with the correct order of multiindex columns:
import pandas as pd

# Column tuples listed in the desired order; df_ori is the original frame.
multi_tuples = [('IWWGCW', 24), ('IWWGCW', 48), ('IWWGDW', 24), ('IWWGDW', 48),
                ('BASE', 24), ('BASE', 48)]
multi_cols = pd.MultiIndex.from_tuples(multi_tuples, names=['Experiment', 'Lead Time'])
df_ordered_multi_cols = pd.DataFrame(df_ori, columns=multi_cols)
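Typing every tuple by hand gets tedious as the number of columns grows. As a variation on this answer, the target columns could be generated with pd.MultiIndex.from_product. This is a minimal sketch that assumes every experiment has the same lead times; the integer-valued df_ori below is a made-up stand-in for the real frame.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the original frame.
original_cols = pd.MultiIndex.from_product(
    [["BASE", "IWWGCW", "IWWGDW"], [24, 48]],
    names=["Experiment", "Lead Time"],
)
df_ori = pd.DataFrame(np.arange(12).reshape(2, 6), columns=original_cols)

# Build the desired column order as the product of the two levels.
multi_cols = pd.MultiIndex.from_product(
    [["IWWGCW", "IWWGDW", "BASE"], [24, 48]],
    names=["Experiment", "Lead Time"],
)
df_ordered_multi_cols = df_ori.reindex(columns=multi_cols)
print(df_ordered_multi_cols)
```

Note that any column named in multi_cols but absent from df_ori would come back filled with NaN, so this approach fits best when the full set of columns is known.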
Answer by n1000
The comment by andrew_reece should be the accepted answer. Simply use reindex().
Copied from the GitHub issue:
>>> df
                      vals
first second third
mid   3rd    992      1.96
             562     12.06
      1st    73      -6.46
             818    -15.75
             658      5.90
btm   2nd    915      9.75
             474     -1.47
             905     -6.03
      1st    717      8.01
             909    -21.12
      3rd    616     11.91
             675      1.06
             579     -4.01
top   1st    241      1.79
             363      1.71
      3rd    677     13.38
             238    -16.77
             407     17.19
      2nd    728    -21.55
             36       8.09
>>> df.reindex(['top', 'mid', 'btm'], level='first')
                      vals
first second third
top   1st    241      1.79
             363      1.71
      3rd    677     13.38
             238    -16.77
             407     17.19
      2nd    728    -21.55
             36       8.09
mid   3rd    992      1.96
             562     12.06
      1st    73      -6.46
             818    -15.75
             658      5.90
btm   2nd    915      9.75
             474     -1.47
             905     -6.03
      1st    717      8.01
             909    -21.12
      3rd    616     11.91
             675      1.06
             579     -4.01
>>> df.reindex(['1st', '2nd', '3rd'], level='second')
                      vals
first second third
mid   1st    73      -6.46
             818    -15.75
             658      5.90
      3rd    992      1.96
             562     12.06
btm   1st    717      8.01
             909    -21.12
      2nd    915      9.75
             474     -1.47
             905     -6.03
      3rd    616     11.91
             675      1.06
             579     -4.01
top   1st    241      1.79
             363      1.71
      2nd    728    -21.55
             36       8.09
      3rd    677     13.38
             238    -16.77
             407     17.19
>>> df.reindex(['top', 'btm'], level='first').reindex(['1st', '2nd'], level='second')
                      vals
first second third
top   1st    241      1.79
             363      1.71
      2nd    728    -21.55
             36       8.09
btm   1st    717      8.01
             909    -21.12
      2nd    915      9.75
             474     -1.47
             905     -6.03
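The GitHub example reorders the row index, whereas the question is about columns. The same call should carry over by passing columns= together with level=. The following is a minimal sketch under that assumption, using a made-up integer-valued frame shaped like the one in the question; it has not been checked against every pandas version.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the frame in the question.
columns = pd.MultiIndex.from_product(
    [["BASE", "IWWGCW", "IWWGDW"], [24, 48]],
    names=["Experiment", "Lead Time"],
)
df = pd.DataFrame(np.arange(12).reshape(2, 6), columns=columns)

# Reorder only the "Experiment" level of the columns; the level is
# addressed by name, so its position among the levels is not needed.
df2 = df.reindex(columns=["IWWGCW", "IWWGDW", "BASE"], level="Experiment")
print(df2)
```

As in the row-index output above, labels left out of the list are dropped from the result, so the list should name every experiment that needs to be kept.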

