pandas: How to reorder multi-indexed DataFrame columns at a specific level

Question

Asked by Tim Whitcomb

I have a multi-indexed DataFrame with names attached to the column levels. I'd like to be able to easily shuffle the columns around so that they match the order specified by the user. Since this happens further down the pipeline, I'm not able to use this recommended solution and order them properly at creation time.

I have a data table that looks (something) like

Experiment           BASE           IWWGCW         IWWGDW
Lead Time                24     48      24     48      24     48
2010-11-27 12:00:00   0.997  0.991   0.998  0.990   0.998  0.990
2010-11-28 12:00:00   0.998  0.987   0.997  0.990   0.997  0.990
2010-11-29 12:00:00   0.997  0.992   0.997  0.992   0.997  0.992
2010-11-30 12:00:00   0.997  0.987   0.997  0.987   0.997  0.987
2010-12-01 12:00:00   0.996  0.986   0.996  0.986   0.996  0.986

I want to take in a list like ['IWWGCW', 'IWWGDW', 'BASE'] and reorder the columns to be:

Experiment           IWWGCW         IWWGDW         BASE           
Lead Time                24     48      24     48      24     48  
2010-11-27 12:00:00   0.998  0.990   0.998  0.990   0.997  0.991  
2010-11-28 12:00:00   0.997  0.990   0.997  0.990   0.998  0.987  
2010-11-29 12:00:00   0.997  0.992   0.997  0.992   0.997  0.992  
2010-11-30 12:00:00   0.997  0.987   0.997  0.987   0.997  0.987  
2010-12-01 12:00:00   0.996  0.986   0.996  0.986   0.996  0.986

with the caveat that I don't always know at what level "Experiment" will be. I tried (where df is the multi-indexed frame shown above):

df2 = df.reindex_axis(['IWWGCW', 'IWWGDW', 'BASE'], axis=1, level='Experiment')

but that didn't seem to work - it completed successfully, but the DataFrame that was returned had its column order unchanged.

My workaround is to have a function like:

def reorder_columns(frame, column_name, new_order):
    """Shuffle the specified columns of the frame to match new_order."""

    index_level  = frame.columns.names.index(column_name)
    new_position = lambda t: new_order.index(t[index_level])
    new_index    = sorted(frame.columns, key=new_position)
    new_frame    = frame.reindex_axis(new_index, axis=1)
    return new_frame

where reorder_columns(df, 'Experiment', ['IWWGCW', 'IWWGDW', 'BASE']) does what I expect, but it feels like I'm doing extra work. Is there an easier way to do this?
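
For readers on current pandas: reindex_axis was deprecated and later removed in pandas 1.0, so the workaround above no longer runs as written. A minimal sketch of the same idea using reindex(columns=...) follows. The small frame and its values are placeholders for illustration only, and level_name is a renamed parameter, not part of the original function.

```python
import pandas as pd


def reorder_columns(frame, level_name, new_order):
    """Return frame with its columns reordered along one named column level."""
    level = frame.columns.names.index(level_name)
    # sorted() is stable, so columns that share a label keep their relative order.
    new_tuples = sorted(frame.columns, key=lambda t: new_order.index(t[level]))
    new_columns = pd.MultiIndex.from_tuples(new_tuples, names=frame.columns.names)
    # reindex(columns=...) takes the place of reindex_axis(..., axis=1).
    return frame.reindex(columns=new_columns)


# Placeholder frame with the same column layout as the table in the question.
columns = pd.MultiIndex.from_product(
    [["BASE", "IWWGCW", "IWWGDW"], [24, 48]], names=["Experiment", "Lead Time"]
)
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=columns)

result = reorder_columns(df, "Experiment", ["IWWGCW", "IWWGDW", "BASE"])
print(result)
```

Because the level is looked up by name, the function does not need to know whether "Experiment" is the outer or the inner level.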

Answer 1

Accepted answer by Wes McKinney

I don't know of anything off-hand. Created an enhancement ticket about it:

http://github.com/pydata/pandas/issues/1864

Answer 2

Answered by ragesz

There is a very simple way: just create a new dataframe based on the original, with the correct order of multiindex columns:

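import pandas as pd

# List every (Experiment, Lead Time) column in the desired order.
multi_tuples = [('IWWGCW', 24), ('IWWGCW', 48), ('IWWGDW', 24), ('IWWGDW', 48),
                ('BASE', 24), ('BASE', 48)]

multi_cols = pd.MultiIndex.from_tuples(multi_tuples, names=['Experiment', 'Lead Time'])

# df_ori is the original frame; columns= picks its columns in the new order.
df_ordered_multi_cols = pd.DataFrame(df_ori, columns=multi_cols)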
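
Writing the tuples out by hand gets tedious as the number of lead times grows. As a sketch of one way to generate them, the example below builds the target columns with MultiIndex.from_product. It assumes that "Experiment" is the outer level and that every experiment has every lead time (otherwise reindex would introduce all-NaN columns). The one-row df_ori is a stand-in that reuses the first row of the question's table.

```python
import pandas as pd

# Stand-in for df_ori: first data row of the table in the question.
original_cols = pd.MultiIndex.from_product(
    [["BASE", "IWWGCW", "IWWGDW"], [24, 48]], names=["Experiment", "Lead Time"]
)
df_ori = pd.DataFrame(
    [[0.997, 0.991, 0.998, 0.990, 0.998, 0.990]], columns=original_cols
)

experiment_order = ["IWWGCW", "IWWGDW", "BASE"]
# Reuse whatever lead times are already present, in their existing order.
lead_times = df_ori.columns.get_level_values("Lead Time").unique()

# Cartesian product: each experiment paired with each lead time.
multi_cols = pd.MultiIndex.from_product(
    [experiment_order, lead_times], names=["Experiment", "Lead Time"]
)
df_ordered = df_ori.reindex(columns=multi_cols)
print(df_ordered)
```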

Answer 3

回答by n1000

The comment by andrew_reece should be the accepted answer. Simply use reindex().

Copying and pasting from the GitHub issue:

>>> df
                     vals
first second third       
mid   3rd    992     1.96
             562    12.06
      1st    73     -6.46
             818   -15.75
             658     5.90
btm   2nd    915     9.75
             474    -1.47
             905    -6.03
      1st    717     8.01
             909   -21.12
      3rd    616    11.91
             675     1.06
             579    -4.01
top   1st    241     1.79
             363     1.71
      3rd    677    13.38
             238   -16.77
             407    17.19
      2nd    728   -21.55
             36      8.09
>>> df.reindex(['top', 'mid', 'btm'], level='first')
                     vals
first second third       
top   1st    241     1.79
             363     1.71
      3rd    677    13.38
             238   -16.77
             407    17.19
      2nd    728   -21.55
             36      8.09
mid   3rd    992     1.96
             562    12.06
      1st    73     -6.46
             818   -15.75
             658     5.90
btm   2nd    915     9.75
             474    -1.47
             905    -6.03
      1st    717     8.01
             909   -21.12
      3rd    616    11.91
             675     1.06
             579    -4.01
>>> df.reindex(['1st', '2nd', '3rd'], level='second')
                     vals
first second third       
mid   1st    73     -6.46
             818   -15.75
             658     5.90
      3rd    992     1.96
             562    12.06
btm   1st    717     8.01
             909   -21.12
      2nd    915     9.75
             474    -1.47
             905    -6.03
      3rd    616    11.91
             675     1.06
             579    -4.01
top   1st    241     1.79
             363     1.71
      2nd    728   -21.55
             36      8.09
      3rd    677    13.38
             238   -16.77
             407    17.19
>>> df.reindex(['top', 'btm'], level='first').reindex(['1st', '2nd'], level='second')
                     vals
first second third       
top   1st    241     1.79
             363     1.71
      2nd    728   -21.55
             36      8.09
btm   1st    717     8.01
             909   -21.12
      2nd    915     9.75
             474    -1.47
             905    -6.03
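
The examples above reorder rows. For the column case in the question, the same call can be pointed at the columns. This is a sketch that assumes a pandas version where reindex honors level= for column MultiIndexes, which is what the enhancement ticket asked for. The frame below is a placeholder with the question's column layout and made-up integer values.

```python
import numpy as np
import pandas as pd

# Placeholder frame: same column levels as the question, dummy values.
columns = pd.MultiIndex.from_product(
    [["BASE", "IWWGCW", "IWWGDW"], [24, 48]], names=["Experiment", "Lead Time"]
)
df = pd.DataFrame(np.arange(12).reshape(2, 6), columns=columns)

# Reorder along the named level; its position in the MultiIndex is not needed.
df2 = df.reindex(columns=["IWWGCW", "IWWGDW", "BASE"], level="Experiment")
print(df2)
```

Naming the level addresses the caveat about not knowing where "Experiment" sits. As the level='second' output above shows, reordering an inner level happens within each outer group rather than across the whole axis.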
