Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/12358360/

Date: 2020-09-13 20:25:16  Source: igfitidea

Order columns of a pandas dataframe according to the values in a row

Tags: python, sorting, pandas

Asked by Curious

How do I order columns according to the values of the last row? In the example below, my final dataframe should have columns in the following order: 'ddd' 'aaa' 'ppp' 'fff'.

>>> import numpy as np
>>> from pandas import DataFrame
>>> df = DataFrame(np.random.randn(10, 4), columns=['ddd', 'fff', 'aaa', 'ppp'])
>>> df
        ddd       fff       aaa       ppp
0 -0.177438  0.102561 -1.318710  1.321252
1  0.980348  0.786721  0.374506 -1.411019
2  0.405112  0.514216  1.761983 -0.529482
3  1.659710 -1.017048 -0.737615 -0.388145
4 -0.472223  1.407655 -0.129119 -0.912974
5  1.221324 -0.656599  0.563152 -0.900710
6 -1.816420 -2.898094 -0.232047 -0.648904
7  2.793261  0.568760 -0.850100  0.654704
8 -2.180891  2.054178 -1.050897 -1.461458
9 -1.123756  1.245987 -0.239863  0.359759

Answered by DSM

[updated to simplify]

tl;dr:

In [29]: new_columns = df.columns[df.loc[df.last_valid_index()].argsort()]

In [30]: df[new_columns]
Out[30]: 
        aaa       ppp       fff       ddd
0  0.328281  0.375458  1.188905  0.503059
1  0.305457  0.186163  0.077681 -0.543215
2  0.684265  0.681724  0.210636 -0.532685
3 -1.134292  1.832272  0.067946  0.250131
4 -0.834393  0.010211  0.649963 -0.551448
5 -1.032405 -0.749949  0.442398  1.274599


Some explanation follows. First, build the DataFrame:

In [24]: df = pd.DataFrame(np.random.randn(6, 4), columns=['ddd', 'fff', 'aaa', 'ppp'])

In [25]: df
Out[25]: 
        ddd       fff       aaa       ppp
0  0.503059  1.188905  0.328281  0.375458
1 -0.543215  0.077681  0.305457  0.186163
2 -0.532685  0.210636  0.684265  0.681724
3  0.250131  0.067946 -1.134292  1.832272
4 -0.551448  0.649963 -0.834393  0.010211
5  1.274599  0.442398 -1.032405 -0.749949

Get the last row:

In [26]: last_row = df.loc[df.last_valid_index()]

Get the indices that would sort it:

In [27]: last_row.argsort()
Out[27]: 
ddd    2
fff    3
aaa    1
ppp    0
Name: 5, Dtype: int32

Use these positions to reorder the column labels, then index df with the result:

In [28]: df[df.columns[last_row.argsort()]]
Out[28]: 
        aaa       ppp       fff       ddd
0  0.328281  0.375458  1.188905  0.503059
1  0.305457  0.186163  0.077681 -0.543215
2  0.684265  0.681724  0.210636 -0.532685
3 -1.134292  1.832272  0.067946  0.250131
4 -0.834393  0.010211  0.649963 -0.551448
5 -1.032405 -0.749949  0.442398  1.274599

Profit!
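
For anyone who wants to run the steps above as a single script, here is a minimal sketch. It makes a few assumptions that are not in the answer: the data is a small hand-written frame rather than random numbers, the last row is taken with iloc[-1] instead of last_valid_index(), and the positions come from np.argsort on the row's NumPy values.

```python
import numpy as np
import pandas as pd

# Small hand-written frame (an assumption, chosen so the result is easy to check)
df = pd.DataFrame(
    {
        "ddd": [0.5, 1.3],
        "fff": [1.2, 0.4],
        "aaa": [0.3, -1.0],
        "ppp": [0.4, -0.7],
    }
)

# Last row as a Series indexed by column label
last_row = df.iloc[-1]

# Integer positions that would sort the last row in ascending order
order = np.argsort(last_row.to_numpy())

# Reorder the column labels by position, then select the columns in that order
new_columns = df.columns[order]
sorted_df = df[new_columns]

print(list(sorted_df.columns))  # ['aaa', 'ppp', 'fff', 'ddd']
```

The key point is that argsort returns positions, not labels, so they have to go through df.columns before being used to select columns.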

Answered by DSM

The sort_values method does this directly when given the axis=1 argument.

sorted_df = df.sort_values(df.last_valid_index(), axis=1)


So it is no longer necessary to transpose the DataFrame to sort by a row. Also, the sort method is deprecated in favor of sort_values (and was removed entirely in later pandas releases).
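
As a rough illustration of the axis=1 form, here is a small sketch on hand-written data (the frame and the variable names are assumptions for the example). It also shows descending order through the ascending=False parameter of sort_values.

```python
import pandas as pd

# Hand-written frame (an assumption) so the expected order is obvious
df = pd.DataFrame(
    {"ddd": [0.5, 1.3], "fff": [1.2, 0.4], "aaa": [0.3, -1.0], "ppp": [0.4, -0.7]}
)

# With axis=1, `by` is a row label and the columns are what get reordered
last_label = df.index[-1]

ascending = df.sort_values(by=last_label, axis=1)
descending = df.sort_values(by=last_label, axis=1, ascending=False)

print(list(ascending.columns))   # ['aaa', 'ppp', 'fff', 'ddd']
print(list(descending.columns))  # ['ddd', 'fff', 'ppp', 'aaa']
```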

Answered by sanguineturtle

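I would use transpose and the sort_values method (which sorts rows by the values in a column; older pandas versions spelled this sort):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4), columns=['ddd', 'fff', 'aaa', 'ppp'])
last_row_name = df.index[-1]
# Transpose so the last row becomes a column, sort by it, then transpose back
sorted_df = df.T.sort_values(by=last_row_name).T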

You might suffer a performance hit but it is quick and easy.
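
To check that the transpose route gives the same column order as sorting along axis=1, here is a small sketch on hand-written, all-float data (the frame and the variable names are assumptions). One caveat I'm fairly sure of: on a frame with mixed column dtypes, transposing upcasts to a common dtype (often object), so the round trip may not preserve dtypes.

```python
import numpy as np
import pandas as pd

# All-float, hand-written frame (an assumption), so transposing keeps float64
df = pd.DataFrame(
    {"ddd": [0.5, 1.3], "fff": [1.2, 0.4], "aaa": [0.3, -1.0], "ppp": [0.4, -0.7]}
)
last_row_name = df.index[-1]

# Route 1: transpose, sort rows by what used to be the last row, transpose back
via_transpose = df.T.sort_values(by=last_row_name).T

# Route 2: sort the columns directly by the last row
direct = df.sort_values(by=last_row_name, axis=1)

same_order = list(via_transpose.columns) == list(direct.columns)
same_values = np.allclose(via_transpose.to_numpy(), direct.to_numpy())
print(same_order, same_values)
```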

Answered by husimu

df = df[df.iloc[-1, :].sort_values().index]

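This works: sort_values orders the last row, and the index of the result is the column labels in sorted order.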
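
If you need this in more than one place, the one-liner can be wrapped in a small helper. This is only a sketch: the function name order_columns_by_row and its position and ascending parameters are made up for illustration, and it assumes the column labels are unique.

```python
import pandas as pd


def order_columns_by_row(frame, position=-1, ascending=True):
    """Return `frame` with its columns ordered by the values in one row.

    Hypothetical helper: `position` is an integer row position (default: last row).
    """
    row = frame.iloc[position]
    # The index of the sorted row is the column labels in sorted order
    ordered_labels = row.sort_values(ascending=ascending).index
    return frame[ordered_labels]


# Hand-written example data (an assumption)
df = pd.DataFrame(
    {"ddd": [0.5, 1.3], "fff": [1.2, 0.4], "aaa": [0.3, -1.0], "ppp": [0.4, -0.7]}
)

by_last = order_columns_by_row(df)
by_first_desc = order_columns_by_row(df, position=0, ascending=False)

print(list(by_last.columns))        # ['aaa', 'ppp', 'fff', 'ddd']
print(list(by_first_desc.columns))  # ['fff', 'ddd', 'ppp', 'aaa']
```

The helper returns a new frame and leaves the original untouched, so it is safe to call on a frame you still need in its original column order.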
