Python: how to sort a pandas DataFrame by one column

Disclaimer: this page is a side-by-side Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/37787698/

Time: 2020-08-19 19:55:51  Source: igfitidea

How to sort a pandas DataFrame by one column

Tags: python, pandas, sorting

Asked by Sachila Ranawaka

I have a data frame like this:

print(df)

        0          1     2
0   354.7      April   4.0
1    55.4     August   8.0
2   176.5   December  12.0
3    95.5   February   2.0
4    85.6    January   1.0
5     152       July   7.0
6   238.7       June   6.0
7   104.8      March   3.0
8   283.5        May   5.0
9   278.8   November  11.0
10  249.6    October  10.0
11  212.7  September   9.0

As you can see, months are not in calendar order. So I created a second column to get the month number corresponding to each month (1-12). From there, how can I sort this data frame according to calendar months' order?

Answered by EdChum

Use sort_values to sort the df by a specific column's values:

In [18]:
df.sort_values('2')

Out[18]:
        0          1     2
4    85.6    January   1.0
3    95.5   February   2.0
7   104.8      March   3.0
0   354.7      April   4.0
8   283.5        May   5.0
6   238.7       June   6.0
5   152.0       July   7.0
1    55.4     August   8.0
11  212.7  September   9.0
10  249.6    October  10.0
9   278.8   November  11.0
2   176.5   December  12.0
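
For a self-contained version of the above, here is a minimal sketch that rebuilds the question's data and sorts it. It assumes the column labels are the integers 0, 1, 2, which is what pandas assigns when no column names are given. If your labels are the strings '0', '1', '2', pass '2' instead, as in the answer above. A label of the wrong type raises a KeyError.

```python
import pandas as pd

# Rebuild the question's data; column labels default to the integers 0, 1, 2
# (an assumption about how the original frame was created).
rows = [
    (354.7, "April", 4.0), (55.4, "August", 8.0), (176.5, "December", 12.0),
    (95.5, "February", 2.0), (85.6, "January", 1.0), (152.0, "July", 7.0),
    (238.7, "June", 6.0), (104.8, "March", 3.0), (283.5, "May", 5.0),
    (278.8, "November", 11.0), (249.6, "October", 10.0), (212.7, "September", 9.0),
]
df = pd.DataFrame(rows)

# sort_values returns a new DataFrame; df itself is left unchanged.
by_month = df.sort_values(2)

# Optionally renumber the index 0..11 after sorting.
by_month_reset = by_month.reset_index(drop=True)

print(by_month)
```

Note that sort_values does not modify df in place by default, so assign the result to a name if you want to keep it.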

If you want to sort by two columns, pass a list of column labels to sort_values, ordered by sort priority. For example, df.sort_values(['2', '0']) sorts by column 2 and then by column 0. Granted, this does not really make sense for this example because each value in df['2'] is unique.
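
To see the second sort key take effect, the first key needs repeated values. The sketch below uses hypothetical data and column names (month_num, value) that are not from the question, chosen only so that ties occur.

```python
import pandas as pd

# Hypothetical data with repeated values in "month_num" so the second key matters.
df = pd.DataFrame({
    "value": [30.0, 10.0, 20.0, 5.0],
    "month_num": [2, 1, 2, 1],
})

# Sort by month_num first, then by value within each month_num.
result = df.sort_values(["month_num", "value"])
print(result)
```

Rows that share the same month_num are ordered by value, which is the only place the second label has any effect.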

Answered by Joel Carneiro

I tried the solutions above and could not get them to work, so I found a different solution that works for me. Passing ascending=False sorts the dataframe in descending order; the default is True (ascending). I am using Python 3.6.6 and pandas 0.23.4.

final_df = df.sort_values(by=['2'], ascending=False)

You can see more details in the pandas documentation for DataFrame.sort_values.
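
As a small illustration of the ascending parameter, the sketch below sorts hypothetical data (the column names are illustrative, not from the question) in descending order, and also shows that ascending accepts a list with one flag per sort key.

```python
import pandas as pd

# Hypothetical data; column names are illustrative only.
df = pd.DataFrame({
    "month_num": [3, 1, 2, 1],
    "value": [7.0, 9.0, 8.0, 4.0],
})

# Single key, descending order.
desc = df.sort_values(by="month_num", ascending=False)

# Two keys: month_num ascending, value descending within each month_num.
mixed = df.sort_values(by=["month_num", "value"], ascending=[True, False])

print(desc)
print(mixed)
```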

Answered by Harry_pb

Here are a few more operations on the data. Suppose we have a dataframe df; we can chain several operations to get the desired output:

ID         cost      tax    label
1       216590      1600    test
2       523213      1800    test
3          250      1500    experiment

(df['label'].value_counts().to_frame().reset_index()).sort_values('label', ascending=False)

This gives the sorted label counts as a dataframe:

        index  label
0        test      2
1  experiment      1
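
The column names produced by value_counts().to_frame().reset_index() differ between pandas versions: older releases give index and label as shown above, while pandas 2.x gives label and count. The sketch below is one way to get the same column names on either; the name count is my own choice, not something from the answer.

```python
import pandas as pd

# The answer's example data.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "cost": [216590, 523213, 250],
    "tax": [1600, 1800, 1500],
    "label": ["test", "test", "experiment"],
})

counts = (
    df["label"]
    .value_counts()               # counts per label, largest first by default
    .rename_axis("label")         # name the index so reset_index yields a "label" column
    .reset_index(name="count")    # the counts go into a column named "count"
    .sort_values("count", ascending=False)
)
print(counts)
```

Since value_counts already returns counts in descending order, the final sort_values is only there to make the ordering explicit.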

Answered by alireza yazdandoost

Just as another solution:

You can convert the string data (month names) to an ordered categorical and sort by it, like this:

import pandas as pd

# Categories are listed in reverse calendar order, so a descending sort puts January first.
df = df.rename(columns={1: 'month'})
df['month'] = pd.Categorical(df['month'], categories=['December','November','October','September','August','July','June','May','April','March','February','January'], ordered=True)
df = df.sort_values('month', ascending=False)

This sorts the data by month name, in the order you specified when creating the Categorical object.
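
The answer lists the categories in reverse calendar order and then sorts descending. An equivalent sketch, which some may find easier to read, lists the months in calendar order and sorts ascending. The sample rows are a subset of the question's data, and the column names value and month are my own labels.

```python
import pandas as pd

# Month names in calendar order; this list defines the sort order.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# A few rows from the question, with illustrative column names.
df = pd.DataFrame({
    "value": [354.7, 55.4, 176.5, 85.6],
    "month": ["April", "August", "December", "January"],
})

# An ordered categorical sorts by position in MONTHS, not alphabetically.
df["month"] = pd.Categorical(df["month"], categories=MONTHS, ordered=True)
by_calendar = df.sort_values("month")
print(by_calendar)
```

With this approach no helper column of month numbers is needed. Any value not present in the category list becomes NaN, so the list should cover every month name that appears in the data.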