Custom sorting with Pandas

Question

Asked by Blark

I have the following dataframe that I would like to sort first by Criticality and then by Name:

Name        Criticality
baz         High
foo         Critical
baz         Low
foo         Medium
bar         High
bar         Low
bar         Medium
...

I've been trying to do this using the answer provided in this post, but I just can't get it to work.

The end result should look like this:

Name        Criticality
bar         High
bar         Medium
bar         Low
baz         High
baz         Low
foo         Critical
foo         Medium

Answer 1

Answered by EdChum

One approach is to use a custom dict to create a 'rank' column, sort on that column, and then drop it after sorting:

In [17]:
custom_dict = {'Critical':0, 'High':1, 'Medium':2, 'Low':3}  
df['rank'] = df['Criticality'].map(custom_dict)
df

Out[17]:

  Name Criticality  rank
0  baz        High     1
1  foo    Critical     0
2  baz         Low     3
3  foo      Medium     2
4  bar        High     1
5  bar         Low     3
6  bar      Medium     2

[7 rows x 3 columns]

In [19]:
# now sort by 'Name' first and then by 'rank'
# (df.sort(columns=...) no longer exists in current pandas; sort_values replaces it)
df.sort_values(by=['Name', 'rank'], inplace=True)
df

Out[19]:

  Name Criticality  rank
4  bar        High     1
6  bar      Medium     2
5  bar         Low     3
0  baz        High     1
2  baz         Low     3
1  foo    Critical     0
3  foo      Medium     2

[7 rows x 3 columns]

In [21]:
# now drop the 'rank' column (drop returns a new DataFrame; df itself is unchanged)
df.drop(labels=['rank'],axis=1)

Out[21]:

  Name Criticality
4  bar        High
6  bar      Medium
5  bar         Low
0  baz        High
2  baz         Low
1  foo    Critical
3  foo      Medium

[7 rows x 2 columns]
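
For readers on a recent pandas, the same idea can be sketched without the temporary 'rank' column. This is an alternative to the steps above, not the answer's own method: it assumes pandas 1.1 or later, where sort_values accepts a key function, and the helper name sort_key is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["baz", "foo", "baz", "foo", "bar", "bar", "bar"],
    "Criticality": ["High", "Critical", "Low", "Medium", "High", "Low", "Medium"],
})

custom_dict = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

def sort_key(col):
    # Hypothetical helper: pandas calls it once per sort column.
    # Only 'Criticality' is translated to its rank; 'Name' sorts as-is.
    if col.name == "Criticality":
        return col.map(custom_dict)
    return col

# Requires pandas >= 1.1 (the `key` parameter of sort_values).
result = df.sort_values(by=["Name", "Criticality"], key=sort_key)
print(result)
```

The frame keeps its original two columns, so no drop step is needed afterwards.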

Answer 2

Answered by user5843090

It works for me using pd.Categorical:

In [114]: df = pd.DataFrame({
          'Name' : ["baz","foo","baz","foo","bar","bar","bar"],
          'Criticality' : ["hi", "crt", "lo", "med", "hi", "lo", "med"]
          })

     ...: df['Criticality'] = pd.Categorical(df['Criticality'], ["crt","hi", "med", "lo"])

     ...: df.sort_values(['Name','Criticality'])
Out[114]: 
  Name Criticality
4  bar          hi
6  bar         med
5  bar          lo
0  baz          hi
2  baz          lo
1  foo         crt
3  foo         med
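
The transcript above uses abbreviated labels. A minimal sketch of the same pd.Categorical technique applied to the labels from the question might look like this; the variable name levels and the choice of ordered=True are assumptions added here, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["baz", "foo", "baz", "foo", "bar", "bar", "bar"],
    "Criticality": ["High", "Critical", "Low", "Medium", "High", "Low", "Medium"],
})

# Category order defines the sort order: most critical first.
levels = ["Critical", "High", "Medium", "Low"]
df["Criticality"] = pd.Categorical(df["Criticality"], categories=levels, ordered=True)

result = df.sort_values(["Name", "Criticality"])
print(result)
```

One thing to keep in mind: any value not listed in the categories becomes NaN when the column is converted, so the list should cover every label that appears in the data.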
