pandas: sort one column by the values in another column

Question

Asked by John Shin

I have a dataset that I want to sort, and then assign ranks based on the sorted order.

Suppose it has two columns, one is year and the other is the column that I want to sort.

import pandas as pd
data = {'year': pd.Series([2006, 2006, 2007, 2007]), 
        'value': pd.Series([5, 10, 4, 1])}
df = pd.DataFrame(data)

Within each year, I want to sort the 'value' column in descending order and then assign a rank. The result I would like is:

data2 = {'year': pd.Series([2006, 2006, 2007, 2007]),
         'value': pd.Series([10, 5, 4, 1]),
         'rank': pd.Series([1, 2, 1, 2])}
df2 = pd.DataFrame(data2)

>>> df2
   rank  value  year
0     1     10  2006
1     2      5  2006
2     1      4  2007
3     2      1  2007

Answer 1

Answered by Alexander

You can use groupby and then rank (with ascending=False so that the largest value in each group gets rank 1). There is no need to sort within the groupby (hence sort=False): the result is aligned to the dataframe's original index, and skipping the sort is slightly faster.

df['yearly_rank'] = df.groupby('year', sort=False)['value'].rank(ascending=False)

>>> df.sort_values(['year', 'yearly_rank'])
   value  year  yearly_rank
1     10  2006            1
0      5  2006            2
2      4  2007            1
3      1  2007            2
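Note that rank returns floats and, by default, gives tied values their average rank. If whole-number ranks like those in the question are wanted, one possible variation (a sketch, assuming ties should be broken by order of appearance and that 'value' has no missing entries) is to pass method='first' and cast to int:

```python
import pandas as pd

df = pd.DataFrame({'year': [2006, 2006, 2007, 2007],
                   'value': [5, 10, 4, 1]})

# Rank 'value' within each year, largest first. method='first' breaks ties
# by order of appearance, so every rank is a whole number and can be cast.
df['yearly_rank'] = (df.groupby('year', sort=False)['value']
                       .rank(method='first', ascending=False)
                       .astype(int))

# Order rows by year, then by rank, and renumber the index from 0.
result = df.sort_values(['year', 'yearly_rank']).reset_index(drop=True)
print(result)
```

The cast to int would fail if 'value' contained NaN, because rank leaves those rows as NaN by default.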

Answer 2

Answered by Parfait

Consider a groupby apply function followed by a sort:

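def rankfct(group):
    # Rank 'value' within one year's group, largest first.
    group['rank'] = group['value'].rank(ascending=False)
    return group

# DataFrame.sort was removed in pandas 0.20; sort_values replaces it.
df = (df.groupby('year', group_keys=False).apply(rankfct)
        .sort_values(['year', 'value'], ascending=[True, False]))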
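The behaviour of groupby apply has changed across pandas releases, so it may be worth knowing an alternative that avoids apply. The following is a sketch of a different technique, not the approach this answer uses: sort first, then number the rows within each year using cumcount. It assumes ties should be ranked by their position after sorting; the name ranked is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'year': [2006, 2006, 2007, 2007],
                   'value': [5, 10, 4, 1]})

# Sort by year ascending and value descending, then renumber the index.
ranked = (df.sort_values(['year', 'value'], ascending=[True, False])
            .reset_index(drop=True))

# cumcount numbers the rows of each group 0, 1, 2, ... in their current
# order; adding 1 turns that into a 1-based rank within each year.
ranked['rank'] = ranked.groupby('year').cumcount() + 1
print(ranked)
```

Because cumcount returns integers, the rank column needs no cast, but tied values receive distinct consecutive ranks rather than a shared one.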
