Ranking order per group in Pandas
Source: http://stackoverflow.com/questions/33899369/
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Asked by Amelio Vazquez-Reina
Consider a dataframe with three columns: group_ID, item_ID and value. Say we have 10 item_IDs in total.
I need to rank each item_ID (1 to 10) within each group_ID based on value, and then see the mean rank (and other stats) across groups (e.g. the IDs with the highest value across groups would get ranks closer to 1). How can I do this in Pandas?
This answer does something very close with qcut, but not exactly the same.
A data example would look like:
     group_ID item_ID  value
0  0S00A1HZEy      AB     10
1  0S00A1HZEy      AY      4
2  0S00A1HZEy      AC     35
3  0S03jpFRaC      AY     90
4  0S03jpFRaC      A5      3
5  0S03jpFRaC      A3     10
6  0S03jpFRaC      A2      8
7  0S03jpFRaC      A4      9
8  0S03jpFRaC      A6      2
9  0S03jpFRaC      AX      0
which would result in:
     group_ID item_ID  rank
0  0S00A1HZEy      AB     2
1  0S00A1HZEy      AY     3
2  0S00A1HZEy      AC     1
3  0S03jpFRaC      AY     1
4  0S03jpFRaC      A5     5
5  0S03jpFRaC      A3     2
6  0S03jpFRaC      A2     4
7  0S03jpFRaC      A4     3
8  0S03jpFRaC      A6     6
9  0S03jpFRaC      AX     7
Accepted answer by DSM
There are lots of different arguments you can pass to rank; it looks like you can use rank("dense", ascending=False) to get the results you want, after doing a groupby:
>>> df["rank"] = df.groupby("group_ID")["value"].rank("dense", ascending=False)
>>> df
     group_ID item_ID  value  rank
0  0S00A1HZEy      AB     10     2
1  0S00A1HZEy      AY      4     3
2  0S00A1HZEy      AC     35     1
3  0S03jpFRaC      AY     90     1
4  0S03jpFRaC      A5      3     5
5  0S03jpFRaC      A3     10     2
6  0S03jpFRaC      A2      8     4
7  0S03jpFRaC      A4      9     3
8  0S03jpFRaC      A6      2     6
9  0S03jpFRaC      AX      0     7
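For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question, applies the same dense ranking per group, and then aggregates the ranks per item_ID across groups, which is what the question asks for. The name item_stats and the choice of statistics are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "group_ID": ["0S00A1HZEy"] * 3 + ["0S03jpFRaC"] * 7,
    "item_ID": ["AB", "AY", "AC", "AY", "A5", "A3", "A2", "A4", "A6", "AX"],
    "value": [10, 4, 35, 90, 3, 10, 8, 9, 2, 0],
})

# Dense rank within each group; the largest value in a group gets rank 1.
# rank() returns floats, so cast to int (safe here: no missing values).
df["rank"] = (
    df.groupby("group_ID")["value"]
      .rank(method="dense", ascending=False)
      .astype(int)
)

# Rank statistics per item across groups.
# item_stats is a hypothetical name chosen for this sketch.
item_stats = df.groupby("item_ID")["rank"].agg(["mean", "min", "max", "count"])
print(item_stats.sort_values("mean"))
```

In this sample only AY appears in both groups, so it is the only item whose statistics combine more than one rank.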
But note that if you're not using a global ranking scheme, finding out the mean rank across groups isn't very meaningful: unless there are duplicate values in a group (and so you have duplicate rank values), all you're doing is measuring how many elements there are in a group.
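A small sketch of that last point, using the question's sample values: without ties, the ranks in a group of n rows are exactly 1..n, so their mean is (n + 1) / 2 regardless of the values. The names summary and expected are introduced here only for illustration.

```python
import pandas as pd

# Same values as the question's sample; item_ID is not needed for this check.
df = pd.DataFrame({
    "group_ID": ["0S00A1HZEy"] * 3 + ["0S03jpFRaC"] * 7,
    "value": [10, 4, 35, 90, 3, 10, 8, 9, 2, 0],
})

df["rank"] = df.groupby("group_ID")["value"].rank(method="dense", ascending=False)

# Mean rank per group next to (n + 1) / 2, where n is the group size.
summary = df.groupby("group_ID")["rank"].agg(["count", "mean"])
summary["expected"] = (summary["count"] + 1) / 2
print(summary)
```

The two columns agree for both groups (sizes 3 and 7), so the per-group mean rank here carries no information beyond the group size.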