Original question: http://stackoverflow.com/questions/16684346/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas group by operations on a data frame
Asked by Anirudh Jayakumar
I have a pandas data frame like the one below.
UsrId JobNos
1 4
1 56
2 23
2 55
2 41
2 5
3 78
1 25
3 1
I group the data frame by UsrId. Conceptually, the grouped data frame looks like this.
UsrId JobNos
1 [4,56,25]
2 [23,55,41,5]
3 [78,1]
Now, I'm looking for a built-in API that will give me the UsrId with the maximum job count. For the above example, UsrId 2 has the maximum count.
UPDATE: Instead of the single UsrId with the maximum job count, I want the 'n' UsrIds with the highest job counts. For the above example, if n=2 then the output is [2,1]. Can this be done?
Answered by root
Something like df.groupby('UsrId').JobNos.sum().idxmax() should do it:
In [1]: import pandas as pd
In [2]: from io import StringIO
In [3]: data = """UsrId JobNos
...: 1 4
...: 1 56
...: 2 23
...: 2 55
...: 2 41
...: 2 5
...: 3 78
...: 1 25
...: 3 1"""
In [4]: df = pd.read_csv(StringIO(data), sep=r'\s+')
In [5]: grouped = df.groupby('UsrId')
In [6]: grouped.JobNos.sum()
Out[6]:
UsrId
1 85
2 124
3 79
Name: JobNos
In [7]: grouped.JobNos.sum().idxmax()
Out[7]: 2
If you want your results based on the number of items in each group rather than the sum of JobNos:
In [8]: grouped.size()
Out[8]:
UsrId
1 3
2 4
3 2
In [9]: grouped.size().idxmax()
Out[9]: 2
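To tie the counts back to the conceptual view in the question, here is a minimal sketch, assuming a recent pandas on Python 3. The frame is rebuilt inline so the snippet runs on its own; the variable names `jobs_per_user`, `counts` and `top_user` are illustrative only. It collects each group's JobNos into a list and takes the UsrId with the most rows:

```python
import pandas as pd

# Same data as in the question, built directly instead of parsed from text.
df = pd.DataFrame({
    "UsrId":  [1, 1, 2, 2, 2, 2, 3, 1, 3],
    "JobNos": [4, 56, 23, 55, 41, 5, 78, 25, 1],
})

grouped = df.groupby("UsrId")

# One list of JobNos per UsrId, keeping the original row order within a group.
jobs_per_user = grouped["JobNos"].apply(list)

# Number of rows (jobs) per UsrId.
counts = grouped.size()

# Index label (UsrId) of the largest count.
top_user = counts.idxmax()

print(jobs_per_user.to_dict())
print(top_user)
```

Here `size()` counts rows per group, so it answers "who has the most jobs", whereas `sum()` adds up the JobNos values themselves.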
Update: To get ordered results you can use the .sort_values method (the .order method from older pandas versions has since been removed):
In [10]: grouped.JobNos.sum().sort_values(ascending=False)
Out[10]:
UsrId
2 124
1 85
3 79
Name: JobNos
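For the 'n' UsrIds with the highest job counts asked about in the question's update, one possible sketch applies `Series.nlargest` to the group sizes instead of sorting the whole result. It assumes a recent pandas on Python 3; `top_n_users` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd


def top_n_users(df, n):
    """Return the n UsrId values with the most rows, largest count first."""
    counts = df.groupby("UsrId").size()
    # nlargest keeps the n biggest counts in descending order;
    # the index of the result holds the matching UsrId values.
    return counts.nlargest(n).index.tolist()


# Same data as in the question.
df = pd.DataFrame({
    "UsrId":  [1, 1, 2, 2, 2, 2, 3, 1, 3],
    "JobNos": [4, 56, 23, 55, 41, 5, 78, 25, 1],
})

result = top_n_users(df, 2)
print(result)  # [2, 1]
```

If the ranking should instead follow the summed JobNos, as in the ordered output above, replacing `.size()` with `["JobNos"].sum()` inside the helper would be the corresponding change.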

