Original question: http://stackoverflow.com/questions/25536032/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Pandas group by and sum two columns
Asked by acpigeon
Beginner question. This seems like it should be a straightforward operation, but I can't figure it out from reading the docs.
I have a df with this structure:
|integer_id|int_field_1|int_field_2|
The integer_id column is non-unique, so I'd like to group the df by integer_id and sum the two fields.
The equivalent SQL is:
SELECT integer_id, SUM(int_field_1), SUM(int_field_2) FROM tbl
GROUP BY integer_id
Any suggestions on the simplest way to do this?
EDIT: Including input/output
Input:
integer_id int_field_1 int_field_2
2656 36 36
2656 36 36
9702 2 2
9702 1 1
Output using df.groupby('integer_id').sum():
integer_id int_field_1 int_field_2
2656 72 72
9702 3 3
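For readers who want to reproduce the input and output shown above, here is a minimal sketch that rebuilds the sample rows and applies the same groupby/sum. Passing as_index=False is an optional addition, not part of the original question; it keeps integer_id as an ordinary column, which is closer to the shape of the SQL result set.

```python
import pandas as pd

# Sample rows copied from the question's input
df = pd.DataFrame({
    "integer_id": [2656, 2656, 9702, 9702],
    "int_field_1": [36, 36, 2, 1],
    "int_field_2": [36, 36, 2, 1],
})

# Group on integer_id and sum every remaining column.
# as_index=False keeps integer_id as a regular column instead of the index.
totals = df.groupby("integer_id", as_index=False).sum()
print(totals)
```

Without as_index=False the sums are the same, but integer_id becomes the index of the result.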
Answered by EdChum
Answered by Bastin Robin
You can do it like this:
data.groupby(by=['account_ID'])['purchases'].sum()
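The snippet above uses account_ID and purchases, which are not the column names from the question, and it selects a single column, so it returns a Series. A possible adaptation to the question's data is sketched below; the extra "note" column is hypothetical and is only there to show that unselected columns are left out of the sum.

```python
import pandas as pd

df = pd.DataFrame({
    "integer_id": [2656, 2656, 9702, 9702],
    "int_field_1": [36, 36, 2, 1],
    "int_field_2": [36, 36, 2, 1],
    "note": ["a", "b", "c", "d"],  # hypothetical column that should not be summed
})

# A list of column names selects several columns and yields a DataFrame.
summed = df.groupby("integer_id")[["int_field_1", "int_field_2"]].sum()

# A single column name yields a Series instead.
single = df.groupby("integer_id")["int_field_1"].sum()

print(summed)
print(single)
```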
Answered by xxyjoel
A variation using the .agg() function. It (1) keeps the result as a DataFrame, (2) lets you apply a different aggregation (mean, count, sum, etc.) to each column, and (3) supports grouping by multiple columns while staying readable.
df.groupby(['att1', 'att2']).agg({'att1': 'count', 'att3': 'sum', 'att4': 'mean'})
Using your values:
df.groupby(['integer_id']).agg({'int_field_1': 'sum', 'int_field_2': 'sum'})
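As a rough illustration of the per-column flexibility described in this answer, the sketch below runs .agg() on the sample rows from the question. Using mean for int_field_2 is purely an assumption chosen to show mixed aggregations; the question itself asks for sums on both fields.

```python
import pandas as pd

# Sample rows copied from the question's input
df = pd.DataFrame({
    "integer_id": [2656, 2656, 9702, 9702],
    "int_field_1": [36, 36, 2, 1],
    "int_field_2": [36, 36, 2, 1],
})

# One aggregation per column; reset_index() turns integer_id back into a column.
result = (
    df.groupby("integer_id")
      .agg({"int_field_1": "sum", "int_field_2": "mean"})
      .reset_index()
)
print(result)
```

Swapping "mean" back to "sum" gives the same totals as the plain groupby/sum approach.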

