Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/25536032/

Date: 2020-09-13 22:24:10  Source: igfitidea

Pandas group by and sum two columns

python pandas

Asked by acpigeon

Beginner question. This seems like it should be a straightforward operation, but I can't figure it out from reading the docs.

I have a df with this structure:

|integer_id|int_field_1|int_field_2|

The integer_id column is non-unique, so I'd like to group the df by integer_id and sum the two fields.

The equivalent SQL is:

SELECT integer_id, SUM(int_field_1), SUM(int_field_2) FROM tbl
GROUP BY integer_id

Any suggestions on the simplest way to do this?

EDIT: Including input/output

Input:  
integer_id  int_field_1 int_field_2   
2656        36          36  
2656        36          36  
9702        2           2  
9702        1           1  

Output using df.groupby('integer_id').sum():

integer_id  int_field_1 int_field_2  
2656        72          72  
9702        3           3  

Answered by EdChum

You just need to call sum on a groupby object:

df.groupby('integer_id').sum()

See the docs for further examples.
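
As a minimal, self-contained sketch of this approach, the snippet below rebuilds the sample input from the question and sums both fields per integer_id. The variable names (df, result, flat) are only illustrative, and passing as_index=False is an optional choice, assumed here to be useful because it keeps integer_id as a regular column, closer to the shape of the SQL result.

```python
import pandas as pd

# Sample data copied from the question's input table.
df = pd.DataFrame({
    "integer_id": [2656, 2656, 9702, 9702],
    "int_field_1": [36, 36, 2, 1],
    "int_field_2": [36, 36, 2, 1],
})

# Group rows by integer_id and sum every remaining column.
# integer_id becomes the index of the result.
result = df.groupby("integer_id").sum()
print(result)

# Optional: keep integer_id as an ordinary column instead of the index.
flat = df.groupby("integer_id", as_index=False).sum()
print(flat)
```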

Answered by Bastin Robin

You can do it by selecting the column to sum after grouping:

data.groupby(by=['account_ID'])['purchases'].sum()
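
The line above uses its own column names (account_ID, purchases) rather than the ones in the question. A rough sketch of the same idea applied to the question's sample data might look like the following; the variable names one and both are illustrative. Selecting a single column with a string gives a Series, while selecting a list of columns gives a DataFrame.

```python
import pandas as pd

# Sample data copied from the question's input table.
df = pd.DataFrame({
    "integer_id": [2656, 2656, 9702, 9702],
    "int_field_1": [36, 36, 2, 1],
    "int_field_2": [36, 36, 2, 1],
})

# Selecting one column by name returns a Series indexed by integer_id.
one = df.groupby(by=["integer_id"])["int_field_1"].sum()

# Selecting a list of columns returns a DataFrame with both sums.
both = df.groupby(by=["integer_id"])[["int_field_1", "int_field_2"]].sum()

print(one)
print(both)
```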

Answered by xxyjoel

A variation using the .agg() function, which lets you (1) keep the result as a DataFrame, (2) apply averages, counts, sums, etc., and (3) group by multiple columns while maintaining legibility.

df.groupby(['att1', 'att2']).agg({'att1': 'count', 'att3': 'sum', 'att4': 'mean'})

Using your values:

df.groupby(['integer_id']).agg({'int_field_1': 'sum', 'int_field_2': 'sum'})
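
As a hedged illustration of how .agg() can mix several aggregations, here is a small sketch on the question's sample data using pandas named aggregation (available in pandas 0.25 and later). The output column names total_field_1, total_field_2 and row_count are hypothetical, chosen only for this example, and reset_index() is used on the assumption that a flat table like the SQL result is wanted.

```python
import pandas as pd

# Sample data copied from the question's input table.
df = pd.DataFrame({
    "integer_id": [2656, 2656, 9702, 9702],
    "int_field_1": [36, 36, 2, 1],
    "int_field_2": [36, 36, 2, 1],
})

# Named aggregation: output_name=(source_column, aggregation).
# The output names below are illustrative, not from the original answer.
summary = (
    df.groupby("integer_id")
    .agg(
        total_field_1=("int_field_1", "sum"),
        total_field_2=("int_field_2", "sum"),
        row_count=("int_field_1", "count"),
    )
    .reset_index()  # turn integer_id back into a regular column
)

print(summary)
```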