Python Pandas Groupby and Sum Only One Column

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/38985053/

Time: 2020-08-19 21:44:08  Source: igfitidea

python pandas

Asked by JSolomonCulp

So I have a dataframe, df1, that looks like the following:

       A      B      C
1     foo    12    California
2     foo    22    California
3     bar    8     Rhode Island
4     bar    32    Rhode Island
5     baz    15    Ohio
6     baz    26    Ohio
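
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame shown above. The 1-based index is read off the printout, and the alias df is an assumption added only because the later snippets refer to the frame as df rather than df1.

```python
import pandas as pd

# Rebuild the example frame; the 1-based index mirrors the printout above.
df1 = pd.DataFrame(
    {
        "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
        "B": [12, 22, 8, 32, 15, 26],
        "C": ["California", "California", "Rhode Island",
              "Rhode Island", "Ohio", "Ohio"],
    },
    index=range(1, 7),
)

# Assumed alias: the snippets in this thread call the frame df.
df = df1
print(df1)
```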

I want to group by column A and then sum column B while keeping the value in column C. Something like this:

      A       B      C
1    foo     34    California
2    bar     40    Rhode Island
3    baz     41    Ohio

The issue is that when I call df.groupby('A').sum(), column C gets dropped, returning:

      B
A
bar  40
baz  41
foo  34
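
A small reproduction of that result, as a sketch. One assumption to flag: numeric_only=True is passed explicitly here, because as far as I know the default handling of non-numeric columns in a groupby sum differs between pandas versions (newer releases may concatenate the strings in C instead of dropping the column).

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
        "B": [12, 22, 8, 32, 15, 26],
        "C": ["California", "California", "Rhode Island",
              "Rhode Island", "Ohio", "Ohio"],
    }
)

# Sum only the numeric columns per group; C is a string column, so it is left out.
# numeric_only=True is spelled out so the result does not depend on the version default.
summed = df.groupby("A").sum(numeric_only=True)
print(summed)
```

The group keys come back sorted (bar, baz, foo) and become the index, which is why A appears on its own line in the printout.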

How can I get around this and keep column C when I group and sum?

Answered by Sevyns

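One way to do this is to include C in your groupby (the groupby function can accept a list of columns). This works here because C has a single value within each A group, so grouping by both columns produces the same groups as grouping by A alone.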

Give this a try:

df.groupby(['A','C'])['B'].sum()

One other thing to note: the call above returns a Series indexed by A and C. If you need to keep working with the result as a DataFrame after the aggregation, you can pass as_index=False so that A and C come back as regular columns. This one gave me problems when I was first working with Pandas. Example:

df.groupby(['A','C'], as_index=False)['B'].sum()
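
To see the difference between the two calls, here is a minimal sketch run against the example data. The sort=False argument and the final column reordering are my additions, not part of the answer; they are only there to match the row and column order the question asked for.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
        "B": [12, 22, 8, 32, 15, 26],
        "C": ["California", "California", "Rhode Island",
              "Rhode Island", "Ohio", "Ohio"],
    }
)

# Default: a Series whose index is the (A, C) pair.
as_series = df.groupby(["A", "C"])["B"].sum()

# as_index=False: a DataFrame with A, C and B as ordinary columns.
# sort=False (an addition here) keeps groups in order of first appearance.
as_frame = df.groupby(["A", "C"], as_index=False, sort=False)["B"].sum()

# Put the columns back in the original A, B, C order.
as_frame = as_frame[["A", "B", "C"]]
print(as_frame)
```

With the sample data this gives one row per group, foo, bar, baz, with sums 34, 40 and 41 and the matching state in C.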

Answered by Kartik

If you don't care which value of column C you get and just want the nth value in each group, you could do this:

n = 0  # position within each group of the C value to keep (0 = first row)
df.groupby('A').agg({'B' : 'sum',
                     'C' : lambda x: x.iloc[n]})
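
Here is a runnable sketch of that approach on the example data. The value n = 0 is a hypothetical choice for illustration, and the second variant using the built-in 'first' aggregation is my addition rather than part of the answer. Note that 'first' takes the first non-null value in each group, so it can differ from x.iloc[0] when C contains missing values.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
        "B": [12, 22, 8, 32, 15, 26],
        "C": ["California", "California", "Rhode Island",
              "Rhode Island", "Ohio", "Ohio"],
    }
)

# Hypothetical choice: keep the first row's C value in each group.
# n must be smaller than the smallest group size, or iloc raises IndexError.
n = 0

# Sum B and pick the nth C value per group; A becomes the index.
by_position = df.groupby("A").agg({"B": "sum", "C": lambda x: x.iloc[n]})

# Variant (an addition): 'first' as a named aggregation, with A kept as a column.
by_first = df.groupby("A", as_index=False).agg({"B": "sum", "C": "first"})

print(by_position)
print(by_first)
```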