Python Pandas: Groupby and Sum Only One Column

Question

Asked by JSolomonCulp

So I have a dataframe, df1, that looks like the following:

       A      B      C
1     foo    12    California
2     foo    22    California
3     bar    8     Rhode Island
4     bar    32    Rhode Island
5     baz    15    Ohio
6     baz    26    Ohio

I want to group by column A and then sum column B while keeping the value in column C. Something like this:

      A       B      C
1    foo     34    California
2    bar     40    Rhode Island
3    baz     41    Ohio

The issue is that when I call df.groupby('A').sum(), column C gets removed, returning:

      B
A
bar  40
baz  41
foo  34

How can I get around this and keep column C when I group and sum?

Answer 1

Answered by Sevyns

The simplest way to do this is to include C in your groupby (the groupby function accepts a list of column names).

Give this a try:

df.groupby(['A','C'])['B'].sum()

One other thing to note: if you need to keep working with the result as a DataFrame after the aggregation, you can pass as_index=False so that A and C come back as regular columns instead of the index. This one gave me problems when I was first working with Pandas. Example:

df.groupby(['A','C'], as_index=False)['B'].sum()
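
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the frame from the question is rebuilt by hand under the name df, and the names summed_series and summed_df are just illustrative:

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the post).
df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
    "B": [12, 22, 8, 32, 15, 26],
    "C": ["California", "California", "Rhode Island",
          "Rhode Island", "Ohio", "Ohio"],
})

# Without as_index=False the result is a Series indexed by (A, C).
summed_series = df.groupby(["A", "C"])["B"].sum()

# With as_index=False the group keys stay as ordinary columns.
summed_df = df.groupby(["A", "C"], as_index=False)["B"].sum()

# Reorder the columns to match the layout asked for in the question.
summed_df = summed_df[["A", "B", "C"]]
print(summed_df)
```

Note that groupby sorts the group keys by default, so the rows come back as bar, baz, foo. Passing sort=False to groupby should keep the order in which the groups first appear.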

Answer 2

Answered by Kartik

If you don't care which value ends up in column C and just want the nth value from each group, you could do this (n is an integer position that you define beforehand):

df.groupby('A').agg({'B' : 'sum',
                     'C' : lambda x: x.iloc[n]})
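
A runnable sketch of the same idea, assuming the question's frame is rebuilt as df and n is set to 0 (the first row of each group). The name result and the reset_index() call are additions for illustration only:

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the post).
df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar", "baz", "baz"],
    "B": [12, 22, 8, 32, 15, 26],
    "C": ["California", "California", "Rhode Island",
          "Rhode Island", "Ohio", "Ohio"],
})

n = 0  # assumed position within each group; 0 picks the first row

# Sum B per group and keep the value of C found at position n.
result = df.groupby("A").agg({"B": "sum",
                              "C": lambda x: x.iloc[n]})

# Turn the group key A back into a regular column.
result = result.reset_index()
print(result)
```

Keep in mind that x.iloc[n] raises an IndexError for any group with fewer than n + 1 rows. For the n = 0 case, the built-in 'first' aggregation is a possible alternative to the lambda, though it takes the first non-null value rather than strictly the first row.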
