Python pandas: group by and sum

Question

Asked by Trying_hard

I am using this data frame:

Fruit   Date      Name  Number
Apples  10/6/2016 Bob    7
Apples  10/6/2016 Bob    8
Apples  10/6/2016 Mike   9
Apples  10/7/2016 Steve 10
Apples  10/7/2016 Bob    1
Oranges 10/7/2016 Bob    2
Oranges 10/6/2016 Tom   15
Oranges 10/6/2016 Mike  57
Oranges 10/6/2016 Bob   65
Oranges 10/7/2016 Tony   1
Grapes  10/7/2016 Bob    1
Grapes  10/7/2016 Tom   87
Grapes  10/7/2016 Bob   22
Grapes  10/7/2016 Bob   12
Grapes  10/7/2016 Tony  15

I want to aggregate this by name and then by fruit to get a total number of fruit per name.

Bob,Apples,16 ( for example )

I tried grouping by Name and Fruit, but how do I get the total number of fruit?

Answer 1

Answered by Steven G

Use GroupBy.sum:

df.groupby(['Fruit','Name']).sum()

Out[31]: 
               Number
Fruit   Name         
Apples  Bob        16
        Mike        9
        Steve      10
Grapes  Bob        35
        Tom        87
        Tony       15
Oranges Bob        67
        Mike       57
        Tom        15
        Tony        1
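
To reproduce the result above, here is a minimal, self-contained sketch that rebuilds the question's data frame by hand. One assumption beyond the original answer: Number is selected before summing, because recent pandas versions would otherwise also try to aggregate the string Date column.

```python
import pandas as pd

# Rows copied from the question's table.
rows = [
    ("Apples", "10/6/2016", "Bob", 7),
    ("Apples", "10/6/2016", "Bob", 8),
    ("Apples", "10/6/2016", "Mike", 9),
    ("Apples", "10/7/2016", "Steve", 10),
    ("Apples", "10/7/2016", "Bob", 1),
    ("Oranges", "10/7/2016", "Bob", 2),
    ("Oranges", "10/6/2016", "Tom", 15),
    ("Oranges", "10/6/2016", "Mike", 57),
    ("Oranges", "10/6/2016", "Bob", 65),
    ("Oranges", "10/7/2016", "Tony", 1),
    ("Grapes", "10/7/2016", "Bob", 1),
    ("Grapes", "10/7/2016", "Tom", 87),
    ("Grapes", "10/7/2016", "Bob", 22),
    ("Grapes", "10/7/2016", "Bob", 12),
    ("Grapes", "10/7/2016", "Tony", 15),
]
df = pd.DataFrame(rows, columns=["Fruit", "Date", "Name", "Number"])

# Double brackets keep the result a DataFrame with a single Number column;
# the string Date column is left out of the aggregation.
totals = df.groupby(["Fruit", "Name"])[["Number"]].sum()
print(totals)

# Look up a single total by its (Fruit, Name) index pair.
print(totals.loc[("Apples", "Bob"), "Number"])
```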

Answer 2

Answered by Saurabh

You can also use the agg function:

df.groupby(['Name', 'Fruit'])['Number'].agg('sum')

Answer 3

Answered by Gazala Muhamed

If you want to keep the original columns Fruit and Name, use reset_index(). Otherwise Fruit and Name will become part of the index.

df.groupby(['Fruit','Name'])['Number'].sum().reset_index()

Fruit   Name       Number
Apples  Bob        16
Apples  Mike        9
Apples  Steve      10
Grapes  Bob        35
Grapes  Tom        87
Grapes  Tony       15
Oranges Bob        67
Oranges Mike       57
Oranges Tom        15
Oranges Tony        1

As seen in the other answers, without reset_index() the result is indexed by Fruit and Name:

df.groupby(['Fruit','Name'])['Number'].sum()

Fruit    Name 
Apples   Bob      16
         Mike      9
         Steve    10
Grapes   Bob      35
         Tom      87
         Tony     15
Oranges  Bob      67
         Mike     57
         Tom      15
         Tony      1
Name: Number, dtype: int64
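
An alternative to calling reset_index() afterwards is to pass as_index=False to groupby, which keeps the grouping keys as ordinary columns. The sketch below is not part of the original answer; it uses a hypothetical subset of the question's rows to stay short and compares both spellings.

```python
import pandas as pd

# A small subset of the question's rows (assumption: enough to illustrate).
df = pd.DataFrame(
    {
        "Fruit": ["Apples", "Apples", "Apples", "Oranges", "Oranges"],
        "Name": ["Bob", "Bob", "Mike", "Bob", "Bob"],
        "Number": [7, 8, 9, 2, 65],
    }
)

# Group, sum, then move the index levels back into columns.
via_reset = df.groupby(["Fruit", "Name"])["Number"].sum().reset_index()

# Same shape of result without an intermediate MultiIndex.
flat = df.groupby(["Fruit", "Name"], as_index=False)["Number"].sum()

print(flat)
```

Either form gives a flat table with Fruit, Name, and Number columns, which is usually easier to merge or export than a MultiIndex result.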

Answer 4

Answered by Demetri Pananos

Both the other answers accomplish what you want.

You can use the pivot functionality to arrange the data in a nice table:

df.groupby(['Fruit','Name'], as_index=False)['Number'].sum().pivot(index='Fruit', columns='Name', values='Number').fillna(0)



Name    Bob     Mike    Steve   Tom    Tony
Fruit                   
Apples  16.0    9.0     10.0    0.0     0.0
Grapes  35.0    0.0     0.0     87.0    15.0
Oranges 67.0    57.0    0.0     15.0    1.0
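
The group, sum, and pivot steps can also be collapsed into one pivot_table call. This is a sketch of an alternative, not the answer's own method; it assumes the question's data rebuilt by hand and uses fill_value=0 in place of fillna(0).

```python
import pandas as pd

# Rows copied from the question's table (Date omitted; it is not needed here).
rows = [
    ("Apples", "Bob", 7), ("Apples", "Bob", 8), ("Apples", "Mike", 9),
    ("Apples", "Steve", 10), ("Apples", "Bob", 1), ("Oranges", "Bob", 2),
    ("Oranges", "Tom", 15), ("Oranges", "Mike", 57), ("Oranges", "Bob", 65),
    ("Oranges", "Tony", 1), ("Grapes", "Bob", 1), ("Grapes", "Tom", 87),
    ("Grapes", "Bob", 22), ("Grapes", "Bob", 12), ("Grapes", "Tony", 15),
]
df = pd.DataFrame(rows, columns=["Fruit", "Name", "Number"])

# One row per fruit, one column per name, cells hold the summed Number.
# Missing (Fruit, Name) combinations are filled with 0.
table = df.pivot_table(
    index="Fruit",
    columns="Name",
    values="Number",
    aggfunc="sum",
    fill_value=0,
)
print(table)
```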

Answer 5

Answered by jared

df.groupby(['Fruit','Name'])['Number'].sum()

The column selected after groupby (here Number) is the one that gets summed; select a different column to sum other values.

Answer 6

Answered by YOBEN_S

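You can set the groupby columns as the index and then sum by index level. Older pandas versions accepted sum(level=[0,1]) directly; that argument was removed in pandas 2.0, so the equivalent is to group on the index levels (sort=False keeps the order of first appearance):

df.set_index(['Fruit','Name'])[['Number']].groupby(level=[0,1], sort=False).sum()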
Out[175]: 
               Number
Fruit   Name         
Apples  Bob        16
        Mike        9
        Steve      10
Oranges Bob        67
        Tom        15
        Mike       57
        Tony        1
Grapes  Bob        35
        Tom        87
        Tony       15

Answer 7

Answered by xxyjoel

A variation on the .agg() function. Passing a dictionary (1) keeps the result a DataFrame, (2) lets you apply means, counts, sums, etc. per column, and (3) lets you group on multiple columns while staying readable.

df.groupby(['att1', 'att2']).agg({'att1': "count", 'att3': "sum",'att4': 'mean'})

Using your values:

df.groupby(['Name', 'Fruit']).agg({'Number': "sum"})
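
When several statistics are needed for the same column, named aggregation is a related spelling that also controls the output column names. The sketch below is an illustration, not the answer's own code: it assumes a small subset of the question's rows, and the output names total, rows, and average are made up for the example.

```python
import pandas as pd

# A small subset of the question's rows (assumption: enough to illustrate).
df = pd.DataFrame(
    {
        "Fruit": ["Apples", "Apples", "Apples", "Oranges", "Oranges"],
        "Name": ["Bob", "Bob", "Mike", "Bob", "Bob"],
        "Number": [7, 8, 9, 2, 65],
    }
)

# Each keyword is an output column: (source column, aggregation function).
summary = df.groupby(["Name", "Fruit"]).agg(
    total=("Number", "sum"),
    rows=("Number", "count"),
    average=("Number", "mean"),
)
print(summary)
```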
