Python: group a dataframe and get both the sum and the count?

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/38174155/

Time: 2020-08-19 20:27:18  Source: igfitidea

Group dataframe and get sum AND count?

Tags: python, pandas, dataframe, group-by, pandas-groupby

Asked by Richard

I have a dataframe that looks like this:

              Company Name              Organisation Name  Amount
10118  Vifor Pharma UK Ltd  Welsh Assoc for Gastro & Endo 2700.00
10119  Vifor Pharma UK Ltd    Welsh IBD Specialist Group,  169.00
10120  Vifor Pharma UK Ltd             West Midlands AHSN 1200.00
10121  Vifor Pharma UK Ltd           Whittington Hospital   63.00
10122  Vifor Pharma UK Ltd                 Ysbyty Gwynedd   75.93

How do I sum the Amount and count the Organisation Name, to get a new dataframe that looks like this?

              Company Name             Organisation Count   Amount
10118  Vifor Pharma UK Ltd                              5 11000.00

I know how to sum or count:

df.groupby('Company Name').sum()
df.groupby('Company Name').count()

But not how to do both!

Answered by MaxU

Try this:

In [110]: (df.groupby('Company Name')
   .....:    .agg({'Organisation Name':'count', 'Amount': 'sum'})
   .....:    .reset_index()
   .....:    .rename(columns={'Organisation Name':'Organisation Count'})
   .....: )
Out[110]:
          Company Name   Amount  Organisation Count
0  Vifor Pharma UK Ltd  4207.93                   5
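
For anyone who wants to reproduce this outside IPython, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the five rows shown in the question; the default 0..4 index is an assumption (the original index labels do not matter for grouping).

```python
import pandas as pd

# Sample data copied from the five rows shown in the question.
# Assumption: a default integer index instead of the original 10118..10122.
df = pd.DataFrame({
    'Company Name': ['Vifor Pharma UK Ltd'] * 5,
    'Organisation Name': [
        'Welsh Assoc for Gastro & Endo',
        'Welsh IBD Specialist Group,',
        'West Midlands AHSN',
        'Whittington Hospital',
        'Ysbyty Gwynedd',
    ],
    'Amount': [2700.00, 169.00, 1200.00, 63.00, 75.93],
})

# One aggregation per column: count the organisations, sum the amounts.
result = (
    df.groupby('Company Name')
      .agg({'Organisation Name': 'count', 'Amount': 'sum'})
      .reset_index()
      .rename(columns={'Organisation Name': 'Organisation Count'})
)
print(result)
```

Note that 'count' only counts non-missing values in the column; if rows with a missing Organisation Name should be counted too, 'size' may be the better fit.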

Or, if you don't want to reset the index:

df.groupby('Company Name')['Amount'].agg(['sum','count'])

or

df.groupby('Company Name').agg({'Amount': ['sum','count']})

Demo:

In [98]: df.groupby('Company Name')['Amount'].agg(['sum','count'])
Out[98]:
                         sum  count
Company Name
Vifor Pharma UK Ltd  4207.93      5

In [99]: df.groupby('Company Name').agg({'Amount': ['sum','count']})
Out[99]:
                      Amount
                         sum count
Company Name
Vifor Pharma UK Ltd  4207.93     5
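
The second form in the demo returns two-level (MultiIndex) columns. If flat column names are preferred, one possible approach is to join the levels, sketched below. The company names and amounts here are made up for illustration, and the underscore separator is just one choice.

```python
import pandas as pd

# Hypothetical sample data, not from the question.
df = pd.DataFrame({
    'Company Name': ['A Ltd', 'A Ltd', 'B Ltd'],
    'Amount': [10.0, 5.0, 7.5],
})

res = df.groupby('Company Name').agg({'Amount': ['sum', 'count']})

# The columns are tuples such as ('Amount', 'sum'); join each tuple
# into a single flat name like 'Amount_sum'.
res.columns = ['_'.join(col) for col in res.columns]
print(res)
```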

Answered by cs95

Just in case you were wondering how to rename columns during aggregation, pandas >= 0.25 supports named aggregation:

df.groupby('Company Name')['Amount'].agg(MySum='sum', MyCount='count')

Or,

df.groupby('Company Name').agg(MySum=('Amount', 'sum'), MyCount=('Amount', 'count'))

                       MySum  MyCount
Company Name                       
Vifor Pharma UK Ltd  4207.93        5
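
Keyword arguments must be valid Python identifiers, so a name with a space such as 'Organisation Count' cannot be written directly as a keyword. A minimal sketch of one way around this is to unpack a dict of (column, aggregation) tuples. It assumes pandas >= 0.25, and the sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, not from the question.
df = pd.DataFrame({
    'Company Name': ['A Ltd', 'A Ltd', 'B Ltd'],
    'Organisation Name': ['Org 1', 'Org 2', 'Org 3'],
    'Amount': [10.0, 5.0, 7.5],
})

# Named aggregation: each output column maps to (input column, function).
# Unpacking a dict allows output names that contain spaces.
named = df.groupby('Company Name').agg(**{
    'Organisation Count': ('Organisation Name', 'count'),
    'Amount': ('Amount', 'sum'),
})
print(named)
```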

Answered by JSharm

If you have lots of columns and only one needs a different aggregation, you could do:

In[1]: grouper = df.groupby('Company Name')
In[2]: res = grouper.count()
In[3]: res['Amount'] = grouper.Amount.sum()
In[4]: res
Out[4]:
                      Organisation Name   Amount
Company Name                                   
Vifor Pharma UK Ltd                  5  4207.93

Note you can then rename the Organisation Name column as you wish.
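
As a rough sketch of that approach including the rename step: the extra 'Region' column and all values below are hypothetical, added only to show several columns being counted while one is summed.

```python
import pandas as pd

# Hypothetical sample data; 'Region' stands in for "lots of columns".
df = pd.DataFrame({
    'Company Name': ['A Ltd', 'A Ltd', 'B Ltd'],
    'Organisation Name': ['Org 1', 'Org 2', 'Org 3'],
    'Region': ['North', 'South', 'North'],
    'Amount': [10.0, 5.0, 7.5],
})

grouper = df.groupby('Company Name')
res = grouper.count()                    # count every non-group column
res['Amount'] = grouper['Amount'].sum()  # overwrite one column with its sum
res = res.rename(columns={'Organisation Name': 'Organisation Count'})
print(res)
```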

Answered by cvsnow

# Count organisations and sum amounts per company, then sort by both, descending.
df.groupby('Company Name').agg({'Organisation Name': 'count', 'Amount': 'sum'})\
    .sort_values(['Organisation Name', 'Amount'], ascending=False)
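
A minimal runnable sketch of aggregating and then sorting, assuming the intent is to order companies by organisation count first and total amount second, both descending. The three companies and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, not from the question.
df = pd.DataFrame({
    'Company Name': ['A Ltd', 'A Ltd', 'B Ltd', 'C Ltd', 'C Ltd'],
    'Organisation Name': ['Org 1', 'Org 2', 'Org 3', 'Org 4', 'Org 5'],
    'Amount': [10.0, 5.0, 7.5, 20.0, 1.0],
})

out = (
    df.groupby('Company Name')
      .agg({'Organisation Name': 'count', 'Amount': 'sum'})
      # After a dict aggregation the columns keep their original names,
      # so sort on those names rather than on 'count' and 'sum'.
      .sort_values(['Organisation Name', 'Amount'], ascending=False)
)
print(out)
```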