Note: this content comes from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original question: http://stackoverflow.com/questions/48909110/

Python pandas: mean and sum groupby on different columns at the same time

Tags: python, pandas

Asked by ahajib

I have a pandas dataframe which looks like the following:

Name    Missed    Credit    Grade
A       1         3         10
A       1         1         12      
B       2         3         10
B       1         2         20

And my desired output is:

Name    Sum1    Sum2    Average
A       2       4       11
B       3       5       15

Basically, I want the sum of the columns Credit and Missed and the average of Grade. What I am doing right now is two groupby operations on Name, one for the sums and one for the average, and then merging the two output dataframes, which does not seem to be the best way to do this. I also found this on SO, which makes sense if I only want to work on one column:

df.groupby('Name')['Credit'].agg(['sum', 'mean'])

But I am not sure how to write a one-liner that applies different aggregations to different columns.
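
For reference, the two-step approach described above might look like the following sketch. The variable names sums, means, and merged are illustrative and do not come from the original post; the data is the sample frame shown earlier.

```python
import pandas as pd

# Sample data matching the table in the question
df = pd.DataFrame({
    'Name': ['A', 'A', 'B', 'B'],
    'Missed': [1, 1, 2, 1],
    'Credit': [3, 1, 3, 2],
    'Grade': [10, 12, 10, 20],
})

# First groupby: sum Missed and Credit within each Name
sums = df.groupby('Name')[['Missed', 'Credit']].sum()

# Second groupby: average Grade within each Name
means = df.groupby('Name')[['Grade']].mean()

# Merge the two results on their shared Name index,
# then turn Name back into a regular column
merged = pd.merge(sums, means, left_index=True, right_index=True).reset_index()
print(merged)
```

This works, but it scans the groups twice and needs an extra merge, which is what the question is trying to avoid.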

Answered by jezrael

You need agg with a dictionary and then rename the column names:

d = {'Missed': 'Sum1', 'Credit': 'Sum2', 'Grade': 'Average'}
df = df.groupby('Name').agg({'Missed': 'sum', 'Credit': 'sum', 'Grade': 'mean'}).rename(columns=d)
print(df)
      Sum1  Sum2  Average
Name                     
A        2     4       11
B        3     5       15

If you also want Name as a regular column instead of the index, pass as_index=False:

df = (df.groupby('Name', as_index=False)
        .agg({'Missed': 'sum', 'Credit': 'sum', 'Grade': 'mean'})
        .rename(columns={'Missed': 'Sum1', 'Credit': 'Sum2', 'Grade': 'Average'}))
print(df)
  Name  Sum1  Sum2  Average
0    A     2     4       11
1    B     3     5       15
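
As an alternative sketch, assuming pandas 0.25 or newer, named aggregation lets you choose the output column names inside agg itself, so no separate rename step is needed. This is not part of the original answer, and the variable name result is illustrative.

```python
import pandas as pd

# Sample data matching the table in the question
df = pd.DataFrame({
    'Name': ['A', 'A', 'B', 'B'],
    'Missed': [1, 1, 2, 1],
    'Credit': [3, 1, 3, 2],
    'Grade': [10, 12, 10, 20],
})

# Each keyword is an output column; each tuple is (input column, aggregation)
result = (
    df.groupby('Name')
      .agg(
          Sum1=('Missed', 'sum'),
          Sum2=('Credit', 'sum'),
          Average=('Grade', 'mean'),
      )
      .reset_index()  # move Name from the index back to a column
)
print(result)
```

In recent pandas versions the mean is returned as a float, so the Average column may print as 11.0 and 15.0 rather than 11 and 15.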

Answered by ashish trehan

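import pandas as pd

A = pd.DataFrame.from_dict({'Name': ['A', 'A', 'B', 'B'], 'Missed': [1, 1, 2, 1], 'Credit': [3, 1, 3, 2], 'Grades': [10, 12, 10, 20]})

# Sum Missed and Credit and average Grades within each Name group
A.groupby('Name').agg({'Missed': 'sum', 'Credit': 'sum', 'Grades': 'mean'})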