Python pandas: mean and sum groupby on different columns at the same time
Original source: http://stackoverflow.com/questions/48909110/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Asked by ahajib
I have a pandas dataframe which looks like the following:
Name  Missed  Credit  Grade
A     1       3       10
A     1       1       12
B     2       3       10
B     1       2       20
And my desired output is:
Name  Sum1  Sum2  Average
A     2     4     11
B     3     5     15
Basically, I want the sum of the Missed and Credit columns and the average of Grade. What I am doing right now is two groupbys on Name, one for the sums and one for the average, and then merging the two output dataframes, which does not seem to be the best way of doing this. I have also found this on SO, which makes sense if I only want to work on one column:
df.groupby('Name')['Credit'].agg(['sum', 'mean'])
But I am not sure how to do this in one line when each column needs a different aggregation.
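For reference, the two-step approach described in the question could be sketched roughly as follows. The DataFrame is rebuilt from the sample table in the question, and the variable names sums, means, and result are illustrative only, not taken from the original post.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Name':   ['A', 'A', 'B', 'B'],
    'Missed': [1, 1, 2, 1],
    'Credit': [3, 1, 3, 2],
    'Grade':  [10, 12, 10, 20],
})

# First groupby: sum Missed and Credit per Name.
sums = df.groupby('Name')[['Missed', 'Credit']].sum()

# Second groupby: average Grade per Name.
means = df.groupby('Name')[['Grade']].mean()

# Merge the two results on their shared index (Name).
result = sums.merge(means, left_index=True, right_index=True)
print(result)
```

This works, but it groups the data twice and needs an extra merge step, which is what the question is trying to avoid.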
Answered by jezrael
You need agg with a dictionary that maps each column to its aggregation function, and then rename to change the column names:
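# Map the original column names to the desired output names.
d = {'Missed': 'Sum1', 'Credit': 'Sum2', 'Grade': 'Average'}
# Aggregate each column with its own function, then rename the result.
df = df.groupby('Name').agg({'Missed': 'sum', 'Credit': 'sum', 'Grade': 'mean'}).rename(columns=d)
print(df)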
      Sum1  Sum2  Average
Name
A        2     4       11
B        3     5       15
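As an alternative to the dictionary plus rename shown above, pandas 0.25 and later also support named aggregation, which sets the output column names inside agg itself. This is a different technique from the one in the answer. The sketch below assumes the sample data from the question; the variable name out is illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Name':   ['A', 'A', 'B', 'B'],
    'Missed': [1, 1, 2, 1],
    'Credit': [3, 1, 3, 2],
    'Grade':  [10, 12, 10, 20],
})

# Named aggregation: output_name=(input_column, aggregation_function).
# Requires pandas >= 0.25.
out = df.groupby('Name').agg(
    Sum1=('Missed', 'sum'),
    Sum2=('Credit', 'sum'),
    Average=('Grade', 'mean'),
)
print(out)
```

With this form no separate rename step is needed, and the same input column can be aggregated more than once under different output names.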
If you also want Name as a regular column instead of the index, pass as_index=False to groupby:
df = (df.groupby('Name', as_index=False)
        .agg({'Missed': 'sum', 'Credit': 'sum', 'Grade': 'mean'})
        .rename(columns={'Missed': 'Sum1', 'Credit': 'Sum2', 'Grade': 'Average'}))
print(df)
  Name  Sum1  Sum2  Average
0    A     2     4       11
1    B     3     5       15
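Another way to get Name back as a column, assuming the grouping was done with the default as_index=True, is to call reset_index on the aggregated result. The sketch below runs both variants on the sample data from the question; agg_spec, new_names, via_flag, and via_reset are illustrative names.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Name':   ['A', 'A', 'B', 'B'],
    'Missed': [1, 1, 2, 1],
    'Credit': [3, 1, 3, 2],
    'Grade':  [10, 12, 10, 20],
})

agg_spec = {'Missed': 'sum', 'Credit': 'sum', 'Grade': 'mean'}
new_names = {'Missed': 'Sum1', 'Credit': 'Sum2', 'Grade': 'Average'}

# Variant 1: keep Name as a column during grouping.
via_flag = (df.groupby('Name', as_index=False)
              .agg(agg_spec)
              .rename(columns=new_names))

# Variant 2: group with Name as the index, then move it back to a column.
via_reset = (df.groupby('Name')
               .agg(agg_spec)
               .rename(columns=new_names)
               .reset_index())

print(via_flag)
print(via_reset)
```

Note that both variants build a new DataFrame from the original df rather than overwriting it, so either can be run repeatedly.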
Answered by ashish trehan
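import pandas as pd
# Rebuild the sample data (note that this answer names the last column 'Grades').
A = pd.DataFrame.from_dict({'Name':['A','A','B','B'],'Missed':[1,1,2,1],'Credit':[3,1,3,2],'Grades':[10,12,10,20]})
A.groupby('Name').agg({'Missed':'sum','Credit':'sum','Grades':'mean'})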