Original question: http://stackoverflow.com/questions/49268619/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python Pandas group by multiple columns, mean of another - no group by object
Asked by dasvootz
I have some data, called 'test_df', that looks like this:
  ID  Year  Value  Value2
0  A  2012      1       4
1  A  2012      2       5
2  A  2013      4       6
3  A  2013      5       7
4  B  2014      6       8
5  B  2014      7       4
6  B  2013      8       8
I want it to look like this:
ID  Year  Value_avg  Value2_avg
A   2012        1.5         4.5
A   2013        4.5         6.5
B   2013        8.0         8.0
B   2014        6.5         6.0
However, when I try to group by multiple columns they end up as group by objects:
         Value_avg  Value2_avg
ID Year
A  2012        1.5         4.5
   2013        4.5         6.5
B  2013        8.0         8.0
   2014        6.5         6.0
Here is the code I tried:
out_df = pd.DataFrame()
out_df['Value_avg'] = test_df['Value'].groupby([test_df['ID'], test_df['Year']]).mean()
out_df['Value2_avg'] = test_df['Value2'].groupby([test_df['ID'], test_df['Year']]).mean()
I tried adding:
out_df['Value_avg'] = test_df['Value'].groupby([test_df['ID'],
test_df['Year']], as_index=False).mean()
but got this error:
"TypeError: as_index=False only valid with DataFrame"
Answered by YOBEN_S
Use add_suffix + reset_index:
test_df.groupby(['ID', 'Year']).mean().add_suffix('_avg').reset_index()
Out[337]:
  ID  Year  Value_avg  Value2_avg
0  A  2012        1.5         4.5
1  A  2013        4.5         6.5
2  B  2013        8.0         8.0
3  B  2014        6.5         6.0
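For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the add_suffix + reset_index approach. The variable names out_df and alt_df are only illustrative. The second variant is an assumption about what the asker was reaching for: as_index=False is accepted when grouping the DataFrame itself rather than a single Series, which is why the attempt in the question raised a TypeError.

```python
import pandas as pd

# Rebuild the sample data shown in the question.
test_df = pd.DataFrame({
    "ID": ["A", "A", "A", "A", "B", "B", "B"],
    "Year": [2012, 2012, 2013, 2013, 2014, 2014, 2013],
    "Value": [1, 2, 4, 5, 6, 7, 8],
    "Value2": [4, 5, 6, 7, 8, 4, 8],
})

# Approach from the answer: group, average, add the suffix to the value
# columns, then move the group keys from the index back into columns.
out_df = (
    test_df.groupby(["ID", "Year"])
    .mean()
    .add_suffix("_avg")
    .reset_index()
)

# Alternative sketch: as_index=False works on a DataFrame groupby, so the
# keys stay as columns; the value columns are then renamed explicitly.
alt_df = (
    test_df.groupby(["ID", "Year"], as_index=False)
    .mean()
    .rename(columns={"Value": "Value_avg", "Value2": "Value2_avg"})
)

print(out_df)
```

Both variants should give the same flat table with ID and Year as ordinary columns. The first is shorter when every remaining column is being averaged; the second may read more clearly when only some columns need renaming.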