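Pandas groupby().agg() - how to return results without the multi-index?
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/26323926/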
Asked by Ginger
I have a dataframe:
pe_odds[ [ 'EVENT_ID', 'SELECTION_ID', 'ODDS' ] ]
Out[67]:
    EVENT_ID  SELECTION_ID   ODDS
0  100429300       5297529  18.00
1  100429300       5297529  20.00
2  100429300       5297529  21.00
3  100429300       5297529  22.00
4  100429300       5297529  23.00
5  100429300       5297529  24.00
6  100429300       5297529  25.00
When I use groupby and agg, I get results with a multi-index:
pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ] )[ 'ODDS' ].agg( [ np.min, np.max ] )
Out[68]:
                         amin   amax
EVENT_ID  SELECTION_ID
100428417 5490293        1.71   1.71
          5881623        1.14   1.35
          5922296        2.00   2.00
          5956692        2.00   2.02
100428419 603721         2.44   2.90
          4387436        4.30   6.20
          4398859        1.23   1.35
          4574687        1.35   1.46
          4881396       14.50  19.00
          6032606        2.94   4.20
          6065580        2.70   5.80
          6065582        2.42   3.65
100428421 5911426        2.22   2.52
I have tried using as_index to return the results without the multi-index:
pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ], as_index=False )[ 'ODDS' ].agg( [ np.min, np.max ], as_index=False )
But it still gives me a multi-index.
I can use .reset_index(), but it is very slow:
pe_odds.groupby( [ 'EVENT_ID', 'SELECTION_ID' ] )[ 'ODDS' ].agg( [ np.min, np.max ] ).reset_index()
Out[69]:
    EVENT_ID  SELECTION_ID  amin  amax
0  100428417       5490293  1.71  1.71
1  100428417       5881623  1.14  1.35
2  100428417       5922296  2.00  2.00
3  100428417       5956692  2.00  2.02
4  100428419        603721  2.44  2.90
5  100428419       4387436  4.30  6.20
How can I return the results without the multi-index, using parameters of the groupby and/or agg functions, and without having to resort to reset_index()?
Answered by behzad.nouri
The call below:
>>> gr = df.groupby(['EVENT_ID', 'SELECTION_ID'], as_index=False)
>>> res = gr.agg({'ODDS':[np.min, np.max]})
>>> res
    EVENT_ID  SELECTION_ID  ODDS
                            amin  amax
0  100429300       5297529    18    25
1  100429300       5297559    30    38
returns a frame with multi-index columns. If you do not want the columns to be a multi-index either, you can flatten them:
>>> res.columns = list(map(''.join, res.columns.values))
>>> res
    EVENT_ID  SELECTION_ID  ODDSamin  ODDSamax
0  100429300       5297529        18        25
1  100429300       5297559        30        38
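For anyone who wants to try this end to end, here is a minimal self-contained sketch of the flattening step above. The sample rows are hypothetical, modelled on the output shown in this answer, and the string names "min" and "max" are used in place of np.min and np.max, so the second column level reads min/max rather than amin/amax. The last part is not from the answer: it is an alternative I am swapping in, pandas named aggregation (available in pandas 0.25 and later), which should give flat column names directly when combined with as_index=False.

```python
import pandas as pd

# Hypothetical sample data, modelled on the frames shown above.
df = pd.DataFrame({
    "EVENT_ID": [100429300, 100429300, 100429300, 100429300],
    "SELECTION_ID": [5297529, 5297529, 5297559, 5297559],
    "ODDS": [18.0, 25.0, 30.0, 38.0],
})

# Approach from the answer: as_index=False keeps the group keys as columns,
# but a list of aggregations still produces two-level column labels.
gr = df.groupby(["EVENT_ID", "SELECTION_ID"], as_index=False)
res = gr.agg({"ODDS": ["min", "max"]})

# Join each (level0, level1) tuple into one string, skipping the empty
# second level that the group-key columns carry.
res.columns = ["_".join(filter(None, col)) for col in res.columns.values]

# Alternative (not from the answer): named aggregation, where each keyword
# is the output column name and the value is (source column, function).
flat = df.groupby(["EVENT_ID", "SELECTION_ID"], as_index=False).agg(
    odds_min=("ODDS", "min"),
    odds_max=("ODDS", "max"),
)

print(res)
print(flat)
```

The join with an underscore is only a naming choice; the plain ''.join used in the answer works the same way and yields names such as ODDSmin.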

