Original question: http://stackoverflow.com/questions/44383136/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
pandas groupby where you get the max of one column and the min of another column
Asked by lhay86
I have a dataframe as follows:
user num1 num2
a 1 1
a 2 2
a 3 3
b 4 4
b 5 5
I want a dataframe which has the minimum from num1 for each user, and the maximum of num2 for each user.
The output should be like:
user num1 num2
a 1 3
b 4 5
I know that if I wanted the max of both columns I could just do:
a.groupby('user')[['num1', 'num2']].max()
Is there some equivalent without having to do something like:
series_1 = a.groupby('user')['num1'].min()
series_2 = a.groupby('user')['num2'].max()
# converting from series to df so I can do a join on user
df_1 = pd.DataFrame(np.array([series_1]).transpose(), index=series_1.index, columns=['num1'])
df_2 = pd.DataFrame(np.array([series_2]).transpose(), index=series_2.index, columns=['num2'])
df_1.join(df_2)
Answered by jezrael
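Use groupby + agg with a dict that maps each column to its aggregation function. The columns of the result then need to be put in order, either by selecting a subset or with reindex_axis (reindex(columns=...) in pandas 1.0 and later, where reindex_axis was removed). Finally, add reset_index to convert the index to a column if necessary.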
df = a.groupby('user').agg({'num1':'min', 'num2':'max'})[['num1','num2']].reset_index()
print (df)
user num1 num2
0 a 1 3
1 b 4 5
This is the same as:
# reindex(columns=...) replaces reindex_axis, which was removed in pandas 1.0
df = (a.groupby('user').agg({'num1':'min', 'num2':'max'})
       .reindex(columns=['num1','num2'])
       .reset_index())
print (df)
user num1 num2
0 a 1 3
1 b 4 5
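For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and applies the dict-based agg from the answer. It also shows named aggregation, which is not part of the original answer: it is an alternative spelling available in pandas 0.25 and later, and it fixes the output column names and order in one step. The variable names by_dict and by_name are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
a = pd.DataFrame({
    "user": ["a", "a", "a", "b", "b"],
    "num1": [1, 2, 3, 4, 5],
    "num2": [1, 2, 3, 4, 5],
})

# Dict-based aggregation, as in the answer: one function per column.
by_dict = (
    a.groupby("user")
     .agg({"num1": "min", "num2": "max"})
     .reset_index()
)

# Named aggregation (assumes pandas >= 0.25):
# output_column=(input_column, function).
by_name = (
    a.groupby("user")
     .agg(num1=("num1", "min"), num2=("num2", "max"))
     .reset_index()
)

print(by_dict)
print(by_name)
```

Both calls should give one row per user, with the minimum of num1 and the maximum of num2. Named aggregation can be handy when the output columns need different names from the inputs, for example num1_min=("num1", "min").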