Original question: http://stackoverflow.com/questions/36676800/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Repeating values in a "group by" pandas dataframe
Asked by stackit
I have the following pandas DataFrame:
    email              cat   class_price
0   [email protected]  cat1  1
1   [email protected]  cat2  2
2   [email protected]  cat2  4
3   [email protected]  cat2  4
4   [email protected]  cat2  1
5   [email protected]  cat1  3
6   [email protected]  cat1  2
7   [email protected]  cat2  1
8   [email protected]  cat2  4
9   [email protected]  cat2  2
10  [email protected]  cat3  1
11  [email protected]  cat1  1
I want to group by email and cat, and for each group take the max of class_price.
I'm using:
test_df2 = test_df.groupby(['email','cat'])['class_price'].max()
The output is:
email              cat
[email protected]  cat1    2
                   cat2    4
[email protected]  cat2    2
                   cat3    1
[email protected]  cat1    3
                   cat2    4
But how can I get a result where the grouped columns also retain their repeated values, so that it can be written out as a proper table with all the values filled in:
email              cat   maxvalue
[email protected]  cat2  2
[email protected]  cat1  2
[email protected]  cat3  3
Note: the example output doesn't match the example input; it was only written to illustrate the idea.
Answered by B. M.
You can just reset the index, which moves the grouping keys (email and cat) out of the index and back into ordinary columns.
In [1]: print (test_df2.reset_index(name='maxvalue').to_string(index=False))
email              cat   maxvalue
[email protected]  cat1  2
[email protected]  cat2  4
[email protected]  cat2  2
[email protected]  cat3  1
[email protected]  cat1  3
[email protected]  cat2  4
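For anyone who wants to try this end to end, here is a minimal self-contained sketch. The email addresses are obfuscated in the post, so `a@example.com`, `b@example.com` and `c@example.com` are hypothetical stand-ins, and the frame is a shortened version of the one in the question. It also shows `as_index=False`, which is not part of the answer above but should give the same flat table without a separate `reset_index` step.

```python
import pandas as pd

# Hypothetical placeholder addresses; the real ones are obfuscated in the post.
test_df = pd.DataFrame({
    "email": ["a@example.com", "a@example.com", "a@example.com",
              "b@example.com", "b@example.com", "c@example.com"],
    "cat": ["cat1", "cat1", "cat2", "cat2", "cat3", "cat1"],
    "class_price": [1, 2, 4, 2, 1, 3],
})

# groupby(...).max() returns a Series indexed by an (email, cat) MultiIndex.
# When printed, repeated outer labels are left blank, which is why the
# email column looks "sparse" in the question's output.
test_df2 = test_df.groupby(["email", "cat"])["class_price"].max()

# reset_index turns both index levels back into ordinary columns, so every
# row carries its own email and cat value; name= labels the value column.
flat = test_df2.reset_index(name="maxvalue")

# Alternative sketch: keep the keys as columns from the start.
flat_alt = (
    test_df.groupby(["email", "cat"], as_index=False)["class_price"]
    .max()
    .rename(columns={"class_price": "maxvalue"})
)

print(flat.to_string(index=False))
```

Either `flat` or `flat_alt` can then be written out with `to_csv(index=False)` if the goal is a file with every cell filled in.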

