Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/44742945/

Time: 2020-09-14 03:52:31  Source: igfitidea

I applied sum() on a groupby and I want to sort the values of the last column

python sorting pandas sum pandas-groupby

Asked by KawtarZZ

Given the following DataFrame

user_ID  product_id  amount
   1       456          1
   1        87          1
   1       788          3
   1       456          5
   1        87          2
  ...      ...         ...

The first column is the ID of the customer, the second is the ID of the product they bought, and 'amount' is the quantity of that product purchased on a given day (the date is also taken into consideration). A customer can buy as many products each day as they want. I want to calculate the total number of times each product is bought by each customer, so I applied a groupby:

df.groupby(['user_ID','product_id'], sort=True).sum()

Now I want to sort the summed amount within each group. Any help?

Answered by student

Suppose df is:

     user_ID  product_id  amount
0        1         456       1
1        1          87       1
2        1         788       3
3        1         456       5
4        1          87       2
5        2         456       1
6        2         788       3
7        2         456       5

Then you can use groupby and sum as before. In addition, you can sort the values by the two columns [user_ID, amount], where ascending=[True,False] means ascending order of user and, for each user, descending order of amount:

new_df = df.groupby(['user_ID','product_id'], sort=True).sum().reset_index()
new_df = new_df.sort_values(by = ['user_ID', 'amount'], ascending=[True,False])
print(new_df)

Output:

     user_ID   product_id  amount
1        1         456       6
0        1          87       3
2        1         788       3
3        2         456       6
4        2         788       3
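
For anyone who wants to run the above end to end, here is a minimal self-contained sketch. It rebuilds the sample data from this answer; the variable name totals and the use of as_index=False (instead of reset_index()) are my own choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the answer above
df = pd.DataFrame({
    "user_ID":    [1, 1, 1, 1, 1, 2, 2, 2],
    "product_id": [456, 87, 788, 456, 87, 456, 788, 456],
    "amount":     [1, 1, 3, 5, 2, 1, 3, 5],
})

# as_index=False keeps the group keys as ordinary columns,
# so no reset_index() call is needed afterwards
totals = df.groupby(["user_ID", "product_id"], as_index=False)["amount"].sum()

# Users in ascending order; within each user, largest total first
totals = totals.sort_values(["user_ID", "amount"], ascending=[True, False])
print(totals)
```

Selecting ["amount"] before sum() limits the aggregation to that one column, which matters if the real DataFrame also carries a date column, as the question suggests.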

Answered by FAMG

You could also use aggregate() (note that this is R, not pandas):

# Make up some example data
df = data.frame(user_ID = as.factor(rep(1:5, each = 5)),
                product_id = as.factor(sample(1:10, size = 25, replace = TRUE)),
                amount = sample(1:5, size = 25, replace = TRUE))

# Use aggregate with function sum to calculate the amount of products bought by product and customer
aggregate(amount ~  product_id * user_ID , data = df, FUN = sum)

Output:

   product_id user_ID amount
1           2       1      3
2           4       1      2
3           6       1      1
4           9       1      5
5           1       2      5
6           3       2      9
7           8       2      1
8          10       2      5
9           2       3      5
10          3       3      5
11          4       3      5
12          5       3      3
13          8       3      5
14          3       4      3
15          4       4      9
16          5       4      2
17         10       4      1
18          2       5      1
19          4       5      4
20          5       5      2
21         10       5      2
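
Since the question is about pandas, here is a rough pandas counterpart of the aggregate() call above, as a sketch only. The random data merely imitates the shape of the R example (the seed 0 and the name totals are arbitrary choices of mine), and the final sort_values step is added because the aggregation alone does not order the sums.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Random example data shaped like the R example: 5 users, 5 purchases each
df = pd.DataFrame({
    "user_ID": np.repeat(np.arange(1, 6), 5),
    "product_id": rng.integers(1, 11, size=25),  # product ids 1..10
    "amount": rng.integers(1, 6, size=25),       # amounts 1..5
})

# Sum the amount per (user, product) pair, similar to FUN = sum in aggregate()
totals = df.groupby(["user_ID", "product_id"], as_index=False).agg({"amount": "sum"})

# The step the question asks for: within each user, largest total first
totals = totals.sort_values(["user_ID", "amount"], ascending=[True, False])
print(totals)
```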