Python pandas: get the average of a groupby

Question

Asked by jxn

I am trying to find the average monthly cost per user_id, but I am only able to get the average cost per user or the monthly cost per user.

Because I group by user and month, there is no way to get the average over the second groupby key (month) unless I transform the groupby output into something else.

This is my df:

    import pandas as pd

    df = pd.DataFrame({'cost': [10, 20, 30, 40, 50, 60, 70, 80],
                       'id': [1, 1, 1, 1, 2, 2, 2, 2],
                       'mth': [3, 3, 4, 5, 3, 4, 4, 5]})

   cost  id  mth
0    10   1    3
1    20   1    3
2    30   1    4
3    40   1    5
4    50   2    3
5    60   2    4
6    70   2    4
7    80   2    5

I can get the monthly sum, but I want the average of those monthly sums for each user_id.

df.groupby(['id','mth'])['cost'].sum()

id  mth
1   3       30
    4       30
    5       40
2   3       50
    4      130
    5       80

I want something like this:

id average_monthly
1 (30+30+40)/3
2 (50+130+80)/3
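
To make the gap concrete, here is a small sketch (assuming the data above is loaded as a DataFrame) that contrasts the plain per-user mean, which averages over rows, with the target values written out by hand from the monthly sums. The variable names are only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'cost': [10, 20, 30, 40, 50, 60, 70, 80],
                   'id': [1, 1, 1, 1, 2, 2, 2, 2],
                   'mth': [3, 3, 4, 5, 3, 4, 4, 5]})

# Plain per-user mean: averages over the 4 rows of each user, not over months
per_row_mean = df.groupby('id')['cost'].mean()   # id 1 -> 25.0, id 2 -> 65.0

# Target values, computed by hand from the monthly sums shown above
expected = {1: (30 + 30 + 40) / 3, 2: (50 + 130 + 80) / 3}

print(per_row_mean)
print(expected)
```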

Answer 1

Answered by Jerome Montino

Resetting the index should work. Try this:

In [19]: df.groupby(['id', 'mth']).sum().reset_index().groupby('id').mean()  
Out[19]: 
    mth       cost
id                
1   4.0  33.333333
2   4.0  86.666667

You can drop mth if you want. The logic is that after the sum step, you have this:

In [20]: df.groupby(['id', 'mth']).sum()
Out[20]: 
        cost
id mth      
1  3      30
   4      30
   5      40
2  3      50
   4     130
   5      80

Resetting the index at this point turns id and mth back into ordinary columns, giving one row per unique (id, mth) pair.

In [21]: df.groupby(['id', 'mth']).sum().reset_index()
Out[21]: 
   id  mth  cost
0   1    3    30
1   1    4    30
2   1    5    40
3   2    3    50
4   2    4   130
5   2    5    80

It's just a matter of grouping again, this time by id alone and using mean instead of sum. This should give you the averages.
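
For reference, here is a minimal end-to-end sketch of the steps above, assuming the question's data is built as a DataFrame. Selecting cost before aggregating keeps mth out of the result, and the name average_monthly is borrowed from the question's desired output. The second variant, which groups on the id index level instead of resetting the index, is an alternative not used in the answer above; both should give the same numbers.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'cost': [10, 20, 30, 40, 50, 60, 70, 80],
                   'id': [1, 1, 1, 1, 2, 2, 2, 2],
                   'mth': [3, 3, 4, 5, 3, 4, 4, 5]})

# Step 1: total cost per (id, mth); the result is a Series with a two-level index
monthly = df.groupby(['id', 'mth'])['cost'].sum()

# Step 2a (the approach above): reset the index, then average the sums per id
via_reset = (monthly.reset_index()
                    .groupby('id')['cost'].mean()
                    .rename('average_monthly'))

# Step 2b (alternative, not from the answer): group on the 'id' index level directly
via_level = monthly.groupby(level='id').mean().rename('average_monthly')

print(via_reset)
print(via_level)
```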

Let us know if this helps.
