Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original StackOverflow post: http://stackoverflow.com/questions/53339021/

Date: 2020-09-14 06:09:43  Source: igfitidea

Python Pandas: Calculate moving average within group

Tags: python, pandas, pandas-groupby, moving-average

Asked by Alexandr Kapshuk

I have a dataframe containing time series for 100 objects:

object  period  value 
1       1       24
1       2       67
...
1       1000    56
2       1       59
2       2       46
...
2       1000    64
3       1       54
...
100     1       451
100     2       153
...
100     1000    21

I want to calculate a moving average with a window of 10 for the value column. I guess I have to do something like

df.groupby('object').apply(lambda ~calculate MA~) 

and then merge this Series into the original dataframe by object? I can't figure out the exact commands.

Answered by zipa

You can use rolling with transform:

df['moving'] = df.groupby('object')['value'].transform(lambda x: x.rolling(10, 1).mean())

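The 1 in rolling is the minimum number of periods (min_periods): the first rows of each group are averaged over however many values are available rather than returned as NaN.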
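
As a minimal sketch of how this behaves, assuming a made-up toy frame with two objects and three periods each, and a window of 2 instead of 10 to keep the output short:

```python
import pandas as pd

# Hypothetical toy data for illustration (not the asker's real data).
df = pd.DataFrame({
    "object": [1, 1, 1, 2, 2, 2],
    "period": [1, 2, 3, 1, 2, 3],
    "value":  [10, 20, 30, 100, 200, 300],
})

# Window of 2 for brevity; min_periods=1 lets the first row of each group
# average over the single value available instead of producing NaN.
df["moving"] = (
    df.groupby("object")["value"]
      .transform(lambda x: x.rolling(2, min_periods=1).mean())
)

print(df)
```

Because transform returns a result aligned with the original index, the window restarts at each object and no separate merge step is needed.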

Answered by Sandeep Kadapa

You can use rolling on the groupby object directly:

df['moving'] = df.groupby('object').rolling(10)['value'].mean()

Answered by dajcs

Extending the answer from @Sandeep Kadapa:

df['moving'] = df.groupby('object').rolling(10)['value'].mean().reset_index(drop=True)

reset_index is needed because groupby followed by rolling returns a result with a MultiIndex (the group key plus the original row index), and assigning that result to the frame directly raises TypeError: incompatible index of inserted column with frame index
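
One caveat worth illustrating: reset_index(drop=True) replaces the MultiIndex with a fresh 0..n-1 index in group order, so the assignment only lines up when the frame is already sorted by object and has the default index. A possible variant, sketched here on made-up data that is deliberately not sorted, drops only the group level so the original row labels survive:

```python
import pandas as pd

# Hypothetical toy data, deliberately NOT sorted by object.
df = pd.DataFrame({
    "object": [2, 1, 2, 1],
    "value":  [100, 10, 200, 20],
})

# Window of 2 and min_periods=1 are chosen only to keep the example small.
rolled = df.groupby("object")["value"].rolling(2, min_periods=1).mean()

# The result carries two index levels: the group key and the original row label.
print(rolled.index.nlevels)

# Dropping only level 0 (the group key) keeps the original row labels,
# so the assignment aligns row by row regardless of how df is ordered.
df["moving"] = rolled.reset_index(level=0, drop=True)

print(df)
```

This is a sketch under the stated assumptions rather than a correction of the answer above; for a frame sorted by object with a default index, both forms give the same column.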

Answered by Ramin Melikov

Create the column as part of a method chain:

(
    df
        .assign(
            column_name = lambda x:
                x
                    .groupby(['object'])['value']
                    .transform(lambda s: s.rolling(10).mean())
        )
)
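
Note that assign returns a new DataFrame and leaves df untouched, and that rolling(10) without min_periods leaves the first nine rows of each group as NaN. A small sketch of the same chain on made-up data, with a window of 2 and min_periods=1 chosen only for brevity (the names result and moving are illustrative):

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({
    "object": [1, 1, 2, 2],
    "value":  [10, 20, 100, 200],
})

# assign builds a new frame; the original df is not modified.
result = df.assign(
    moving=lambda d: d.groupby("object")["value"].transform(
        lambda s: s.rolling(2, min_periods=1).mean()
    )
)

print(result)
```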