Calculate an incremental mean using Python pandas
Source: http://stackoverflow.com/questions/21142149/
This content is from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Jmc
I'd like to generate a series that's the incremental mean of a time series. That is, starting from the first date (index 0), the mean stored in row x is the average of the values in rows 0 through x, inclusive.
data
index  value  mean         formula
0      4
1      5
2      6
3      7      5.5          average(0-3)
4      4      5.2          average(0-4)
5      5      5.166666667  average(0-5)
6      6      5.285714286  average(0-6)
7      7      5.5          average(0-7)
I'm hoping there's a way to do this without looping, so as to take advantage of pandas.
Answered by jpobst
Here's an update for newer versions of pandas (starting with 0.18.0):
df['value'].expanding().mean()
or, for a Series s:
s.expanding().mean()
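As a minimal sketch, here is the call above applied to the question's sample data. The DataFrame name df and the column name incremental_mean are chosen here purely for illustration.

```python
import pandas as pd

# The sample values from the question, in a single 'value' column.
df = pd.DataFrame({"value": [4, 5, 6, 7, 4, 5, 6, 7]})

# expanding() builds a window that grows from the first row to the current
# row; mean() averages everything inside that window.
df["incremental_mean"] = df["value"].expanding().mean()

print(df)
```

By default expanding() uses min_periods=1, so the first three rows are filled in as well (4.0, 4.5, 5.0) rather than left empty as in the question's table.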
Answered by Andy Hayden
As @TomAugspurger points out, you can use expanding_mean. The second argument (4) is min_periods, so the first three rows are NaN:
In [11]: s = pd.Series([4, 5, 6, 7, 4, 5, 6, 7])

In [12]: pd.expanding_mean(s, 4)
Out[12]:
0         NaN
1         NaN
2         NaN
3    5.500000
4    5.200000
5    5.166667
6    5.285714
7    5.500000
dtype: float64
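pd.expanding_mean was deprecated in pandas 0.18.0 and removed in later releases. The following is a sketch of what should be the equivalent call in the method-based API, passing min_periods=4 so that the first three rows stay NaN as in the output above. The variable name result is illustrative.

```python
import pandas as pd

s = pd.Series([4, 5, 6, 7, 4, 5, 6, 7])

# min_periods=4 means a value is only produced once the window holds
# at least four observations; earlier rows are NaN.
result = s.expanding(min_periods=4).mean()

print(result)
```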
Answered by patricksurry
Another approach is to use cumsum() and divide by the cumulative number of items, for example:
In [1]:
import numpy as np
import pandas as pd

s = pd.Series([4, 5, 6, 7, 4, 5, 6, 7])
# Running sum divided by the running count of items (1, 2, ..., n)
s.cumsum() / pd.Series(np.arange(1, len(s)+1), s.index)
Out[1]:
0    4.000000
1    4.500000
2    5.000000
3    5.500000
4    5.200000
5    5.166667
6    5.285714
7    5.500000
dtype: float64
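To check that the cumulative-sum approach agrees with the expanding window, the two results can be compared directly. This is only a sanity-check sketch with illustrative variable names. It assumes the series has no missing values: cumsum() skips NaN, but the arange divisor would still count those rows, so the two methods could then disagree.

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 5, 6, 7, 4, 5, 6, 7])

# Running total divided by the running count (1, 2, ..., n).
counts = pd.Series(np.arange(1, len(s) + 1), index=s.index)
via_cumsum = s.cumsum() / counts

# The built-in expanding window, for comparison.
via_expanding = s.expanding().mean()

# Compare with a floating-point tolerance rather than exact equality.
print(np.allclose(via_cumsum, via_expanding))
```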

