Pandas and Rolling_Mean with Offset (Average Daily Volume Calculation)
Notice: this page is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/35272145/
Asked by Stumbling Through Data Science
When I pull stock data into a dataframe from Yahoo, I want to be able to calculate the 5 day average of volume, excluding the current date.
Is there a way to use rolling mean with an offset? For example, a 5 day mean that excludes current day and is based on the prior 5 days.
When I run the following code:
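import pandas as pd
from pandas_datareader.data import DataReader  # assumed source of DataReader; the question does not show its imports
r = DataReader("BBRY", "yahoo", '2015-01-01', '2015-01-31')
r['ADV'] = pd.rolling_mean(r['Volume'], window=5)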
It returns the 5-day average volume inclusive of the current date. In the output below, 1/8 shows the average volume of 1/2, 1/5, 1/6, 1/7, and 1/8. I want 1/9 to be the first date that returns an average volume, and I want that average to be computed from 1/2, 1/5, 1/6, 1/7, and 1/8.
Date Open High Low Close Volume Adj Close Symbol ADV
1/2/2015 11.01 11.11 10.79 10.82 9733200 10.82 BBRY NaN
1/5/2015 10.60 10.77 10.37 10.76 12318100 10.76 BBRY NaN
1/6/2015 10.80 10.85 10.44 10.62 10176400 10.62 BBRY NaN
1/7/2015 10.65 10.80 10.48 10.67 10277400 10.67 BBRY NaN
1/8/2015 10.75 10.78 10.57 10.63 6868300 10.63 BBRY 9,874,680.00
1/9/2015 10.59 10.65 10.28 10.38 7745600 10.38 BBRY 9,477,160.00
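The behaviour described in the question can be reproduced without a live Yahoo download. The following is a minimal sketch, assuming a current pandas release (where the `.rolling()` method is available) and using only the six Volume figures from the table above; the variable names `volume` and `adv_inclusive` are illustrative and not from the original post.

```python
import pandas as pd

# Volume figures copied from the table above (BBRY, January 2015).
volume = pd.Series(
    [9733200, 12318100, 10176400, 10277400, 6868300, 7745600],
    index=pd.to_datetime(
        ["2015-01-02", "2015-01-05", "2015-01-06",
         "2015-01-07", "2015-01-08", "2015-01-09"]
    ),
    name="Volume",
)

# A plain 5-row rolling mean: every window ends on, and includes, the current row.
adv_inclusive = volume.rolling(window=5).mean()
print(adv_inclusive)
```

The first non-NaN value lands on 2015-01-08 and already includes that day's volume, which is the behaviour the question wants to avoid.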
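Answered by EdChum
You can shift the rows to achieve what you want. shift() moves each Volume value down one row, so the window that ends on a given date holds only the five preceding trading days: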
In [44]:
r['ADV'] = pd.rolling_mean(r['Volume'].shift(), window=5)
r
Out[44]:
Open High Low Close Volume Adj Close ADV
Date
2015-01-02 11.01 11.11 10.79 10.82 9733200 10.82 NaN
2015-01-05 10.60 10.77 10.37 10.76 12318100 10.76 NaN
2015-01-06 10.80 10.85 10.44 10.62 10176400 10.62 NaN
2015-01-07 10.65 10.80 10.48 10.67 10277400 10.67 NaN
2015-01-08 10.75 10.78 10.57 10.63 6868300 10.63 NaN
2015-01-09 10.59 10.65 10.28 10.38 7745600 10.38 9874680
2015-01-12 10.36 10.37 10.02 10.12 7739600 10.12 9477160
2015-01-13 10.05 10.23 9.68 9.71 15292900 9.71 8561460
2015-01-14 9.61 12.63 9.32 12.60 83543900 12.60 9584760
2015-01-15 10.36 10.71 10.01 10.11 52574600 10.11 24238060
2015-01-16 10.12 10.39 10.11 10.24 16068900 10.24 33379320
2015-01-20 10.28 10.37 9.82 10.03 15185900 10.03 35043980
2015-01-21 10.03 10.38 9.81 9.93 19614500 9.93 36533240
2015-01-22 10.44 11.11 10.24 10.51 44594300 10.51 37397560
2015-01-23 10.78 11.03 10.61 10.71 21079800 10.71 29607640
2015-01-26 10.67 10.71 10.40 10.52 6982000 10.52 23308680
2015-01-27 10.38 10.63 10.32 10.56 7057200 10.56 21491300
2015-01-28 10.65 10.67 10.10 10.12 9705000 10.12 19865560
2015-01-29 10.05 10.27 9.85 10.25 12304700 10.25 17883660
2015-01-30 10.15 10.26 10.00 10.15 9203400 10.15 11425740
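To make the effect of the shift visible, here is a small sketch that keeps the shifted values in their own column before averaging. It assumes a current pandas release, so it uses `.rolling()` in place of `pd.rolling_mean`, and it uses only the first eight Volume figures from the output above. The `PrevVolume` column is a hypothetical helper added for illustration.

```python
import pandas as pd

dates = pd.to_datetime(
    ["2015-01-02", "2015-01-05", "2015-01-06", "2015-01-07",
     "2015-01-08", "2015-01-09", "2015-01-12", "2015-01-13"]
)
r = pd.DataFrame(
    {"Volume": [9733200, 12318100, 10176400, 10277400,
                6868300, 7745600, 7739600, 15292900]},
    index=dates,
)

# Hypothetical helper column: each row now holds the previous row's volume.
r["PrevVolume"] = r["Volume"].shift()

# Averaging the shifted column means the current day's volume is never in its own window.
r["ADV"] = r["PrevVolume"].rolling(window=5).mean()
print(r)
```

The first five ADV values are NaN, and 2015-01-09 is the first date with an average, built from 1/2 through 1/8 as requested.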
Answered by tsando
In newer versions of pandas (0.18.0 and later), pd.rolling_mean is replaced by the .rolling() method, so the syntax changes to:
df['Volume'].rolling(window=5).mean().shift(1)
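Note that this answer shifts after averaging, while the earlier answer shifts before averaging. For a fixed-size window over data without gaps, the two orders should give the same result. The sketch below checks that on a small frame built from the Volume figures shown earlier. The `closed="left"` variant is an additional option that is not part of the original answers, and it assumes pandas 1.2 or later, where `closed` is accepted for fixed-size windows.

```python
import pandas as pd

df = pd.DataFrame(
    {"Volume": [9733200, 12318100, 10176400, 10277400,
                6868300, 7745600, 7739600, 15292900]}
)

# Order used in this answer: average first, then move the result down one row.
roll_then_shift = df["Volume"].rolling(window=5).mean().shift(1)

# Order used in the earlier answer: move the data down one row, then average.
shift_then_roll = df["Volume"].shift(1).rolling(window=5).mean()

# Assumption: pandas >= 1.2. closed="left" drops the current row from each window.
closed_left = df["Volume"].rolling(window=5, closed="left").mean()

# Both comparisons raise an AssertionError if the series differ.
pd.testing.assert_series_equal(roll_then_shift, shift_then_roll)
pd.testing.assert_series_equal(roll_then_shift, closed_left)
print(roll_then_shift)
```

Which form to use is mostly a matter of taste. Shifting keeps the intent explicit, while `closed="left"` avoids the extra step.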