Original question: http://stackoverflow.com/questions/14036397/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
How to get DataFrame.pct_change to calculate monthly change on daily price data?
Asked by Dallas
I know that it is possible to offset with the periods argument, but how would one go about computing returns from daily price data that is spread unevenly throughout a month (trading days, for example)?
Example data is:
In [1]: df.AAPL
2009-01-02 16:00:00 90.36
2009-01-05 16:00:00 94.18
2009-01-06 16:00:00 92.62
2009-01-07 16:00:00 90.62
2009-01-08 16:00:00 92.30
2009-01-09 16:00:00 90.19
2009-01-12 16:00:00 88.28
2009-01-13 16:00:00 87.34
2009-01-14 16:00:00 84.97
2009-01-15 16:00:00 83.02
2009-01-16 16:00:00 81.98
2009-01-20 16:00:00 77.87
2009-01-21 16:00:00 82.48
2009-01-22 16:00:00 87.98
2009-01-23 16:00:00 87.98
...
2009-12-10 16:00:00 195.59
2009-12-11 16:00:00 193.84
2009-12-14 16:00:00 196.14
2009-12-15 16:00:00 193.34
2009-12-16 16:00:00 194.20
2009-12-17 16:00:00 191.04
2009-12-18 16:00:00 194.59
2009-12-21 16:00:00 197.38
2009-12-22 16:00:00 199.50
2009-12-23 16:00:00 201.24
2009-12-24 16:00:00 208.15
2009-12-28 16:00:00 210.71
2009-12-29 16:00:00 208.21
2009-12-30 16:00:00 210.74
2009-12-31 16:00:00 209.83
Name: AAPL, Length: 252
As you can see, simply offsetting by 30 would not produce correct results, as there are gaps in the timestamp data, not every month is 30 days, etc. I know there must be an easy way to do this using pandas.
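To make the point about uneven months concrete, here is a small sketch (not part of the original post) that counts weekdays per calendar month in 2009. It assumes a plain Monday-to-Friday calendar via pd.bdate_range; a real exchange calendar also drops holidays, so actual trading-day counts would be lower.

```python
import pandas as pd

# Weekdays (Mon-Fri) in 2009. Assumption: no exchange holidays are removed.
days = pd.bdate_range("2009-01-01", "2009-12-31")

# Count how many weekdays fall into each calendar month.
per_month = pd.Series(1, index=days).groupby(days.to_period("M")).size()

print(per_month)
print("fewest:", per_month.min(), "most:", per_month.max())
```

Because the count differs from month to month, no single value of periods lines up with month boundaries.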
Answered by bmu
You can resample the data to business month. If you don't want the mean price (which is the default in resample) you can use a custom resample method using the keyword argument how:
In [31]: from pandas.io import data as web
# read some example data, note that this is not exactly your data!
In [32]: s = web.get_data_yahoo('AAPL', start='2009-01-02',
... end='2009-12-31')['Adj Close']
# resample to business month and return the last value in the period
In [34]: monthly = s.resample('BM', how=lambda x: x[-1])
In [35]: monthly
Out[35]:
Date
2009-01-30 89.34
2009-02-27 88.52
2009-03-31 104.19
...
2009-10-30 186.84
2009-11-30 198.15
2009-12-31 208.88
Freq: BM
In [36]: monthly.pct_change()
Out[36]:
Date
2009-01-30 NaN
2009-02-27 -0.009178
2009-03-31 0.177022
...
2009-10-30 0.016982
2009-11-30 0.060533
2009-12-31 0.054151
Freq: BM
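The session above reflects the pandas API at the time of the answer. In later pandas releases, pandas.io.data was moved out of pandas and the how keyword of resample was removed in favour of calling an aggregation method on the resampled object (for example .last()). The following is a minimal sketch of the same idea that should run on current versions. It is not the original author's code: the price series is synthetic (made-up numbers, not real AAPL data), and it groups by calendar month with to_period("M") instead of resampling to a business-month alias.

```python
import numpy as np
import pandas as pd

# Synthetic daily "closing prices" on weekdays in 2009.
# Assumption: illustrative random-walk numbers, not real market data.
idx = pd.bdate_range("2009-01-02", "2009-12-31")
rng = np.random.default_rng(0)
prices = pd.Series(
    90 * np.exp(np.cumsum(rng.normal(0, 0.02, len(idx)))),
    index=idx,
    name="price",
)

# Keep the last observed price in each calendar month.
month_end = prices.groupby(prices.index.to_period("M")).last()

# Month-over-month percentage change; the first month has no
# predecessor, so its value is NaN.
monthly_returns = month_end.pct_change()

print(monthly_returns)
```

Taking the last observation per month means the gaps between trading days and the varying month lengths no longer matter: each return compares one month's final price with the previous month's final price.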

