pandas: calculating returns from a DataFrame with financial data
Original URL: http://stackoverflow.com/questions/13385663/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Calculating returns from a dataframe with financial data
Asked by Daniel Velkov
I have a dataframe with monthly financial data:
In [89]: vfiax_monthly.head()
Out[89]:
            year  month  day       d   open  close   high    low  volume  aclose
2003-01-31  2003      1   31  731246  64.95  64.95  64.95  64.95       0   64.95
2003-02-28  2003      2   28  731274  63.98  63.98  63.98  63.98       0   63.98
2003-03-31  2003      3   31  731305  64.59  64.59  64.59  64.59       0   64.59
2003-04-30  2003      4   30  731335  69.93  69.93  69.93  69.93       0   69.93
2003-05-30  2003      5   30  731365  73.61  73.61  73.61  73.61       0   73.61
I'm trying to calculate the returns like this:
In [90]: returns = (vfiax_monthly.open[1:] - vfiax_monthly.open[:-1])/vfiax_monthly.open[1:]
But I'm getting only zeroes:
In [91]: returns.head()
Out[91]:
2003-01-31 NaN
2003-02-28 0
2003-03-31 0
2003-04-30 0
2003-05-30 0
Freq: BM, Name: open
I think that's because the arithmetic operations get aligned on the index, which makes the [1:] and [:-1] slices useless.
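The alignment behaviour can be reproduced on a small series. This is a minimal sketch: the values are the first five opens from the table above, and the variable names (opens, aligned_diff, positional_diff) are only illustrative.

```python
import pandas as pd

# First five monthly opens from the question, indexed by date.
idx = pd.to_datetime(["2003-01-31", "2003-02-28", "2003-03-31",
                      "2003-04-30", "2003-05-30"])
opens = pd.Series([64.95, 63.98, 64.59, 69.93, 73.61], index=idx)

# Both slices keep their original date labels, so the subtraction pairs
# each date with itself instead of with its neighbour.
aligned_diff = opens.iloc[1:] - opens.iloc[:-1]

# Dropping the labels with .values makes the subtraction positional.
positional_diff = opens.iloc[1:].values - opens.iloc[:-1].values

print(aligned_diff)
print(positional_diff)
```

Dates present in both slices come out as 0, and dates present in only one slice come out as NaN, which matches the output shown in the question.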
My workaround is:
In [103]: returns = (vfiax_monthly.open[1:].values - vfiax_monthly.open[:-1].values)/vfiax_monthly.open[1:].values
In [104]: returns = pd.Series(returns, index=vfiax_monthly.index[1:])
In [105]: returns.head()
Out[105]:
2003-02-28 -0.015161
2003-03-31 0.009444
2003-04-30 0.076362
2003-05-30 0.049993
2003-06-30 0.012477
Freq: BM
Is there a better way to calculate the returns? I don't like the conversion to array and then back to Series.
Answered by Matti John
Instead of slicing, use .shift to move the index position of values in a DataFrame/Series. For example:
returns = (vfiax_monthly.open - vfiax_monthly.open.shift(1))/vfiax_monthly.open.shift(1)
This is what pct_change does under the bonnet. You can also use shift for other calculations, e.g.:
(3*vfiax_monthly.open + 2*vfiax_monthly.open.shift(1))/5
You might also want to look into the rolling and window functions for other types of analysis of financial data.
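As a rough check, the shift-based formula can be compared with pct_change on a few values. This is a sketch using the first five opens from the question; the 3-period window at the end is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

opens = pd.Series([64.95, 63.98, 64.59, 69.93, 73.61])

# Simple return: change relative to the previous period's value.
shift_returns = (opens - opens.shift(1)) / opens.shift(1)

# Built-in equivalent.
pct_returns = opens.pct_change()

# Both have NaN in the first position, so compare with equal_nan=True.
print(np.allclose(shift_returns, pct_returns, equal_nan=True))

# A 3-period rolling mean of the returns, as one example of a window function.
rolling_mean = pct_returns.rolling(window=3).mean()
print(rolling_mean)
```

Note that this divides by the previous value, whereas the workaround in the question divides by the current one, so the numbers differ slightly from the question's output.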
Answered by spencerlyon2
The easiest way to do this is to use the DataFrame.pct_change() method.
Here is a quick example:
In[1]: aapl = get_data_yahoo('aapl', start='11/1/2012', end='11/13/2012')
In[2]: aapl
Out[2]:
              Open    High     Low   Close    Volume  Adj Close
Date
2012-11-01  598.22  603.00  594.17  596.54  12903500     593.83
2012-11-02  595.89  596.95  574.75  576.80  21406200     574.18
2012-11-05  583.52  587.77  577.60  584.62  18897700     581.96
2012-11-06  590.23  590.74  580.09  582.85  13389900     580.20
2012-11-07  573.84  574.54  555.75  558.00  28344600     558.00
2012-11-08  560.63  562.23  535.29  537.75  37719500     537.75
2012-11-09  540.42  554.88  533.72  547.06  33211200     547.06
2012-11-12  554.15  554.50  538.65  542.83  18421500     542.83
2012-11-13  538.91  550.48  536.36  542.90  19033900     542.90
In[3]: aapl.pct_change()
Out[3]:
                Open      High       Low     Close    Volume  Adj Close
Date
2012-11-01       NaN       NaN       NaN       NaN       NaN        NaN
2012-11-02 -0.003895 -0.010033 -0.032684 -0.033091  0.658945  -0.033090
2012-11-05 -0.020759 -0.015378  0.004959  0.013558 -0.117186   0.013550
2012-11-06  0.011499  0.005053  0.004311 -0.003028 -0.291453  -0.003024
2012-11-07 -0.027769 -0.027423 -0.041959 -0.042635  1.116864  -0.038263
2012-11-08 -0.023020 -0.021426 -0.036815 -0.036290  0.330747  -0.036290
2012-11-09 -0.036049 -0.013073 -0.002933  0.017313 -0.119522   0.017313
2012-11-12  0.025406 -0.000685  0.009237 -0.007732 -0.445323  -0.007732
2012-11-13 -0.027502 -0.007250 -0.004251  0.000129  0.033244   0.000129
Answered by rgalbo
The best way to calculate forward-looking returns without any chance of bias is to use the built-in method pd.DataFrame.pct_change(). In your case this method is all you need, since you have monthly data and are looking for the monthly return.
If, for example, you wanted to look at the 6-month return, you would just set the periods parameter, df.pct_change(periods=6), and that will give you the 6-month percent return.
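A small sketch of the periods parameter on monthly data follows. The twelve prices are made-up values for illustration, not real fund data, and the name six_month is hypothetical.

```python
import pandas as pd

# Twelve hypothetical monthly prices (made-up values).
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 107.0, 110.0,
     108.0, 111.0, 115.0, 114.0, 118.0, 121.0],
    index=pd.date_range("2003-01-01", periods=12, freq="MS"),
)

# Each value is compared with the value six rows (months) earlier,
# so the first six entries have nothing to compare against and are NaN.
six_month = prices.pct_change(periods=6)
print(six_month)
```

The seventh entry here is 108.0 / 100.0 - 1, the return from the first month to the seventh.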
Because you have a relatively small data set, the easiest way is to resample to the frequency you need and then use pct_change() again.
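One way this could look, as a sketch only: resample made-up monthly prices to quarters and then compute the returns. The "QS" (quarter-start) frequency and the choice of last() as the aggregation are assumptions made for illustration; the answer does not specify either.

```python
import pandas as pd

# Twelve hypothetical monthly prices (made-up values).
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 107.0, 110.0,
     108.0, 111.0, 115.0, 114.0, 118.0, 121.0],
    index=pd.date_range("2003-01-01", periods=12, freq="MS"),
)

# Keep the last observation in each quarter, then compute
# quarter-over-quarter returns on the resampled series.
quarterly = prices.resample("QS").last()
quarterly_returns = quarterly.pct_change()

print(quarterly)
print(quarterly_returns)
```

With this aggregation, each quarterly return compares the final price of one quarter with the final price of the previous quarter.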
However, because of the nice properties of log (log returns are additive over time, for instance), it is common to use the log-return formula for calculating returns (if you plan on computing statistics on the return series):
r_t = log(P_t / P_{t-1}), where P_t is the price in period t.
You would implement it like this:
log_return = np.log(vfiax_monthly.open / vfiax_monthly.open.shift())
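The additivity of log returns can be checked on a few values. This is a minimal sketch using the first five opens from the question; the names total_log_return and simple_from_log are illustrative.

```python
import numpy as np
import pandas as pd

opens = pd.Series([64.95, 63.98, 64.59, 69.93, 73.61])

log_return = np.log(opens / opens.shift())

# Log returns add up across periods: their sum equals the log of the
# total price ratio over the whole span. sum() skips the leading NaN.
total_log_return = log_return.sum()
print(np.isclose(total_log_return, np.log(opens.iloc[-1] / opens.iloc[0])))

# Converting back with exp recovers the simple returns from pct_change().
simple_from_log = np.exp(log_return) - 1
print(np.allclose(simple_from_log, opens.pct_change(), equal_nan=True))
```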
Answered by salhin
You could also combine the diff and shift methods of a pandas Series:
returns = vfiax_monthly.open.diff()/vfiax_monthly.open.shift(1)