Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/13385663/

Time: 2020-09-09 00:07:44  Source: igfitidea

Calculating returns from a dataframe with financial data

pandas finance

Asked by Daniel Velkov

I have a dataframe with monthly financial data:

In [89]: vfiax_monthly.head()
Out[89]: 
            year  month  day       d   open  close   high    low  volume  aclose
2003-01-31  2003      1   31  731246  64.95  64.95  64.95  64.95       0   64.95
2003-02-28  2003      2   28  731274  63.98  63.98  63.98  63.98       0   63.98
2003-03-31  2003      3   31  731305  64.59  64.59  64.59  64.59       0   64.59
2003-04-30  2003      4   30  731335  69.93  69.93  69.93  69.93       0   69.93
2003-05-30  2003      5   30  731365  73.61  73.61  73.61  73.61       0   73.61

I'm trying to calculate the returns like this:

In [90]: returns = (vfiax_monthly.open[1:] - vfiax_monthly.open[:-1])/vfiax_monthly.open[1:]

But I'm getting only zeroes:

In [91]: returns.head()
Out[91]: 
2003-01-31   NaN
2003-02-28     0
2003-03-31     0
2003-04-30     0
2003-05-30     0
Freq: BM, Name: open

I think that's because the arithmetic operations get aligned on the index, and that makes the [1:] and [:-1] slices useless.
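
The alignment effect can be reproduced on a tiny Series. This is a minimal sketch using the first three open values from the table above; the variable names (opens, later, earlier) are made up for illustration, and .iloc is used so the slices are explicitly positional.

```python
import pandas as pd

# First three monthly opens from the question, indexed by month-end date.
opens = pd.Series(
    [64.95, 63.98, 64.59],
    index=pd.to_datetime(["2003-01-31", "2003-02-28", "2003-03-31"]),
)

later = opens.iloc[1:]     # Feb, Mar
earlier = opens.iloc[:-1]  # Jan, Feb

# Subtraction matches rows by index label, not by position. Only the
# shared label (Feb) is paired, with itself, so it yields 0. Labels that
# exist on one side only (Jan, Mar) become NaN.
diff = later - earlier
print(diff)
```

Because each label is only ever paired with itself, slicing cannot offset the two operands against each other.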

My workaround is:

In [103]: returns = (vfiax_monthly.open[1:].values - vfiax_monthly.open[:-1].values)/vfiax_monthly.open[1:].values

In [104]: returns = pd.Series(returns, index=vfiax_monthly.index[1:])

In [105]: returns.head()
Out[105]: 
2003-02-28   -0.015161
2003-03-31    0.009444
2003-04-30    0.076362
2003-05-30    0.049993
2003-06-30    0.012477
Freq: BM

Is there a better way to calculate the returns? I don't like the conversion to array and then back to Series.

Answered by Matti John

Instead of slicing, use .shift to move the index position of values in a DataFrame/Series. For example:

returns = (vfiax_monthly.open - vfiax_monthly.open.shift(1))/vfiax_monthly.open.shift(1)

This is what pct_change is doing under the bonnet. You can also use .shift in other expressions, e.g.:

(3*vfiax_monthly.open + 2*vfiax_monthly.open.shift(1))/5
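
As a quick check of the shift-based formula, here is a small sketch on the five open values shown in the question. The names opens, prev, manual and builtin are illustrative only. Note that this divides by the previous month's value, whereas the question's workaround divides by the later one, so the numbers differ slightly from the question's output.

```python
import pandas as pd

opens = pd.Series(
    [64.95, 63.98, 64.59, 69.93, 73.61],
    index=pd.to_datetime(
        ["2003-01-31", "2003-02-28", "2003-03-31", "2003-04-30", "2003-05-30"]
    ),
)

# shift(1) keeps the index but moves each value one row down, so the
# previous month's value sits under the current month's label.
prev = opens.shift(1)

manual = (opens - prev) / prev   # simple return relative to the previous month
builtin = opens.pct_change()     # same calculation via the built-in method

print(manual)
```

The first row is NaN in both results because there is no earlier value to compare against.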

You might also want to look into the rolling and window functions for other types of analysis of financial data.
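
As one example of a rolling calculation, the sketch below computes a 3-period moving average with Series.rolling, which is how recent pandas versions expose window operations. The window size of 3 and the name ma3 are arbitrary choices for illustration.

```python
import pandas as pd

opens = pd.Series([64.95, 63.98, 64.59, 69.93, 73.61])

# 3-period moving average. The first two entries are NaN because the
# window does not yet contain three observations.
ma3 = opens.rolling(window=3).mean()
print(ma3)
```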

Answered by spencerlyon2

The easiest way to do this is to use the DataFrame.pct_change() method.

Here is a quick example:

In[1]: aapl = get_data_yahoo('aapl', start='11/1/2012', end='11/13/2012')

In[2]: aapl
Out[2]: 
              Open    High     Low   Close    Volume  Adj Close
Date                                                           
2012-11-01  598.22  603.00  594.17  596.54  12903500     593.83
2012-11-02  595.89  596.95  574.75  576.80  21406200     574.18
2012-11-05  583.52  587.77  577.60  584.62  18897700     581.96
2012-11-06  590.23  590.74  580.09  582.85  13389900     580.20
2012-11-07  573.84  574.54  555.75  558.00  28344600     558.00
2012-11-08  560.63  562.23  535.29  537.75  37719500     537.75
2012-11-09  540.42  554.88  533.72  547.06  33211200     547.06
2012-11-12  554.15  554.50  538.65  542.83  18421500     542.83
2012-11-13  538.91  550.48  536.36  542.90  19033900     542.90

In[3]: aapl.pct_change()
Out[3]:
                Open      High       Low     Close    Volume  Adj Close
Date                                                                   
2012-11-01       NaN       NaN       NaN       NaN       NaN        NaN
2012-11-02 -0.003895 -0.010033 -0.032684 -0.033091  0.658945  -0.033090
2012-11-05 -0.020759 -0.015378  0.004959  0.013558 -0.117186   0.013550
2012-11-06  0.011499  0.005053  0.004311 -0.003028 -0.291453  -0.003024
2012-11-07 -0.027769 -0.027423 -0.041959 -0.042635  1.116864  -0.038263
2012-11-08 -0.023020 -0.021426 -0.036815 -0.036290  0.330747  -0.036290
2012-11-09 -0.036049 -0.013073 -0.002933  0.017313 -0.119522   0.017313
2012-11-12  0.025406 -0.000685  0.009237 -0.007732 -0.445323  -0.007732
2012-11-13 -0.027502 -0.007250 -0.004251  0.000129  0.033244   0.000129
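
The same result can be reproduced without downloading anything. This sketch rebuilds the first three rows of the Open and Close columns by hand from the table above; aapl_sample and changes are hypothetical names used only here.

```python
import pandas as pd

# First three rows of the Open and Close columns from the table above.
aapl_sample = pd.DataFrame(
    {
        "Open": [598.22, 595.89, 583.52],
        "Close": [596.54, 576.80, 584.62],
    },
    index=pd.to_datetime(["2012-11-01", "2012-11-02", "2012-11-05"]),
)

# pct_change works column by column; each cell is
# (current row - previous row) / previous row.
changes = aapl_sample.pct_change()
print(changes.round(6))
```

The first row is all NaN, and the second row should agree with the Open and Close entries printed for 2012-11-02 above.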

Answered by rgalbo

The best way to calculate forward-looking returns without any chance of bias is to use the built-in method pd.DataFrame.pct_change(). In your case this method is all you need, since you have monthly data and you are looking for the monthly return.

If, for example, you wanted to look at the 6 month return, you would just set the periods parameter, df.pct_change(periods=6), and that will give you the 6 month percent return.
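
A minimal sketch of the periods parameter, using twelve made-up monthly prices (the values and the names prices and six_month are purely illustrative, not taken from the question's data):

```python
import pandas as pd

# Twelve hypothetical monthly prices (made-up values for illustration).
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 107.0, 110.0,
     108.0, 112.0, 115.0, 113.0, 118.0, 120.0],
    index=pd.period_range("2003-01", periods=12, freq="M"),
)

# Compare each month with the value six rows earlier.
# The first six entries are NaN because they have no such predecessor.
six_month = prices.pct_change(periods=6)
print(six_month)
```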

Because you have a relatively small data set, the easiest way is to resample to the frequency you need for the calculation and then use the pct_change() function again.
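
One way to read the resample-then-pct_change suggestion is sketched below. It assumes daily data being reduced to weekly, which is my choice for illustration since the question's data is already monthly. The closing prices are the AAPL Close values shown in another answer on this page.

```python
import pandas as pd

# Daily closes for nine trading days (taken from the AAPL table on this page).
close = pd.Series(
    [596.54, 576.80, 584.62, 582.85, 558.00, 537.75, 547.06, 542.83, 542.90],
    index=pd.to_datetime(
        ["2012-11-01", "2012-11-02", "2012-11-05", "2012-11-06", "2012-11-07",
         "2012-11-08", "2012-11-09", "2012-11-12", "2012-11-13"]
    ),
)

# Keep the last observed close in each calendar week (weeks end on Sunday),
# then compute week-over-week returns on the reduced series.
weekly = close.resample("W").last()
weekly_returns = weekly.pct_change()
print(weekly_returns)
```

Taking the last observation per bucket is only one possible aggregation; a mean or first value would fit other definitions of a periodic return.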

However, because of the nice properties of log, it is common to use the log-return formula for calculating returns (if you plan on computing statistics on the return series):

r_t = log(P_t / P_{t-1}), where P_t is the price in period t

Which you would implement as such:

import numpy as np

log_return = np.log(vfiax_monthly.open / vfiax_monthly.open.shift())
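
One of the "nice properties" is that log returns add up across periods. The sketch below checks this on the five open values from the question; total and simple are illustrative names.

```python
import numpy as np
import pandas as pd

opens = pd.Series([64.95, 63.98, 64.59, 69.93, 73.61])

# Per-period log return; the first entry is NaN (no previous value).
log_return = np.log(opens / opens.shift())

# Summing log returns (NaN is skipped) gives the log return over the
# whole span, i.e. log(last price / first price).
total = log_return.sum()
print(total, np.log(opens.iloc[-1] / opens.iloc[0]))

# Log returns convert back to simple returns with exp(r) - 1.
simple = np.exp(log_return) - 1
```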

Answered by salhin

You could also use a mix of the diff and shift methods of a pandas Series:

returns = vfiax_monthly.open.diff() / vfiax_monthly.open.shift(1)