Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/31287552/

Logarithmic returns in pandas dataframe

Tags: python, pandas

Asked by AmanArora

Python pandas has a pct_change function which I use to calculate the returns for stock prices in a dataframe:

ndf['Return']= ndf['TypicalPrice'].pct_change()

I am using the following code to get logarithmic returns, but it gives the exact same values as the pct_change() function:

ndf['retlog']=np.log(ndf['TypicalPrice'].astype('float64')/ndf['TypicalPrice'].astype('float64').shift(1))
#np is for numpy

Accepted answer by Jianxun Li

Here is one way to calculate the log return using .shift(). The result is similar to, but not the same as, the simple return calculated by pct_change(). Can you upload a copy of your sample data (dropbox share link) to reproduce the inconsistency you saw?

import pandas as pd
import numpy as np

np.random.seed(0)
df = pd.DataFrame(100 + np.random.randn(100).cumsum(), columns=['price'])
df['pct_change'] = df.price.pct_change()
df['log_ret'] = np.log(df.price) - np.log(df.price.shift(1))

Out[56]: 
       price  pct_change  log_ret
0   101.7641         NaN      NaN
1   102.1642      0.0039   0.0039
2   103.1429      0.0096   0.0095
3   105.3838      0.0217   0.0215
4   107.2514      0.0177   0.0176
5   106.2741     -0.0091  -0.0092
6   107.2242      0.0089   0.0089
7   107.0729     -0.0014  -0.0014
..       ...         ...      ...
92  101.6160      0.0021   0.0021
93  102.5926      0.0096   0.0096
94  102.9490      0.0035   0.0035
95  103.6555      0.0069   0.0068
96  103.6660      0.0001   0.0001
97  105.4519      0.0172   0.0171
98  105.5788      0.0012   0.0012
99  105.9808      0.0038   0.0038

[100 rows x 3 columns]

Answer by Ami Tavory

The results might seem similar, but that is just because of the Taylor expansion of the logarithm. Since log(1 + x) ~ x for small x, the results can be similar when the returns are small.

However, the claim

"I am using the following code to get logarithmic returns, but it gives the exact same values as the pct_change() function."

is not quite correct.

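import numpy as np
import pandas as pd

df = pd.DataFrame({'p': range(10)})

# Simple returns of the single price column.
df['pct_change'] = df['p'].pct_change()
# Log returns from the ratio of consecutive prices.
df['log_stuff'] = \
    np.log(df['p'].astype('float64')/df['p'].astype('float64').shift(1))
df[['pct_change', 'log_stuff']].plot()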
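
The Taylor-expansion argument can also be checked numerically. The snippet below is only a sketch with made-up return values (the array `simple` is an assumption for illustration, not data from the question). It shows that the gap between a simple return r and log(1 + r) is tiny for small r and grows quickly as r gets larger.

```python
import numpy as np

# Hypothetical simple returns, from very small to very large.
simple = np.array([0.001, 0.01, 0.05, 0.10, 0.50, 1.00])

# Log return implied by each simple return: log(1 + r).
log_ret = np.log1p(simple)

# Difference between the two measures; the leading Taylor term is r**2 / 2.
gap = simple - log_ret

for r, l, g in zip(simple, log_ret, gap):
    print(f"simple={r:6.3f}  log={l:7.4f}  gap={g:.6f}")
```

For daily stock returns the gap is usually far below what four printed decimals can show, which is one plausible reason the two columns can look identical.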

Answer by EpicAdv

Log returns are simply the natural log of 1 plus the arithmetic return. So how about this?

df['pct_change'] = df.price.pct_change()
# Use bracket access: df.pct_change is the DataFrame method, not the column.
df['log_return'] = np.log(1 + df['pct_change'])
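
As a sanity check, here is a small sketch (the price values are made up for illustration) showing that log(1 + simple return), the log of the price ratio, and the difference of log prices all agree with each other, while none of them equals the simple return itself. `np.log1p` is used as a convenient way to compute log(1 + x).

```python
import numpy as np
import pandas as pd

# Hypothetical prices, purely for illustration.
prices = pd.Series([100.0, 102.0, 101.0, 105.0], name="price")

simple = prices.pct_change()                    # simple returns
via_log1p = np.log1p(simple)                    # log(1 + simple return)
via_ratio = np.log(prices / prices.shift(1))    # log of the price ratio
via_diff = np.log(prices).diff()                # difference of log prices

print(pd.DataFrame({
    "simple": simple,
    "log1p": via_log1p,
    "ratio": via_ratio,
    "diff": via_diff,
}))
```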

Answer by poulter7

Single line, and only calculating logs once. First convert to log-space, then take the 1-period diff.

    np.diff(np.log(df.price))

In earlier versions of numpy:

    np.log(df.price).diff()
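
One practical difference between the two forms is worth a quick sketch (the dates and prices below are assumptions for illustration). `np.diff` returns a plain array that is one element shorter than the input, whereas the pandas `.diff()` keeps the index and puts NaN in the first row, which makes it easier to assign back as a DataFrame column.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices with a date index.
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0],
    index=pd.date_range("2020-01-01", periods=4),
)

as_array = np.diff(np.log(prices))    # NumPy result: 3 values, no index
as_series = np.log(prices).diff()     # pandas result: 4 values, first is NaN

print(len(as_array), len(as_series))
print(as_series)
```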

Answer by Robert

@poulter7: I cannot comment on the other answers, so I am posting this as a new answer. Be careful with

np.log(df.price).diff() 

as this will fail for series that can become negative, such as some indices and risk factors (e.g. negative interest rates), because the logarithm of a negative number is undefined. In these cases

np.log(df.price/df.price.shift(1)).dropna()

is preferred and, in my experience, generally the safer approach. It also evaluates the logarithm only once.
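
To illustrate the point, here is a minimal sketch with made-up negative rates (the series `rates` and `mixed` are assumptions, not data from the question). Taking the log first produces only NaN, while the ratio of two negative values is positive and so has a finite log. Note that a sign change between consecutive values still gives NaN with either form.

```python
import numpy as np
import pandas as pd

# Hypothetical negative interest rates.
rates = pd.Series([-0.50, -0.40, -0.45, -0.30])

# Log of a negative number is NaN; silence NumPy's warning for the demo.
with np.errstate(invalid="ignore"):
    diff_of_logs = np.log(rates).diff()

# Ratio of two negative values is positive, so the log is defined.
log_of_ratio = np.log(rates / rates.shift(1)).dropna()

# A sign change makes the ratio negative, so the result is NaN again.
mixed = pd.Series([0.10, -0.10])
with np.errstate(invalid="ignore"):
    mixed_ret = np.log(mixed / mixed.shift(1))

print(diff_of_logs)
print(log_of_ratio)
print(mixed_ret)
```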

Whether you use shift(1) or shift(-1) depends on the ordering of your time series. Use -1 for descending and +1 for ascending dates - in both cases the shift provides the preceding date's value.
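
A short sketch of the shift direction, assuming a made-up price series indexed by date: with ascending dates `shift(1)` lines each row up with the previous day, and with descending dates `shift(-1)` does the same, so both give identical log returns per date.

```python
import numpy as np
import pandas as pd

# Hypothetical prices in ascending date order.
dates = pd.date_range("2020-01-01", periods=4)
asc = pd.Series([100.0, 102.0, 101.0, 105.0], index=dates)

# The same data in descending date order (newest first).
desc = asc.sort_index(ascending=False)

# Ascending: the previous day is one row above, so shift by +1.
ret_asc = np.log(asc / asc.shift(1)).dropna()

# Descending: the previous day is one row below, so shift by -1.
ret_desc = np.log(desc / desc.shift(-1)).dropna()

print(ret_asc)
print(ret_desc.sort_index())
```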
