Original question: http://stackoverflow.com/questions/31287552/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Logarithmic returns in pandas dataframe
Asked by AmanArora
Python pandas has a pct_change function which I use to calculate the returns for stock prices in a dataframe:
ndf['Return']= ndf['TypicalPrice'].pct_change()
I am using the following code to get logarithmic returns, but it gives the exact same values as the pct_change() function:
ndf['retlog']=np.log(ndf['TypicalPrice'].astype('float64')/ndf['TypicalPrice'].astype('float64').shift(1))
#np is for numpy
Accepted answer by Jianxun Li
Here is one way to calculate log returns using .shift(). The result is similar to, but not the same as, the simple return calculated by pct_change(). Can you upload a copy of your sample data (e.g. a Dropbox share link) to reproduce the inconsistency you saw?
import pandas as pd
import numpy as np
np.random.seed(0)
df = pd.DataFrame(100 + np.random.randn(100).cumsum(), columns=['price'])
df['pct_change'] = df.price.pct_change()
df['log_ret'] = np.log(df.price) - np.log(df.price.shift(1))
Out[56]:
       price  pct_change  log_ret
0   101.7641         NaN      NaN
1   102.1642      0.0039   0.0039
2   103.1429      0.0096   0.0095
3   105.3838      0.0217   0.0215
4   107.2514      0.0177   0.0176
5   106.2741     -0.0091  -0.0092
6   107.2242      0.0089   0.0089
7   107.0729     -0.0014  -0.0014
..       ...         ...      ...
92  101.6160      0.0021   0.0021
93  102.5926      0.0096   0.0096
94  102.9490      0.0035   0.0035
95  103.6555      0.0069   0.0068
96  103.6660      0.0001   0.0001
97  105.4519      0.0172   0.0171
98  105.5788      0.0012   0.0012
99  105.9808      0.0038   0.0038
[100 rows x 3 columns]
Answer by Ami Tavory
The results might seem similar, but that is just because of the Taylor expansion of the logarithm. Since log(1 + x) ~ x for small x, the two measures are close whenever the returns are small.
However, the statement
"I am using the following code to get logarithmic returns, but it gives the exact same values as the pct_change() function."
is not quite correct.
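import numpy as np
import pandas as pd
# Start at 1: a price of 0 would make the first return infinite
df = pd.DataFrame({'p': range(1, 11)})
df['pct_change'] = df['p'].pct_change()
df['log_stuff'] = \
    np.log(df['p'].astype('float64')/df['p'].astype('float64').shift(1))
# The two curves visibly diverge when the returns are large
df[['pct_change', 'log_stuff']].plot();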
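To put numbers on the Taylor argument, here is a minimal sketch with made-up prices (the `prices` values and variable names are illustrative, not taken from the question). Because log(1 + x) = x - x^2/2 + ..., the gap between the two measures is roughly x^2/2: negligible for a 1% move, obvious for a 50% move.

```python
import numpy as np
import pandas as pd

# Hypothetical prices: a 1% move followed by a 50% move.
prices = pd.Series([100.0, 101.0, 151.5])

simple = prices.pct_change()                 # p_t / p_{t-1} - 1
log_ret = np.log(prices / prices.shift(1))   # log(p_t / p_{t-1})

# The gap grows roughly like x**2 / 2 as the return x gets larger.
gap = simple - log_ret
print(pd.DataFrame({'simple': simple, 'log': log_ret, 'gap': gap}))
```

With daily stock data the returns are usually small, which is likely why the two columns look identical when rounded to a few decimals.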
Answer by EpicAdv
Log returns are simply the natural log of 1 plus the arithmetic return. So how about this?
df['pct_change'] = df.price.pct_change()
# Use bracket access: df.pct_change is the DataFrame method, not the column
df['log_return'] = np.log(1 + df['pct_change'])
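As a quick check that log(1 + r) matches the log of the price ratio, here is a small sketch on made-up prices (the data and the `log_ratio` column are illustrative assumptions). It uses `np.log1p`, which computes log(1 + x), and bracket access for the column, because attribute access `df.pct_change` resolves to the DataFrame method of the same name rather than to the column.

```python
import numpy as np
import pandas as pd

# Illustrative prices, not from the question.
df = pd.DataFrame({'price': [100.0, 102.0, 99.0, 105.0]})

df['pct_change'] = df['price'].pct_change()
df['log_return'] = np.log1p(df['pct_change'])                  # log(1 + r)
df['log_ratio'] = np.log(df['price'] / df['price'].shift(1))   # log(p_t / p_{t-1})

# Both routes should agree up to floating-point error.
same = np.allclose(df['log_return'].dropna(), df['log_ratio'].dropna())
print(same)
```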
Answer by poulter7
Single line, and only calculating logs once. First convert to log-space, then take the 1-period diff.
np.diff(np.log(df.price))
In earlier versions of numpy:
np.log(df.price).diff()
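The two forms are not interchangeable when the result is assigned back to a dataframe column. A minimal sketch, assuming a small made-up `price` series with a date index: `np.diff` gives back a bare array that is one element shorter, while the pandas `Series.diff` method keeps the index and pads the first row with NaN.

```python
import numpy as np
import pandas as pd

# Illustrative series with a date index.
price = pd.Series([100.0, 102.0, 99.0],
                  index=pd.date_range('2020-01-01', periods=3))

as_array = np.diff(np.log(price))    # array of length n - 1, index is lost
as_series = np.log(price).diff()     # Series of length n, first value is NaN

print(len(as_array), len(as_series))
```

The values agree once the leading NaN is dropped, so the choice mostly comes down to whether the index alignment is needed.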
Answer by Robert
@poulter7: I cannot comment on the other answers, so I am posting this as a new answer. Be careful with
np.log(df.price).diff()
as this will fail for indices that can become negative, as well as for risk factors such as negative interest rates. In these cases
np.log(df.price/df.price.shift(1)).dropna()
is preferred and, in my experience, generally the safer approach. It also evaluates the logarithm only once.
Whether you use +1 or -1 depends on the ordering of your time series. Use -1 for descending and +1 for ascending dates - in both cases the shift provides the preceding date's value.
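Both points can be sketched on made-up data (the `rate`, `asc` and `desc` series are illustrative assumptions). With negative values, the log of each level is NaN, whereas the ratio of two same-signed values is positive and has a finite log. Note that the ratio form still yields NaN when consecutive values differ in sign. The second half shows that `shift(1)` on ascending dates and `shift(-1)` on descending dates pick up the same preceding value.

```python
import numpy as np
import pandas as pd

# Hypothetical negative-rate series.
rate = pd.Series([-0.50, -0.40, -0.25])

with np.errstate(invalid='ignore'):        # silence the log-of-negative warning
    diff_of_logs = np.log(rate).diff()     # log(negative) is NaN, so every row is NaN

log_of_ratio = np.log(rate / rate.shift(1)).dropna()   # ratios are positive here

# Shift direction: ascending dates use shift(1), descending dates use shift(-1).
asc = pd.Series([100.0, 110.0],
                index=pd.to_datetime(['2020-01-01', '2020-01-02']))
desc = asc.sort_index(ascending=False)

r_asc = np.log(asc / asc.shift(1)).dropna()
r_desc = np.log(desc / desc.shift(-1)).dropna()
print(r_asc.iloc[0], r_desc.iloc[0])       # same return, dated 2020-01-02
```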