Python: calculating log returns of a time series
Original question: http://stackoverflow.com/questions/31742545/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Asked by Jorko12
I have a time series like the one below, where the third column is the close value of an index.
DAX 20150728 11173.910156
DAX 20150727 11056.400391
DAX 20150724 11347.450195
DAX 20150723 11512.110352
How can I calculate the log returns of the index using pandas in Python?
Thank you very much!
Regards
Accepted answer by EdChum
If I understand log returns correctly, then the following is what you want:
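In [155]:
import io
import numpy as np
import pandas as pd
t="""DAX 20150728 11173.910156
DAX 20150727 11056.400391
DAX 20150724 11347.450195
DAX 20150723 11512.110352"""
# Whitespace-separated columns with no header row; parse column 1 as dates
df = pd.read_csv(io.StringIO(t), header=None, sep=r'\s+', names=['exchange', 'date', 'close'], parse_dates=[1])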
df
Out[155]:
  exchange       date         close
0      DAX 2015-07-28  11173.910156
1      DAX 2015-07-27  11056.400391
2      DAX 2015-07-24  11347.450195
3      DAX 2015-07-23  11512.110352
In [157]:
df['log return'] = np.log(df['close']) - np.log(df['close'].iloc[0])
df
Out[157]:
  exchange       date         close  log return
0      DAX 2015-07-28  11173.910156    0.000000
1      DAX 2015-07-27  11056.400391   -0.010572
2      DAX 2015-07-24  11347.450195    0.015411
3      DAX 2015-07-23  11512.110352    0.029818
EDIT
OK, if you want the difference between consecutive log values instead, you can do this succinctly using diff:
In [161]:
df['log return'] = np.log(df['close']).diff()
df
Out[161]:
  exchange       date         close  log return
0      DAX 2015-07-28  11173.910156         NaN
1      DAX 2015-07-27  11056.400391   -0.010572
2      DAX 2015-07-24  11347.450195    0.025984
3      DAX 2015-07-23  11512.110352    0.014406
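One thing to keep in mind with the output above: the rows are sorted by date in descending order, so diff() subtracts the later date's log close from the earlier date's. A minimal sketch, assuming you want the conventional sign (each date's close relative to the previous trading day), is to sort the frame oldest-first before calling diff(). The names raw and asc are illustrative and not part of the answer.

```python
import io

import numpy as np
import pandas as pd

raw = """DAX 20150728 11173.910156
DAX 20150727 11056.400391
DAX 20150724 11347.450195
DAX 20150723 11512.110352"""

df = pd.read_csv(io.StringIO(raw), header=None, sep=r"\s+",
                 names=["exchange", "date", "close"], parse_dates=[1])

# Sort oldest-first so that diff() compares each row with the previous date.
asc = df.sort_values("date").reset_index(drop=True)
asc["log return"] = np.log(asc["close"]).diff()
print(asc)
```

With this ordering the value for 2015-07-28 has the same magnitude as in the descending table (about 0.010572) but the opposite sign, and the NaN moves to the oldest date.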
Answer by Robert
Be careful with
np.log(df['close']).diff()
as this will fail for indices that can become negative, as well as for risk factors such as negative interest rates: the log of a negative number is undefined, so the result is NaN. In these cases
np.log(df['close']/df['close'].shift(1)).dropna()
is preferred and, in my experience, generally the safer approach. Whether you shift by +1 or -1 depends on the ordering of your time series: use -1 for descending dates and +1 for ascending dates. In both cases the shift provides the preceding date's value.
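To make the point about negative values concrete, here is a small sketch with made-up interest-rate values (the numbers are hypothetical, and the names rates, via_diff and via_ratio are only for illustration). Taking the log of each negative value gives NaN, so the diff approach produces nothing usable, while the ratio of two negative values is positive and has a well-defined log. Note that the ratio approach would still give NaN wherever consecutive values have opposite signs.

```python
import numpy as np
import pandas as pd

# Hypothetical negative interest rates, oldest first (illustrative values only).
rates = pd.Series([-0.50, -0.40, -0.20])

with np.errstate(invalid="ignore"):  # silence NumPy's "invalid value" warning
    # Log of each negative value is NaN, so every difference is NaN too.
    via_diff = np.log(rates).diff()
    # Ratio of two negative values is positive, so its log is defined.
    via_ratio = np.log(rates / rates.shift(1))

print(via_diff)
print(via_ratio.dropna())
```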
In this specific example you need to set the date column as the index first, otherwise the divide operation will fail:
df.set_index("date", inplace=True)
Answer by hvedrung
import numpy as np
# Column 2 holds the close values; shift(-1) picks up the next row, i.e. the preceding date
df['log return'] = np.log(df[2]/df[2].shift(-1))
Here df is your DataFrame, sorted by date in descending order.
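As a quick check of the shift direction, the sketch below computes log returns on the question's data in both orderings: shift(-1) on descending dates and shift(1) on ascending dates. It assumes a named close column and a date index rather than the positional label 2, and the variable names are illustrative. Both versions pair each date with the preceding date's close, so the returns should agree once aligned by date.

```python
import numpy as np
import pandas as pd

desc = pd.DataFrame({
    "date": pd.to_datetime(["2015-07-28", "2015-07-27", "2015-07-24", "2015-07-23"]),
    "close": [11173.910156, 11056.400391, 11347.450195, 11512.110352],
}).set_index("date")

# Descending dates: the preceding trading day is the NEXT row, so shift(-1).
ret_desc = np.log(desc["close"] / desc["close"].shift(-1))

# Ascending dates: the preceding trading day is the PREVIOUS row, so shift(1).
asc = desc.sort_index()
ret_asc = np.log(asc["close"] / asc["close"].shift(1))

# Building a frame from both Series aligns them on the date index.
print(pd.DataFrame({"descending": ret_desc, "ascending": ret_asc}))
```

In both cases the oldest date (2015-07-23) has no preceding value, so its return is NaN.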

