pandas Python: calculating log returns of a time series

Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/31742545/

Time: 2020-09-13 23:42:22  Source: igfitidea

Tags: python, pandas

Asked by Jorko12

I have a time series like the one below, where the third column represents the closing value of an index.

DAX 20150728 11173.910156
DAX 20150727 11056.400391
DAX 20150724 11347.450195
DAX 20150723 11512.110352

How can I calculate the log returns of the index using pandas in Python?

Thank you very much!

Regards

Accepted answer by EdChum

If I understand log returns correctly then the following is what you want:

In [155]:

import io
import numpy as np
import pandas as pd

t="""DAX 20150728 11173.910156
DAX 20150727 11056.400391
DAX 20150724 11347.450195
DAX 20150723 11512.110352"""
# Whitespace-separated, no header row; column 1 holds the date
df = pd.read_csv(io.StringIO(t), header=None, sep=r'\s+', names=['exchange', 'date', 'close'], parse_dates=[1])
df
Out[155]:
  exchange       date         close
0      DAX 2015-07-28  11173.910156
1      DAX 2015-07-27  11056.400391
2      DAX 2015-07-24  11347.450195
3      DAX 2015-07-23  11512.110352
In [157]:

df['log return'] = np.log(df['close']) - np.log(df['close'].iloc[0])
df
Out[157]:
  exchange       date         close  log return
0      DAX 2015-07-28  11173.910156    0.000000
1      DAX 2015-07-27  11056.400391   -0.010572
2      DAX 2015-07-24  11347.450195    0.015411
3      DAX 2015-07-23  11512.110352    0.029818

EDIT

OK, if you want the log difference between consecutive rows, then you can do this succinctly using diff:

In [161]:
df['log return'] = np.log(df['close']).diff()
df

Out[161]:
  exchange       date         close  log return
0      DAX 2015-07-28  11173.910156         NaN
1      DAX 2015-07-27  11056.400391   -0.010572
2      DAX 2015-07-24  11347.450195    0.025984
3      DAX 2015-07-23  11512.110352    0.014406
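
Note that the rows above are newest-first, so diff() subtracts the following trading day's log price rather than the previous one. As a minimal sketch (not part of the original answer, and the names asc and since_start are just illustrative), one could sort oldest-first before calling diff(). The running sum of the resulting log returns then reproduces the "log price minus first log price" calculation from the first snippet:

```python
import numpy as np
import pandas as pd

# Same four observations as in the question, newest first.
df = pd.DataFrame({
    "date": pd.to_datetime(["2015-07-28", "2015-07-27", "2015-07-24", "2015-07-23"]),
    "close": [11173.910156, 11056.400391, 11347.450195, 11512.110352],
})

# Sort oldest-first so diff() subtracts the previous trading day's log price.
asc = df.sort_values("date").reset_index(drop=True)
asc["log return"] = np.log(asc["close"]).diff()

# Log returns are additive: their running sum is the log return since the first row.
asc["cumulative"] = asc["log return"].cumsum()
since_start = np.log(asc["close"]) - np.log(asc["close"].iloc[0])

print(asc)
print(np.allclose(asc["cumulative"].iloc[1:], since_start.iloc[1:]))
```

The first row has no previous observation, so its log return is NaN.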

Answer by Robert

Be careful with

np.log(df['close']).diff() 

as this will fail for indices that can become negative, as well as for risk factors such as negative interest rates (the logarithm of a negative number is undefined, so np.log returns NaN). In these cases

np.log(df['close']/df['close'].shift(1)).dropna()

is preferred and, in my experience, generally the safer approach. Whether you shift by +1 or -1 depends on the ordering of your time series: use -1 for descending dates and +1 for ascending dates. In both cases the shift provides the preceding date's value.
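
To make the difference concrete, here is a small sketch using a made-up series of negative rates (the values and the names rates and mixed are hypothetical, not taken from the question). It also shows a caveat worth keeping in mind: the ratio form only helps while consecutive values have the same sign, because a negative ratio has no real logarithm either.

```python
import numpy as np
import pandas as pd

# Hypothetical rates in percent, oldest first; all values are negative.
rates = pd.Series([-0.50, -0.40, -0.25])

# np.errstate only silences NumPy's "invalid value" warning for log of a negative.
with np.errstate(invalid="ignore"):
    via_diff = np.log(rates).diff()             # every entry is NaN
    via_ratio = np.log(rates / rates.shift(1))  # NaN, then log(0.8), log(0.625)

    # Caveat: if the sign flips between rows the ratio is negative,
    # so the ratio form yields NaN as well.
    mixed = pd.Series([-0.10, 0.20])
    sign_flip = np.log(mixed / mixed.shift(1))

print(via_diff.isna().all())
print(via_ratio.dropna().round(6).tolist())
print(sign_flip.isna().all())
```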

In this specific example you need to set the date column as the index first, otherwise the divide operation will fail:

df.set_index("date", inplace=True)

Answer by hvedrung

    import numpy as np
    df['log return'] = np.log(df[2]/df[2].shift(-1)) 

Here df is your DataFrame, sorted by date in descending order, with the close values in column 2.
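
As a quick check of the shift direction, the sketch below (an illustration only, using named columns date and close instead of the positional column 2) computes the log returns once on the newest-first data with shift(-1) and once on an oldest-first copy with shift(1), then compares them date by date:

```python
import numpy as np
import pandas as pd

# The question's data, newest first.
desc = pd.DataFrame({
    "date": pd.to_datetime(["2015-07-28", "2015-07-27", "2015-07-24", "2015-07-23"]),
    "close": [11173.910156, 11056.400391, 11347.450195, 11512.110352],
})

# Oldest-first copy: the previous trading day is the previous row, so shift(1).
asc = desc[["date", "close"]].sort_values("date").reset_index(drop=True)
asc["log return"] = np.log(asc["close"] / asc["close"].shift(1))

# Newest-first: the previous trading day is the next row, so shift(-1).
desc["log return"] = np.log(desc["close"] / desc["close"].shift(-1))

# Align both results on the date and compare (NaN marks the oldest date).
from_desc = desc.set_index("date")["log return"].sort_index()
from_asc = asc.set_index("date")["log return"]
print(np.allclose(from_desc.to_numpy(), from_asc.to_numpy(), equal_nan=True))
```

Either way, the oldest date has no preceding value and ends up as NaN.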
