Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/35365545/

Time: 2020-09-14 00:41:16  Source: igfitidea

Calculating cumulative returns with pandas dataframe

python pandas cumsum

Asked by David Hancock

I have this DataFrame:

     Poloniex_DOGE_BTC  Poloniex_XMR_BTC  Daily_rets  perc_ret
172           0.006085         -0.000839    0.003309         0
173           0.006229          0.002111    0.005135         0
174           0.000000         -0.001651    0.004203         0
175           0.000000          0.007743    0.005313         0
176           0.000000         -0.001013   -0.003466         0
177           0.000000         -0.000550    0.000772         0
178           0.000000         -0.009864    0.001764         0

I'm trying to keep a running total of Daily_rets in the perc_ret column.

However, my code just copies the values from Daily_rets:

df['perc_ret'] = (  df['Daily_rets'] + df['perc_ret'].shift(1) )


     Poloniex_DOGE_BTC  Poloniex_XMR_BTC  Daily_rets   perc_ret
172           0.006085         -0.000839    0.003309        NaN
173           0.006229          0.002111    0.005135   0.005135
174           0.000000         -0.001651    0.004203   0.004203
175           0.000000          0.007743    0.005313   0.005313
176           0.000000         -0.001013   -0.003466  -0.003466
177           0.000000         -0.000550    0.000772   0.000772
178           0.000000         -0.009864    0.001764   0.001764
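
A likely explanation for the output above: the right-hand side of the assignment is evaluated in one vectorized step against the existing perc_ret column, which is all zeros, so shift(1) contributes NaN for the first row and 0 for every other row. The sketch below reproduces that using only the Daily_rets values from the table (the frame construction and the variable names are assumptions for illustration) and shows cumsum as the plain running total:

```python
import pandas as pd

# Daily_rets values copied from the question; the other columns are omitted.
daily = [0.003309, 0.005135, 0.004203, 0.005313, -0.003466, 0.000772, 0.001764]
df = pd.DataFrame({"Daily_rets": daily, "perc_ret": 0.0}, index=range(172, 179))

# Vectorized: this reads the old, all-zero perc_ret column in a single pass,
# so it never sees values written for earlier rows.
shifted_sum = df["Daily_rets"] + df["perc_ret"].shift(1)

# A true running total is a cumulative sum.
running_total = df["Daily_rets"].cumsum()

print(shifted_sum)
print(running_total)
```

Note that a plain sum ignores compounding, which is the point the answers address.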

Accepted answer by jezrael

If performance is important, use numpy.cumprod:

np.cumprod(1 + df['Daily_rets'].values) - 1

Timings:

# 7k rows: repeat the 7-row frame 1000 times
df = pd.concat([df] * 1000, ignore_index=True)

In [191]: %timeit np.cumprod(1 + df['Daily_rets'].values) - 1
41 μs ± 282 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [192]: %timeit (1 + df.Daily_rets).cumprod() - 1
554 μs ± 3.63 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
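
As a quick sanity check that the two timed expressions compute the same thing, here is a minimal sketch using the Daily_rets values from the question (the Series construction and variable names are assumptions for illustration):

```python
import numpy as np
import pandas as pd

daily = pd.Series([0.003309, 0.005135, 0.004203, 0.005313,
                   -0.003466, 0.000772, 0.001764])

# NumPy version: operates on the raw ndarray and returns an ndarray.
via_numpy = np.cumprod(1 + daily.values) - 1

# pandas version: returns a Series that keeps the index.
via_pandas = (1 + daily).cumprod() - 1

print(np.allclose(via_numpy, via_pandas.values))
```

One difference to keep in mind: np.cumprod propagates a NaN to every later element, while Series.cumprod skips NaN values by default (skipna=True), so the two can diverge on data with missing returns.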

Answer by Alexander

If they are daily simple returns and you want a cumulative return, surely you must want a daily compounded number?

df['perc_ret'] = (1 + df.Daily_rets).cumprod() - 1  # Or df.Daily_rets.add(1).cumprod().sub(1)

>>> df
     Poloniex_DOGE_BTC  Poloniex_XMR_BTC  Daily_rets  perc_ret
172           0.006085         -0.000839    0.003309  0.003309
173           0.006229          0.002111    0.005135  0.008461
174           0.000000         -0.001651    0.004203  0.012700
175           0.000000          0.007743    0.005313  0.018080
176           0.000000         -0.001013   -0.003466  0.014551
177           0.000000         -0.000550    0.000772  0.015335
178           0.000000         -0.009864    0.001764  0.017126

If they are log returns, then you could just use cumsum.
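
To illustrate why cumsum is enough for log returns, here is a small sketch with made-up simple returns (the numbers and variable names are hypothetical): log returns add across time, and expm1 converts the summed log return back into a simple cumulative return.

```python
import numpy as np
import pandas as pd

simple = pd.Series([0.10, -0.05, 0.02])  # hypothetical simple returns
log_rets = np.log1p(simple)              # log return = ln(1 + simple return)

cum_log = log_rets.cumsum()              # cumulative log return
cum_simple = np.expm1(cum_log)           # back to a simple cumulative return

# Should match compounding the simple returns directly.
print(np.allclose(cum_simple, (1 + simple).cumprod() - 1))
```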

Answer by Dong Yi

You cannot simply add simple returns together with cumsum, because returns compound.

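For example, if you have the growth factors [1.1, 1.1] (two daily returns of 10%), the compounded result is 1.1 * 1.1 = 1.21, a 21% cumulative return, not the 20% you get by adding the two returns.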

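import numpy as np

# Daily simple return from closing prices:
df['daily_return'] = df['close'].pct_change()

# Cumulative growth factor (1 + cumulative return), compounded via summed log returns;
# subtract 1 to express it as a cumulative return.
df['cumulative_return'] = np.exp(np.log1p(df['daily_return']).cumsum())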
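
A minimal check of this approach, assuming a hypothetical close column with prices 100, 110, 121 (two consecutive 10% gains). The first row is NaN because pct_change has no previous price, and the resulting growth factor should match the ratio of each price to the first one:

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices: two consecutive 10% gains.
df = pd.DataFrame({"close": [100.0, 110.0, 121.0]})

df["daily_return"] = df["close"].pct_change()

# exp(cumsum(log1p(r))) is the compounded growth factor, i.e. 1 + cumulative return.
df["growth"] = np.exp(np.log1p(df["daily_return"]).cumsum())

# The growth factor should equal close / first close (first row stays NaN).
price_ratio = df["close"] / df["close"].iloc[0]
print(df)
```

Subtracting 1 from the growth column gives the cumulative return itself, here 21% after the second day.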