pandas: calculating cumulative returns with a pandas dataframe
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/35365545/
Calculating cumulative returns with pandas dataframe
Asked by David Hancock
I have this dataframe:
     Poloniex_DOGE_BTC  Poloniex_XMR_BTC  Daily_rets  perc_ret
172           0.006085         -0.000839    0.003309         0
173           0.006229          0.002111    0.005135         0
174           0.000000         -0.001651    0.004203         0
175           0.000000          0.007743    0.005313         0
176           0.000000         -0.001013   -0.003466         0
177           0.000000         -0.000550    0.000772         0
178           0.000000         -0.009864    0.001764         0
I'm trying to build a running total of Daily_rets in perc_ret; however, my code just copies the values from Daily_rets:
df['perc_ret'] = ( df['Daily_rets'] + df['perc_ret'].shift(1) )
     Poloniex_DOGE_BTC  Poloniex_XMR_BTC  Daily_rets   perc_ret
172           0.006085         -0.000839    0.003309        NaN
173           0.006229          0.002111    0.005135   0.005135
174           0.000000         -0.001651    0.004203   0.004203
175           0.000000          0.007743    0.005313   0.005313
176           0.000000         -0.001013   -0.003466  -0.003466
177           0.000000         -0.000550    0.000772   0.000772
178           0.000000         -0.009864    0.001764   0.001764
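The assignment in the question is evaluated once over whole columns, not row by row. `df['perc_ret'].shift(1)` is built from the existing column of zeros, so every row ends up as `Daily_rets + 0`, and the first row becomes NaN because the shift leaves it without a predecessor. A minimal sketch, assuming only the Daily_rets and perc_ret columns from the table above, reproduces this and contrasts it with a plain running total via `cumsum` (simple addition, not compounding):

```python
import pandas as pd

# Rebuild only the two relevant columns from the question's table.
df = pd.DataFrame(
    {
        "Daily_rets": [0.003309, 0.005135, 0.004203, 0.005313,
                       -0.003466, 0.000772, 0.001764],
        "perc_ret": [0.0] * 7,
    },
    index=range(172, 179),
)

# The question's attempt: shift(1) operates on the column of zeros,
# so nothing is carried forward from one row to the next.
attempt = df["Daily_rets"] + df["perc_ret"].shift(1)

# A genuine running total (arithmetic sum of the daily returns).
running_total = df["Daily_rets"].cumsum()

print(attempt)
print(running_total)
```

Whether a sum or a compounded product is the right aggregate depends on whether the returns are log returns or simple returns, which the answers below discuss.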
Accepted answer by jezrael
If performance is important, use numpy.cumprod:
np.cumprod(1 + df['Daily_rets'].values) - 1
Timings:
#7k rows
df = pd.concat([df] * 1000, ignore_index=True)
In [191]: %timeit np.cumprod(1 + df['Daily_rets'].values) - 1
41 μs ± 282 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [192]: %timeit (1 + df.Daily_rets).cumprod() - 1
554 μs ± 3.63 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
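As a quick sanity check on the two timed expressions, the sketch below (assuming the seven Daily_rets values from the question) confirms that they produce the same numbers. The practical difference is the return type: the NumPy version gives a bare array without the index, while the pandas version gives a Series that keeps it.

```python
import numpy as np
import pandas as pd

daily = pd.Series(
    [0.003309, 0.005135, 0.004203, 0.005313, -0.003466, 0.000772, 0.001764],
    index=range(172, 179),
    name="Daily_rets",
)

# NumPy route: works on the raw values, returns an ndarray (no index).
np_result = np.cumprod(1 + daily.values) - 1

# pandas route: returns a Series aligned to the original index.
pd_result = (1 + daily).cumprod() - 1

same = np.allclose(np_result, pd_result.values)
print(same, np_result[-1])
```

One caveat worth keeping in mind: if the column contains NaN, `np.cumprod` propagates it to every later element, whereas `Series.cumprod` skips NaN by default, so the two are only interchangeable on data without missing values.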
Answer by Alexander
If they are daily simple returns and you want a cumulative return, surely you must want a daily compounded number?
df['perc_ret'] = (1 + df.Daily_rets).cumprod() - 1 # Or df.Daily_rets.add(1).cumprod().sub(1)
>>> df
     Poloniex_DOGE_BTC  Poloniex_XMR_BTC  Daily_rets  perc_ret
172           0.006085         -0.000839    0.003309  0.003309
173           0.006229          0.002111    0.005135  0.008461
174           0.000000         -0.001651    0.004203  0.012700
175           0.000000          0.007743    0.005313  0.018080
176           0.000000         -0.001013   -0.003466  0.014551
177           0.000000         -0.000550    0.000772  0.015335
178           0.000000         -0.009864    0.001764  0.017126
If they are log returns, then you could just use cumsum.
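To illustrate why `cumsum` is enough for log returns, here is a small sketch. It assumes the Daily_rets values are simple returns and converts them to log returns first (the variable names are illustrative only). Summing the log returns and converting back gives the same cumulative figure as compounding the simple returns.

```python
import numpy as np
import pandas as pd

simple = pd.Series(
    [0.003309, 0.005135, 0.004203, 0.005313, -0.003466, 0.000772, 0.001764]
)

# Log return = ln(1 + simple return).
log_rets = np.log1p(simple)

# Log returns are additive over time, so a running sum is the cumulative log return.
cum_log = log_rets.cumsum()

# Convert back to a cumulative simple return: exp(x) - 1.
cum_simple = np.expm1(cum_log)

# Reference: compound the simple returns directly.
compounded = (1 + simple).cumprod() - 1
print(np.allclose(cum_simple, compounded))
```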
Answer by Dong Yi
You cannot simply add simple returns together with cumsum, because they compound.
For example, two consecutive growth factors of [1.1, 1.1] compound to 1.1 × 1.1 = 1.21, whereas adding the two 10% returns gives only 0.2 (a factor of 1.2).
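import numpy as np
# daily return (the first row is NaN because it has no previous close):
df['daily_return'] = df['close'].pct_change()
# calculate the cumulative growth factor (1 + cumulative return) by summing log returns;
# subtract 1 to get the cumulative return itself
df['cumulative_return'] = np.exp(np.log1p(df['daily_return']).cumsum())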
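A small end-to-end sketch of this approach, using made-up close prices of 100, 110 and 121 (hypothetical data, chosen so that each day gains 10%). It shows that the expression yields a growth factor, not a return, that the first row stays NaN, and that the result matches dividing each close by the first close.

```python
import numpy as np
import pandas as pd

# Hypothetical prices: two consecutive +10% days.
df = pd.DataFrame({"close": [100.0, 110.0, 121.0]})

# Daily simple return; the first row has no previous close, so it is NaN.
df["daily_return"] = df["close"].pct_change()

# Sum of log returns, exponentiated: the cumulative growth factor.
growth = np.exp(np.log1p(df["daily_return"]).cumsum())

# Subtract 1 to express it as a cumulative return.
cumulative_return = growth - 1

# Cross-check against the price ratio relative to the first close.
ratio = df["close"] / df["close"].iloc[0]
print(growth.tolist(), ratio.tolist())
```

If a value is wanted for the first row as well, one option is to fill the NaN daily return with 0 before compounding; whether that is appropriate depends on how the first observation should be treated.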