Relative Strength Index in python pandas
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow.
Original question: http://stackoverflow.com/questions/20526414/
Asked by user3084006
I am new to pandas. What is the best way to calculate the relative strength part in the RSI indicator in pandas? So far I got the following:
import time
import pandas as pd
import numpy as np
from pandas_datareader import data as web  # pd.io.data moved to the pandas_datareader package

def Datapull(Stock):
    try:
        df = web.DataReader(Stock, 'yahoo', start='01/01/2010')
        print('Retrieved', Stock)
        time.sleep(5)
        return df
    except Exception as e:
        print('Main Loop', str(e))

def RSIfun(price, n=14):
    delta = price['Close'].diff()
    #-----------
    dUp=
    dDown=

    RolUp=dUp.rolling(n).mean()
    RolDown=dDown.rolling(n).mean().abs()

    RS = RolUp / RolDown
    rsi= 100.0 - (100.0 / (1.0 + RS))
    return rsi

Stock='AAPL'
df=Datapull(Stock)
RSIfun(df)
Am I doing it correctly so far? I am having trouble with the difference part of the equation, where you separate out the upward and downward moves.
Accepted answer by behzad.nouri
dUp= delta[delta > 0]
dDown= delta[delta < 0]
Also, you need something like:
RolUp = RolUp.reindex_like(delta, method='ffill')
RolDown = RolDown.reindex_like(delta, method='ffill')
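otherwise RS = RolUp / RolDown will not do what you desire: dUp and dDown keep only the up days and the down days respectively, so the two rolling means share no index labels and the division yields only NaN.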
Edit: this seems to be a more accurate way of calculating RS:
# dUp= delta[delta > 0]
# dDown= delta[delta < 0]
# dUp = dUp.reindex_like(delta, fill_value=0)
# dDown = dDown.reindex_like(delta, fill_value=0)
dUp, dDown = delta.copy(), delta.copy()
dUp[dUp < 0] = 0
dDown[dDown > 0] = 0
RolUp = dUp.rolling(n).mean()
RolDown = dDown.rolling(n).mean().abs()
RS = RolUp / RolDown
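Putting the edited version together, a minimal end-to-end sketch might look like the following. It assumes a recent pandas (Series.rolling and Series.clip); the function name rsi_sma and the toy prices are made up for illustration and are not part of the original answer.

```python
import pandas as pd

def rsi_sma(close, n=14):
    """RSI from simple moving averages of gains and losses."""
    delta = close.diff()
    gains = delta.clip(lower=0)       # down moves become 0
    losses = -delta.clip(upper=0)     # up moves become 0; losses are made positive
    rs = gains.rolling(n).mean() / losses.rolling(n).mean()
    return 100.0 - 100.0 / (1.0 + rs)

# Toy prices, chosen only so the numbers are easy to follow by hand.
prices = pd.Series([10.0, 11.0, 10.5, 11.5, 12.0, 11.0])
rsi = rsi_sma(prices, n=3)
print(rsi)
```

With these prices and n=3 the first three values are NaN, followed by 80, 75 and 60. A window with no losses gives RS = inf and therefore RSI = 100; a window with no movement at all gives NaN.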
Answer by Shura
def RSI(series):
    delta = series.diff()
    u = delta * 0
    d = u.copy()
    i_pos = delta > 0
    i_neg = delta < 0
    u[i_pos] = delta[i_pos]
    d[i_neg] = -delta[i_neg]  # store losses as positive values so that rs is positive
    rs = u.ewm(span=27).mean() / d.ewm(span=27).mean()
    return 100 - 100 / (1 + rs)
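A note on span=27: pandas maps span to a smoothing factor alpha = 2 / (span + 1), so span=27 gives alpha = 1/14, which is the factor of Wilder's 14-period smoothing. The same factor can be written as com=13 or alpha=1/14. The following check (toy numbers, chosen arbitrarily) illustrates the equivalence:

```python
import numpy as np
import pandas as pd

n = 14
x = pd.Series([0.0, 1.0, 0.5, 0.0, 2.0, 0.0, 0.3, 1.2])

by_span = x.ewm(span=2 * n - 1).mean()   # span=27 -> alpha = 2 / 28
by_com = x.ewm(com=n - 1).mean()         # com=13  -> alpha = 1 / 14
by_alpha = x.ewm(alpha=1.0 / n).mean()   # alpha given directly

same = np.allclose(by_span, by_com) and np.allclose(by_span, by_alpha)
print(same)
```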
Answer by Jev
You can use a rolling apply in combination with a subfunction to make a clean function like this:
def rsi(price, n=14):
    ''' rsi indicator '''
    gain = (price-price.shift(1)).fillna(0) # calculate price gain with previous day, first row nan is filled with 0

    def rsiCalc(p):
        # subfunction for calculating rsi for one lookback period
        avgGain = p[p>0].sum()/n
        avgLoss = -p[p<0].sum()/n
        rs = avgGain/avgLoss
        return 100 - 100/(1+rs)

    # run for all periods with a rolling apply (pd.rolling_apply in older pandas)
    return gain.rolling(n).apply(rsiCalc)
Answer by Bill
I too had this question and was working down the rolling_apply path that Jev took. However, when I tested my results, they didn't match up against the commercial stock charting programs I use, such as StockCharts.com or thinkorswim. So I did some digging and discovered that when Welles Wilder created the RSI, he used a smoothing technique now referred to as Wilder Smoothing. The commercial services above use Wilder Smoothing rather than a simple moving average to calculate the average gains and losses.
I'm new to Python (and Pandas), so I'm wondering if there's some brilliant way to refactor out the for loop below to make it faster. Maybe someone else can comment on that possibility.
I hope you find this useful.
def get_rsi_timeseries(prices, n=14):
    # RSI = 100 - (100 / (1 + RS))
    # where RS = (Wilder-smoothed n-period average of gains / Wilder-smoothed n-period average of -losses)
    # Note that losses above should be positive values
    # Wilder-smoothing = ((previous smoothed avg * (n-1)) + current value to average) / n
    # For the very first "previous smoothed avg" (aka the seed value), we start with a straight average.
    # Therefore, our first RSI value will be for the n+2nd period:
    #     0: first delta is nan
    #     1:
    #     ...
    #     n: lookback period for first Wilder smoothing seed value
    #     n+1: first RSI

    # First, calculate the gain or loss from one price to the next. The first value is nan so replace with 0.
    deltas = (prices-prices.shift(1)).fillna(0)

    # Calculate the straight average seed values.
    # The first delta is always zero, so we will use a slice of the first n deltas starting at 1,
    # and filter only deltas > 0 to get gains and deltas < 0 to get losses
    seed = deltas.iloc[1:n+1]
    avg_of_gains = seed[seed > 0].sum() / n
    avg_of_losses = -seed[seed < 0].sum() / n

    # Set up pd.Series container for RSI values
    rsi_series = pd.Series(0.0, deltas.index)

    # Now calculate RSI using the Wilder smoothing method, starting with n+1 delta.
    up = lambda x: x if x > 0 else 0
    down = lambda x: -x if x < 0 else 0
    i = n+1
    for d in deltas.iloc[n+1:]:
        avg_of_gains = ((avg_of_gains * (n-1)) + up(d)) / n
        avg_of_losses = ((avg_of_losses * (n-1)) + down(d)) / n
        if avg_of_losses != 0:
            rs = avg_of_gains / avg_of_losses
            rsi_series.iloc[i] = 100 - (100 / (1 + rs))
        else:
            rsi_series.iloc[i] = 100
        i += 1

    return rsi_series
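On the question of removing the for loop: the Wilder recursion avg = (prev * (n-1) + x) / n is an exponentially weighted mean with alpha = 1/n, so one possible vectorized sketch uses Series.ewm(alpha=1/n, adjust=False) after planting the straight-average seed as the first element. This is an illustration rather than a drop-in replacement: wilder_rsi is a made-up name, the result is NaN (not 0.0) before the seed, and it also reports the seed-based RSI at position n, one step earlier than the loop above. The sample closes are the StockCharts numbers quoted in a later answer.

```python
import pandas as pd

def wilder_rsi(prices, n=14):
    """Wilder-smoothed RSI without an explicit Python loop (sketch)."""
    deltas = prices.diff()
    gains = deltas.clip(lower=0)
    losses = -deltas.clip(upper=0)   # losses as positive values

    def wilder_average(x):
        # Keep positions n, n+1, ... and overwrite the first of them with the
        # straight average of positions 1..n (the seed value).
        seeded = x.iloc[n:].copy()
        seeded.iloc[0] = x.iloc[1:n + 1].mean()
        # adjust=False gives y[t] = (1 - alpha) * y[t-1] + alpha * x[t], y[0] = seed
        return seeded.ewm(alpha=1.0 / n, adjust=False).mean()

    rs = wilder_average(gains) / wilder_average(losses)
    rsi = 100.0 - 100.0 / (1.0 + rs)
    return rsi.reindex(prices.index)   # NaN for the first n positions

closes = pd.Series([44.34, 44.09, 44.15, 43.61, 44.33, 44.83, 45.10, 45.42,
                    45.84, 46.08, 45.89, 46.03, 45.61, 46.28, 46.28, 46.00,
                    46.03, 46.41, 46.22, 45.64])
rsi = wilder_rsi(closes, n=14)
print(rsi.dropna())
```

With n=14 this gives about 70.46 at position 14 and 66.25 at position 15, in line with the output listed in that later answer. When the smoothed loss is zero, RS becomes inf and the RSI evaluates to 100, which matches the else branch of the loop.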
Answer by Moot
It is important to note that there are various ways of defining the RSI. It is commonly defined in at least two ways: using a simple moving average (SMA) as above, or using an exponential moving average (EMA). Here's a code snippet that calculates both definitions of RSI and plots them for comparison. I'm discarding the first row after taking the difference, since it is always NaN by definition.
Note that when using EMA one has to be careful: since it includes a memory going back to the beginning of the data, the result depends on where you start! For this reason, typically people will add some data at the beginning, say 100 time steps, and then cut off the first 100 RSI values.
In the plot produced by the code below, one can see the difference between the RSI calculated using SMA and EMA: the SMA one tends to be more sensitive. Note that the RSI based on EMA has its first finite value at the first time step (which is the second time step of the original period, due to discarding the first row), whereas the RSI based on SMA has its first finite value at the 14th time step. This is because by default rolling(...).mean() only returns a finite value once there are enough values to fill the window.
import pandas
import pandas_datareader.data as web
import datetime
import matplotlib.pyplot as plt
# Window length for moving average
window_length = 14
# Dates
start = '2010-01-01'
end = '2013-01-27'
# Get data
data = web.DataReader('AAPL', 'yahoo', start, end)
# Get just the adjusted close
close = data['Adj Close']
# Get the difference in price from previous step
delta = close.diff()
# Get rid of the first row, which is NaN since it did not have a previous
# row to calculate the differences
delta = delta[1:]
# Make the positive gains (up) and negative gains (down) Series
up, down = delta.copy(), delta.copy()
up[up < 0] = 0
down[down > 0] = 0
# Calculate the EWMA
roll_up1 = up.ewm(span=window_length).mean()
roll_down1 = down.abs().ewm(span=window_length).mean()
# Calculate the RSI based on EWMA
RS1 = roll_up1 / roll_down1
RSI1 = 100.0 - (100.0 / (1.0 + RS1))
# Calculate the SMA
roll_up2 = up.rolling(window_length).mean()
roll_down2 = down.abs().rolling(window_length).mean()
# Calculate the RSI based on SMA
RS2 = roll_up2 / roll_down2
RSI2 = 100.0 - (100.0 / (1.0 + RS2))
# Compare graphically
plt.figure(figsize=(8, 6))
RSI1.plot()
RSI2.plot()
plt.legend(['RSI via EWMA', 'RSI via SMA'])
plt.show()
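The start-dependence described above can be seen on synthetic data. The sketch below is an illustration under assumptions: a random walk stands in for real prices, the helper rsi_ewm is a made-up name that mirrors the EWMA branch of the snippet above, and the seed and lengths are arbitrary.

```python
import numpy as np
import pandas as pd

def rsi_ewm(close, span=14):
    """EWMA-based RSI, following the same steps as the snippet above."""
    delta = close.diff().iloc[1:]          # drop the leading NaN
    up = delta.clip(lower=0)
    down = -delta.clip(upper=0)
    rs = up.ewm(span=span).mean() / down.ewm(span=span).mean()
    return 100.0 - 100.0 / (1.0 + rs)

rng = np.random.default_rng(0)
close = pd.Series(100.0 + rng.normal(0.0, 1.0, 400).cumsum())

full = rsi_ewm(close)              # starts at the beginning of the data
late = rsi_ewm(close.iloc[100:])   # same prices, but starting 100 steps later

# Absolute difference on the dates both series cover
gap = (full - late).abs().dropna()
early_gap = gap.iloc[0]            # first step of the later start
late_gap = gap.iloc[100:].max()    # more than 100 steps after the later start

print(early_gap, late_gap)
```

The two series disagree strongly right after the later start and become practically identical once roughly 100 steps have passed, which is why a warm-up stretch is added and then cut off.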
Answer by AAV
My answer is tested on StockCharts sample data.
StockCharts RSI info: http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:relative_strength_index_rsi
import numpy as np
import pandas as pd

def RSI(series, period):
    delta = series.diff().dropna()
    u = delta * 0
    d = u.copy()
    u[delta > 0] = delta[delta > 0]
    d[delta < 0] = -delta[delta < 0]
    u[u.index[period-1]] = np.mean( u.iloc[:period] ) # first value is the average of the gains
    u = u.drop(u.index[:(period-1)])
    d[d.index[period-1]] = np.mean( d.iloc[:period] ) # first value is the average of the losses
    d = d.drop(d.index[:(period-1)])
    rs = u.ewm(com=period-1, adjust=False).mean() / \
         d.ewm(com=period-1, adjust=False).mean()
    return 100 - 100 / (1 + rs)

#sample data from StockCharts
data = pd.Series( [ 44.34, 44.09, 44.15, 43.61,
                    44.33, 44.83, 45.10, 45.42,
                    45.84, 46.08, 45.89, 46.03,
                    45.61, 46.28, 46.28, 46.00,
                    46.03, 46.41, 46.22, 45.64 ] )
print( RSI( data, 14 ) )
#output
14 70.464135
15 66.249619
16 66.480942
17 69.346853
18 66.294713
19 57.915021
Answer by pythonguy
# Relative Strength Index
# 100 * Avg(PriceUp) / (Avg(PriceUp) + Avg(PriceDown))
# Where: PriceUp(t)=1*(Price(t)-Price(t-1)){Price(t)- Price(t-1)>0};
#        PriceDown(t)=-1*(Price(t)-Price(t-1)){Price(t)- Price(t-1)<0};
# Change the formula for your own requirement
def rsi(values):
    up = values[values>0].mean()
    down = -1*values[values<0].mean()
    return 100 * up / (up + down)
stock['RSI_6D'] = stock['Momentum_1D'].rolling(center=False,window=6).apply(rsi)
stock['RSI_12D'] = stock['Momentum_1D'].rolling(center=False,window=12).apply(rsi)
Momentum_1D = P(t) - P(t-1), where P is the closing price and t is the date.
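Two details of the function above are easy to miss. Its formula, 100 * up / (up + down), is algebraically the same as 100 - 100 / (1 + RS) with RS = up / down. However, values[values>0].mean() averages over the up days only, whereas the other answers divide the summed gains by the window length, and that changes the result whenever the numbers of up and down days differ. A small check, with hypothetical helper names and a made-up window of price changes:

```python
import numpy as np

def rsi_mean_of_moves(values):
    """Variant from the answer above: average over the up (or down) days only."""
    up = values[values > 0].mean()
    down = -values[values < 0].mean()
    return 100 * up / (up + down)

def rsi_sum_over_n(values):
    """Variant used in the other answers: divide total gains and losses by n."""
    n = len(values)
    avg_gain = values[values > 0].sum() / n
    avg_loss = -values[values < 0].sum() / n
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

# Three up days of +1 and one down day of -2 (made-up numbers)
window = np.array([1.0, 1.0, 1.0, -2.0])

a = rsi_mean_of_moves(window)   # up = 1, down = 2
b = rsi_sum_over_n(window)      # avg_gain = 0.75, avg_loss = 0.5, RS = 1.5
print(a, b)
```

Here the first variant gives about 33.3 and the second gives 60, so the two are not interchangeable.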
Answer by Mott The Tuple
You can get a massive speed-up of Bill's answer by using numba: 100 loops over a 20k-row series took 113 seconds with the regular version and 0.28 seconds with numba. Numba excels at loops and arithmetic.
import numpy as np
import numba as nb

@nb.jit(fastmath=True, nopython=True)
def calc_rsi( array, deltas, avg_gain, avg_loss, n ):

    # Use Wilder smoothing method
    up = lambda x: x if x > 0 else 0
    down = lambda x: -x if x < 0 else 0
    i = n+1
    for d in deltas[n+1:]:
        avg_gain = ((avg_gain * (n-1)) + up(d)) / n
        avg_loss = ((avg_loss * (n-1)) + down(d)) / n
        if avg_loss != 0:
            rs = avg_gain / avg_loss
            array[i] = 100 - (100 / (1 + rs))
        else:
            array[i] = 100
        i += 1
    return array

def get_rsi( array, n = 14 ):

    deltas = np.append([0],np.diff(array))

    avg_gain = np.sum(deltas[1:n+1].clip(min=0)) / n
    avg_loss = -np.sum(deltas[1:n+1].clip(max=0)) / n

    array = np.empty(deltas.shape[0])
    array.fill(np.nan)

    array = calc_rsi( array, deltas, avg_gain, avg_loss, n )
    return array

# prices is a placeholder name: pass a NumPy array or a pandas Series of prices
rsi = get_rsi( prices, 14 )
Answer by Ritik Gupta
import pandas as pd

def rsi_indicator(close, n):
    # close is expected to be a Series of closing prices named "Close"
    rsi_series = pd.DataFrame(close)

    # Change = Close[i] - Close[i-1]
    rsi_series["Change"] = (rsi_series["Close"] - rsi_series["Close"].shift(1)).fillna(0)

    # Upward Movement
    rsi_series["Upward Movement"] = rsi_series["Change"].clip(lower=0)

    # Downward Movement
    rsi_series["Downward Movement"] = (-rsi_series["Change"]).clip(lower=0)

    # Average Upward Movement
    # The first value is the mean of the first n upward movements.
    avg_up = [0.0] * len(rsi_series)
    avg_up[n] = rsi_series["Upward Movement"].iloc[1:n+1].mean()
    # From the second value onwards
    for i in range(n+1, len(rsi_series)):
        avg_up[i] = (avg_up[i-1]*(n-1) + rsi_series["Upward Movement"].iloc[i])/n
    rsi_series["Average Upward Movement"] = avg_up

    # Average Downward Movement
    # The first value is the mean of the first n downward movements.
    avg_down = [0.0] * len(rsi_series)
    avg_down[n] = rsi_series["Downward Movement"].iloc[1:n+1].mean()
    # From the second value onwards
    for i in range(n+1, len(rsi_series)):
        avg_down[i] = (avg_down[i-1]*(n-1) + rsi_series["Downward Movement"].iloc[i])/n
    rsi_series["Average Downward Movement"] = avg_down

    # Relative Strength
    rsi_series["Relative Strength"] = (rsi_series["Average Upward Movement"]/rsi_series["Average Downward Movement"]).fillna(0)

    # RSI
    rsi_series["RSI"] = 100 - 100/(rsi_series["Relative Strength"]+1)
    return rsi_series.round(2)
Answer by Christoph
You can also use the following. The first average gain and loss are calculated differently (as a plain mean) from the rest of the values, so that the first RSI value comes out properly. In the end, all NaN values are replaced with blanks.
This assumes you have already imported pandas and your dataframe is df. The only additional data required is a column of closing prices labeled Close. You can reference this column as df.Close; however, a column header sometimes consists of several words separated by spaces, which requires the df['word1 word2'] format. As a consistent practice I always use the df['Close'] format.
import numpy as np

# Calculate change in closing prices day over day
df['Delta'] = df['Close'].diff(periods=1)

# Calculate if difference in close is Gain
conditions = [df['Delta'] <= 0, df['Delta'] > 0]
choices = [0, df['Delta']]
df['ClGain'] = np.select(conditions, choices)

# Calculate if difference in close is Loss
conditions = [df['Delta'] >= 0, df['Delta'] < 0]
choices = [0, -df['Delta']]
df['ClLoss'] = np.select(conditions, choices)

# Determine periods to calculate RSI over
rsi_n = 9

# Calculate Avg Gain and Avg Loss over n periods.
# Row rsi_n holds the plain mean of the first rsi_n gains (losses);
# every later row smooths the previous row's average, so a loop is needed.
avg_gain = np.full(len(df), np.nan)
avg_loss = np.full(len(df), np.nan)
avg_gain[rsi_n] = df['ClGain'].iloc[1:rsi_n + 1].mean()
avg_loss[rsi_n] = df['ClLoss'].iloc[1:rsi_n + 1].mean()
for i in range(rsi_n + 1, len(df)):
    avg_gain[i] = (avg_gain[i - 1] * (rsi_n - 1) + df['ClGain'].iloc[i]) / rsi_n
    avg_loss[i] = (avg_loss[i - 1] * (rsi_n - 1) + df['ClLoss'].iloc[i]) / rsi_n
df['AvgGain'] = avg_gain
df['AvgLoss'] = avg_loss

# Calculate RSI
df['RSI'] = 100-(100 / (1 + (df['AvgGain'] / df['AvgLoss'])))

# Replace NaN cells with blanks
df = df.fillna("")

# (OPTIONAL) Remove columns used to create RSI
del df['Delta']
del df['ClGain']
del df['ClLoss']
del df['AvgGain']
del df['AvgLoss']


