Python and Pandas - Moving Average Crossover
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/28345261/
Asked by chilliq
There is a Pandas DataFrame object with some stock data. SMA_45 and SMA_15 are simple moving averages calculated from the previous 45 and 15 days.
Date Price SMA_45 SMA_15
20150127 102.75 113 106
20150128 103.05 100 106
20150129 105.10 112 105
20150130 105.35 111 105
20150202 107.15 111 105
20150203 111.95 110 105
20150204 111.90 110 106
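For readers who want to reproduce columns like these, here is a minimal sketch of how such SMA columns could be computed with pandas. The helper name `add_sma` is hypothetical, the prices are the first rows of the table above, and the window lengths are shortened so the small sample yields values. It assumes the window includes the current row; apply `.shift(1)` to the result if the average should cover only earlier rows.

```python
import pandas as pd

# Prices copied from the first rows of the table above.
prices = pd.DataFrame({
    "Date": [20150127, 20150128, 20150129, 20150130, 20150202, 20150203],
    "Price": [102.75, 103.05, 105.10, 105.35, 107.15, 111.95],
})

def add_sma(df, windows=(45, 15), price_col="Price"):
    """Return a copy of df with one SMA_<n> column per window length.

    Hypothetical helper for illustration, not part of pandas.
    """
    out = df.copy()
    for n in windows:
        # rolling(n).mean() is NaN until n observations are available
        out[f"SMA_{n}"] = out[price_col].rolling(window=n).mean()
    return out

# Short windows (3 and 2) so the tiny sample produces non-NaN values.
demo = add_sma(prices, windows=(3, 2))
print(demo)
```

With the real 45 and 15 day windows, the first 44 and 14 rows of the respective columns would be NaN.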
I want to find all dates on which SMA_15 and SMA_45 intersect.
Can it be done efficiently using Pandas or Numpy? How?
EDIT:
What I mean by 'intersection':
A data row where:
- the long SMA(45) value had been greater than the short SMA(15) value for longer than the short SMA period (15) and then became smaller.
- the long SMA(45) value had been smaller than the short SMA(15) value for longer than the short SMA period (15) and then became greater.
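One possible reading of this definition, as a minimal sketch: flag a row when the relationship between the two SMAs flips and the previous relationship had held for more than `min_run` consecutive rows. The function name `persistent_crossings` and the toy series are assumptions made for illustration. Rows where either SMA is NaN are dropped before counting, and equal values are treated as "short not above long".

```python
import pandas as pd

def persistent_crossings(short, long, min_run=15):
    """Boolean Series, True where the short/long relationship flips after
    the previous relationship lasted more than `min_run` rows.

    Hypothetical helper; one interpretation of the question's definition.
    """
    pair = pd.DataFrame({"short": short, "long": long}).dropna()
    above = (pair["short"] > pair["long"]).astype(int)  # 1 if short is on top
    flipped = above.ne(above.shift(1))       # True on the first row of each run
    run_id = flipped.cumsum()                # label consecutive runs
    run_len = above.groupby(run_id).cumcount() + 1  # position inside the run
    # On a flip row, the shifted value is the full length of the previous run.
    crossing = flipped & (run_len.shift(1) > min_run)
    return crossing.reindex(short.index, fill_value=False)

# Toy data: the long SMA is constant, the short SMA moves around it.
short = pd.Series([1, 1, 1, 1, 3, 3, 1, 1, 1, 3], dtype=float)
long = pd.Series([2.0] * 10)
result = persistent_crossings(short, long, min_run=3)
print(result[result].index.tolist())
```

In the toy data only the flip at index 4 follows a run longer than three rows; the later flips follow runs of two and three rows and are filtered out.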
Answered by unutbu
I take a crossover to mean the point where the SMA lines, viewed as functions of time, intersect, as depicted on this Investopedia page.


Since the SMAs represent continuous functions, there is a crossing when, for a given row, (SMA_15 is less than SMA_45) and (the previous SMA_15 is greater than the previous SMA_45) -- or vice versa.
In code, that could be expressed as
previous_15 = df['SMA_15'].shift(1)
previous_45 = df['SMA_45'].shift(1)
crossing = (((df['SMA_15'] <= df['SMA_45']) & (previous_15 >= previous_45))
| ((df['SMA_15'] >= df['SMA_45']) & (previous_15 <= previous_45)))
If we change your data to
Date Price SMA_45 SMA_15
20150127 102.75 113 106
20150128 103.05 100 106
20150129 105.10 112 105
20150130 105.35 111 105
20150202 107.15 111 105
20150203 111.95 110 105
20150204 111.90 110 106
so that there are crossings,


then
import pandas as pd
df = pd.read_table('data', sep=r'\s+')  # raw string avoids an invalid-escape warning
previous_15 = df['SMA_15'].shift(1)
previous_45 = df['SMA_45'].shift(1)
crossing = (((df['SMA_15'] <= df['SMA_45']) & (previous_15 >= previous_45))
| ((df['SMA_15'] >= df['SMA_45']) & (previous_15 <= previous_45)))
crossing_dates = df.loc[crossing, 'Date']
print(crossing_dates)
yields
1 20150128
2 20150129
Name: Date, dtype: int64
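If the direction of each crossing matters, the same idea can be written with the sign of the difference between the two SMAs. This is a minimal sketch rather than part of the answer above; the column names `cross_up` and `cross_down` are made up for illustration, and the data is the table used in this answer. Note that a row where the two SMAs are exactly equal has sign 0, so a touch without a full cross is also flagged.

```python
import numpy as np
import pandas as pd

# Same values as the table used in this answer.
df = pd.DataFrame({
    "Date": [20150127, 20150128, 20150129, 20150130,
             20150202, 20150203, 20150204],
    "SMA_45": [113, 100, 112, 111, 111, 110, 110],
    "SMA_15": [106, 106, 105, 105, 105, 105, 106],
})

# +1 where SMA_15 is above SMA_45, -1 where below, 0 where equal
side = np.sign(df["SMA_15"] - df["SMA_45"])
change = side.diff()  # NaN on the first row, which has no predecessor

df["cross_up"] = change > 0    # sign increased: SMA_15 moved upward past SMA_45
df["cross_down"] = change < 0  # sign decreased: SMA_15 moved downward past SMA_45

crossed = df["cross_up"] | df["cross_down"]
print(df.loc[crossed, ["Date", "cross_up", "cross_down"]])
```

On this data the flagged rows are the same two dates as above: 20150128 as an upward cross and 20150129 as a downward cross.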
Answered by Akshay
As an alternative to unutbu's answer, something like the following can also be used to find the indices where SMA_15 crosses SMA_45.
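import numpy as np
# 1 where SMA_15 is below SMA_45, else 0 (integers, so subtraction is well defined)
diff = (df['SMA_15'] < df['SMA_45']).astype(int)
diff_forward = diff.shift(1)
crossing = np.where(abs(diff - diff_forward) == 1)[0]
print(crossing)
>>> [1 2]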
print(df.iloc[crossing])
>>>
Date Price SMA_45 SMA_15
1 20150128 103.05 100 106
2 20150129 105.10 112 105
Answered by Jeril
The following method gives similar results but takes less time than the previous methods:
import numpy as np
df['position'] = df['SMA_15'] > df['SMA_45']
df['pre_position'] = df['position'].shift(1)
df.dropna(inplace=True) # drops rows containing NaN, including the first row (no previous position)
df['crossover'] = np.where(df['position'] == df['pre_position'], False, True)
Time taken for this approach:
2.7 ms ± 310 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Time taken for previous approach:
3.46 ms ± 307 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
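To check such timings on your own machine, a sketch along the following lines could be used. The function names are hypothetical, and `crossing_position` is a variant of the position-change idea that leaves `df` unmodified rather than the exact code above. The two tests agree on the sample data, but they can differ when the SMAs are exactly equal on some row, because one uses inclusive comparisons and the other a strict one.

```python
import timeit

import numpy as np
import pandas as pd

# Same values as the sample table in this thread.
df = pd.DataFrame({
    "SMA_45": [113, 100, 112, 111, 111, 110, 110],
    "SMA_15": [106, 106, 105, 105, 105, 105, 106],
})

def crossing_shift(df):
    """Shift-and-compare test with inclusive inequalities."""
    p15, p45 = df["SMA_15"].shift(1), df["SMA_45"].shift(1)
    return (((df["SMA_15"] <= df["SMA_45"]) & (p15 >= p45))
            | ((df["SMA_15"] >= df["SMA_45"]) & (p15 <= p45)))

def crossing_position(df):
    """Position-change test with a strict inequality (does not modify df)."""
    position = (df["SMA_15"] > df["SMA_45"]).to_numpy()
    changed = np.zeros(len(df), dtype=bool)
    changed[1:] = position[1:] != position[:-1]  # first row has no predecessor
    return pd.Series(changed, index=df.index)

same = bool((crossing_shift(df) == crossing_position(df)).all())
print("same result on sample data:", same)

runs = 200
for fn in (crossing_shift, crossing_position):
    seconds = timeit.timeit(lambda: fn(df), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e3:.3f} ms per call")
```

Absolute numbers depend on the machine, the pandas version, and the size of the DataFrame, so the comparison is only meaningful on data of realistic length.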

