Python and Pandas - Moving Average Crossover

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/28345261/

Time: 2020-09-13 22:55:08  Source: igfitidea

python numpy pandas

Asked by chilliq

There is a Pandas DataFrame object with some stock data. The SMAs are simple moving averages calculated over the previous 45 and 15 days.

Date      Price   SMA_45      SMA_15
20150127  102.75  113         106
20150128  103.05  100         106
20150129  105.10  112         105
20150130  105.35  111         105
20150202  107.15  111         105
20150203  111.95  110         105
20150204  111.90  110         106
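
For context, columns such as SMA_45 and SMA_15 are commonly built with a rolling mean. The sketch below is an assumption about how such columns might be produced, not the asker's code. It uses a hypothetical price series and small windows (3 and 5) so the values are easy to check by hand. Note that `rolling` includes the current row in each window, which may or may not match "previous 45/15 days" as used above.

```python
import pandas as pd

# Hypothetical prices, not the data from the question.
prices = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], name="Price")

# Each value is the mean of the current row and the (window - 1) rows before it.
# The first (window - 1) rows are NaN because the window is not yet full.
sma_short = prices.rolling(window=3).mean()
sma_long = prices.rolling(window=5).mean()

print(pd.DataFrame({"Price": prices, "SMA_3": sma_short, "SMA_5": sma_long}))
```

With real data the windows would be 15 and 45, and the leading NaN rows would usually be dropped before looking for crossings.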

I want to find all dates on which SMA_15 and SMA_45 intersect.

Can it be done efficiently using Pandas or Numpy? How?

EDIT:

What I mean by 'intersection':

The data row at which either:

  • the long SMA(45) value had been greater than the short SMA(15) value for longer than the short SMA period (15) and then became smaller, or
  • the long SMA(45) value had been smaller than the short SMA(15) value for longer than the short SMA period (15) and then became greater.
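
The two conditions above can be read as "the ordering of the two SMAs flips after having held for more than a given number of rows". A minimal sketch of that reading follows. The function name `persistent_crossings`, the parameter `min_run`, and the sample values are hypothetical, and ties (equal SMAs) are treated as "short not above long", which is an assumption.

```python
import pandas as pd

def persistent_crossings(short_sma, long_sma, min_run):
    """Return a boolean Series that is True on rows where the ordering of the
    two SMAs flips after having held for more than `min_run` consecutive rows."""
    above = (short_sma > long_sma).astype(int)       # 1 while short is above long
    starts = above.diff().ne(0)                      # True on the first row of each run
    run_id = starts.cumsum()                         # label consecutive runs
    run_len = above.groupby(run_id).cumcount() + 1   # length of the run so far
    prev_len = run_len.shift(1)                      # run length as of the previous row
    return starts & (prev_len > min_run)

# Hypothetical values: short is below long for 4 rows, above for 2, below for 1, above for 3.
short_sma = pd.Series([1, 1, 1, 1, 3, 3, 1, 3, 3, 3])
long_sma = pd.Series([2] * 10)

mask = persistent_crossings(short_sma, long_sma, min_run=3)
print(mask[mask].index.tolist())
# [4]
```

For the question as stated, `min_run` would be 15; the flips at rows 6 and 7 in the sample are ignored because the preceding runs are too short.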

Answered by unutbu

I'm taking a crossover to mean the point where the SMA lines -- as functions of time -- intersect, as depicted on this Investopedia page.

Since the SMAs represent continuous functions, there is a crossing when, for a given row, (SMA_15 is less than SMA_45) and (the previous SMA_15 is greater than the previous SMA_45) -- or vice versa.

In code, that could be expressed as

previous_15 = df['SMA_15'].shift(1)
previous_45 = df['SMA_45'].shift(1)
crossing = (((df['SMA_15'] <= df['SMA_45']) & (previous_15 >= previous_45))
            | ((df['SMA_15'] >= df['SMA_45']) & (previous_15 <= previous_45)))

If we change your data to

Date      Price   SMA_45      SMA_15
20150127  102.75  113         106
20150128  103.05  100         106
20150129  105.10  112         105
20150130  105.35  111         105
20150202  107.15  111         105
20150203  111.95  110         105
20150204  111.90  110         106

so that there are crossings,

then

import pandas as pd

# raw string so that '\s' is not treated as an (invalid) string escape
df = pd.read_table('data', sep=r'\s+')
previous_15 = df['SMA_15'].shift(1)
previous_45 = df['SMA_45'].shift(1)
crossing = (((df['SMA_15'] <= df['SMA_45']) & (previous_15 >= previous_45))
            | ((df['SMA_15'] >= df['SMA_45']) & (previous_15 <= previous_45)))
crossing_dates = df.loc[crossing, 'Date']
print(crossing_dates)

yields

1    20150128
2    20150129
Name: Date, dtype: int64
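
If the direction of each crossing matters (short SMA moving above or below the long SMA), one possible extension is to compare the sign of the difference between consecutive rows. This is a sketch rather than part of the answer above: the column name `cross` and the labels "up" and "down" are illustrative, and the data is read from an in-memory string instead of the 'data' file.

```python
import io

import numpy as np
import pandas as pd

data = io.StringIO("""\
Date      Price   SMA_45      SMA_15
20150127  102.75  113         106
20150128  103.05  100         106
20150129  105.10  112         105
20150130  105.35  111         105
20150202  107.15  111         105
20150203  111.95  110         105
20150204  111.90  110         106
""")
df = pd.read_csv(data, sep=r"\s+")

# +1 where the short SMA is above the long SMA, -1 where below, 0 where equal
side = np.sign(df["SMA_15"] - df["SMA_45"])
# positive change: short SMA moved up relative to long SMA; negative: moved down
change = side.diff()
df["cross"] = np.select([change > 0, change < 0], ["up", "down"], default="")

print(df.loc[df["cross"] != "", ["Date", "cross"]])
```

One caveat: if the two SMAs are exactly equal on some row, the sign passes through 0 and the same crossing is flagged on two consecutive rows.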

Answered by Akshay

As an alternative to unutbu's answer, something like the following can also be used to find the indices where SMA_15 crosses SMA_45.

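import numpy as np

# 1 where the short SMA is below the long SMA, 0 otherwise
# (cast to int because subtracting boolean Series is not reliable)
below = (df['SMA_15'] < df['SMA_45']).astype(int)
# a change of +1 or -1 between consecutive rows marks a crossing
crossing = np.where(below.diff().abs() == 1)[0]
print(crossing)
# [1 2]

print(df.iloc[crossing])
#        Date   Price  SMA_45  SMA_15
# 1  20150128  103.05     100     106
# 2  20150129  105.10     112     105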

Answered by Jeril

The following method gives similar results but takes less time than the previous methods:

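import numpy as np

df['position'] = df['SMA_15'] > df['SMA_45']  # True while the short SMA is above the long SMA
df['pre_position'] = df['position'].shift(1)  # position on the previous row
df.dropna(inplace=True) # dropping the NaN values
# a crossover is any row whose position differs from that of the previous row
df['crossover'] = np.where(df['position'] == df['pre_position'], False, True)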

Time taken for this approach: 2.7 ms ± 310 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Time taken for previous approach: 3.46 ms ± 307 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
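
Timings like these depend on the data size, the machine, and the library versions. A rough way to reproduce such a comparison with the standard `timeit` module is sketched below. The synthetic random-walk prices and the function names `shift_compare` and `position_change` are assumptions made for illustration, and `position_change` is a simplified variant of the approach above, not the exact code.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic random-walk prices (hypothetical data, fixed seed for repeatability).
rng = np.random.default_rng(0)
price = pd.Series(100 + rng.standard_normal(10_000).cumsum())
df = pd.DataFrame({
    "SMA_15": price.rolling(15).mean(),
    "SMA_45": price.rolling(45).mean(),
}).dropna().reset_index(drop=True)

def shift_compare(df):
    # Compare each row with the previous row for both SMAs.
    prev_15 = df["SMA_15"].shift(1)
    prev_45 = df["SMA_45"].shift(1)
    return (((df["SMA_15"] <= df["SMA_45"]) & (prev_15 >= prev_45))
            | ((df["SMA_15"] >= df["SMA_45"]) & (prev_15 <= prev_45)))

def position_change(df):
    # Flag rows where the "short above long" state differs from the previous row.
    position = df["SMA_15"] > df["SMA_45"]
    # fill_value keeps the dtype boolean and makes the first row a non-crossing
    return position != position.shift(1, fill_value=position.iloc[0])

for fn in (shift_compare, position_change):
    seconds = timeit.timeit(lambda: fn(df), number=20) / 20
    print(f"{fn.__name__}: {seconds * 1e3:.3f} ms per call")
```

The two functions can disagree on rows where the SMAs are exactly equal, so it is worth checking that they flag the same rows before comparing their speed.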
