pandas: Coding the exponential moving average with Python

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/45665217/

Coding the exponential moving average with Python

Tags: python, python-3.x, pandas, math, finance

Asked by Antoine Coppin

I want to do calculations on three columns of a dataframe df. To do that, I load a list of asset (cryptocurrency) prices into a three-column table, so that I can calculate their exponential moving average once there is enough data.

def calculateAllEMA(self,values_array):
    df = pd.DataFrame(values_array, columns=['BTC', 'ETH', 'DASH'])
    column_by_search = ["BTC", "ETH", "DASH"]
    print(df)
    for i,column in enumerate(column_by_search):
        ema=[]
        # over and over for each day that follows day 23 to get the full range of EMA
        for j in range(0, len(column)-24):
            # Add the closing prices for the first 22 days together and divide them by 22.
            EMA_yesterday = column.iloc[1+j:22+j].mean()
            k = float(2)/(22+1)
            # getting the first EMA day by taking the following day's (day 23) closing price multiplied by k, then multiply the previous day's moving average by (1-k) and add the two.
            ema.append(column.iloc[23 + j]*k+EMA_yesterday*(1-k))
        print("ema")
        print(ema)
        mean_exp[i] = ema[-1]
    return mean_exp

Yet when I print len(column)-24, I get -21 (-24 + 3?), so the loop never runs. How can I fix this error and get the exponential moving average of the assets?

I tried to follow the pseudocode for the exponential moving average from iexplain.com.

If you have a simpler idea, I'm open to hearing it.

Here is the data I am using when the error occurs:

        BTC     ETH    DASH
0   4044.59  294.40  196.97
1   4045.25  294.31  196.97
2   4044.59  294.40  196.97
3   4045.25  294.31  196.97
4   4044.59  294.40  196.97
5   4045.25  294.31  196.97
6   4044.59  294.40  196.97
7   4045.25  294.31  196.97
8   4045.25  294.31  196.97
9   4044.59  294.40  196.97
10  4045.25  294.31  196.97
11  4044.59  294.40  196.97
12  4045.25  294.31  196.97
13  4045.25  294.32  197.07
14  4045.25  294.31  196.97
15  4045.41  294.46  197.07
16  4045.25  294.41  197.07
17  4045.41  294.41  197.07
18  4045.41  294.47  197.07
19  4045.25  294.41  197.07
20  4045.25  294.32  197.07
21  4045.43  294.35  197.07
22  4045.41  294.46  197.07
23  4045.25  294.41  197.07

Answered by vestland

pandas.stats.moments.ewma from the original answer has been deprecated.

Instead you can use pandas.DataFrame.ewm, as described in the pandas documentation.
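
As a minimal sketch of that call (the prices below are made up for illustration), ewm(span=...) followed by .mean() computes the moving average for every column at once. With adjust=False the result follows the recurrence EMA_t = k * price_t + (1 - k) * EMA_(t-1), with k = 2 / (span + 1) and the first price as the seed, which the hand-written loop reproduces for one column:

```python
import pandas as pd

# Made-up prices, for illustration only
prices = pd.DataFrame({
    "BTC": [4044.59, 4045.25, 4044.59, 4045.41, 4045.25],
    "ETH": [294.40, 294.31, 294.40, 294.46, 294.41],
})

span = 3
ema = prices.ewm(span=span, adjust=False).mean()

# The same recurrence written out by hand for the BTC column
k = 2 / (span + 1)
manual = [prices["BTC"].iloc[0]]
for price in prices["BTC"].iloc[1:]:
    manual.append(k * price + (1 - k) * manual[-1])

print(ema)
```

The answer's snippet below uses adjust=True instead, which weights the early observations differently, so its first values will not match this recurrence exactly.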

Below is a complete snippet with random data that builds a dataframe with calculated ewmas from specified columns.

Code:

# imports
import pandas as pd
import numpy as np

np.random.seed(123)

rows = 50
df = pd.DataFrame(np.random.randint(90,110,size=(rows, 3)), columns=['BTC', 'ETH', 'DASH'])
# pd.datetime was removed in pandas 2.0; date_range accepts a date string directly
df.index = pd.date_range('2017-01-01', periods=rows, name='dates')

def ewmas(df, win, keepSource):
    """Add exponentially weighted moving averages for all columns in a dataframe.

    Arguments: 
    df -- pandas dataframe
    win -- length of ewma estimation window
    keepSource -- True or False for keep or drop source data in output dataframe

    """

    df_temp = df.copy()

    # Manage existing column names
    colNames = list(df_temp.columns.values).copy()
    removeNames = colNames.copy()

    for col in colNames:

        # Make new names for ewmas
        ewmaName = col + '_ewma_' + str(win)

        # Add ewmas (Series.ewm replaces the deprecated pd.stats.moments.ewma)
        df_temp[ewmaName] = df[col].ewm(span=win, adjust=True).mean()

    # Remove estimates with insufficient window length
    df_temp = df_temp.iloc[win:]

    # Remove or keep source data
    if not keepSource:
        # The positional axis argument of drop() was removed in pandas 2.0
        df_temp = df_temp.drop(columns=removeNames)

    return df_temp

# Test run
df_new = ewmas(df = df, win = 22, keepSource = True)
print(df_new.tail())

Output:

             BTC  ETH   DASH  BTC_ewma_22  ETH_ewma_22    DASH_ewma_22
dates                                                             
2017-02-15   91   96    98    98.752431    100.081052     97.926787
2017-02-16  100  102   102    98.862445    100.250270     98.285973
2017-02-17  100  107    97    98.962634    100.844749     98.172712
2017-02-18  103  102    91    99.317826    100.946384     97.541684
2017-02-19   99  104    91    99.289894    101.214755     96.966758
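
The function above discards the first win rows with iloc. A related option, sketched here on made-up data, is the min_periods argument of ewm, which leaves the early estimates as NaN until enough observations are available. It is not an exact equivalent: with min_periods=win only the first win - 1 values are masked, so one more row survives than with iloc[win:].

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, made-up data
prices = pd.Series(rng.integers(90, 110, size=50).astype(float))

win = 22
# Require `win` observations before an estimate is reported
ema = prices.ewm(span=win, min_periods=win, adjust=True).mean()

n_missing = int(ema.isna().sum())  # leading entries without enough data
ema_ready = ema.dropna()
print(n_missing, len(ema_ready))
```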

The result can be plotted with df_new[['BTC', 'BTC_ewma_22']].plot().

Answered by MathiasL

In your loop for i,column in enumerate(column_by_search): you iterate over the elements of your column_by_search list, so column takes on the values "BTC", "ETH", "DASH" in turn. Thus len(column) gives you the length of the string "BTC", which is 3, and len(column)-24 is -21.
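
The difference is easy to check on a small made-up frame with 24 rows (the values are placeholders, not the question's data):

```python
import pandas as pd

# Made-up frame with 24 rows, mirroring the shape of the question's data
df = pd.DataFrame({"BTC": [1.0] * 24, "ETH": [2.0] * 24, "DASH": [3.0] * 24})

lengths = {}
for column in ["BTC", "ETH", "DASH"]:
    # len(column) counts characters in the label; len(df[column]) counts rows
    lengths[column] = (len(column), len(df[column]))

print(lengths)

# range(0, -21) yields nothing, which is why the question's inner loop never runs
empty_loop = list(range(0, len("BTC") - 24))
```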

Try df[column] instead: that returns the values of the desired column (as a pandas Series), which you can take the length of and iterate over.
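
Applied to the question's function, the fix might look like the sketch below. It is not the asker's final code: the name calculate_all_ema, the window parameter, the returned dict, and the choice to seed with the simple average of the first window prices before applying the recurrence to each later price are all assumptions made for illustration.

```python
import pandas as pd

def calculate_all_ema(values_array, window=22):
    """Return the latest EMA per column (hypothetical helper, for illustration)."""
    columns = ["BTC", "ETH", "DASH"]
    df = pd.DataFrame(values_array, columns=columns)
    k = 2 / (window + 1)  # smoothing factor
    latest_ema = {}
    for column in columns:
        series = df[column]  # the column's data, not its name
        if len(series) <= window:
            raise ValueError("need more than `window` rows")
        # Seed with the simple average of the first `window` prices
        ema = series.iloc[:window].mean()
        # Then fold in each later price with the EMA recurrence
        for price in series.iloc[window:]:
            ema = price * k + ema * (1 - k)
        latest_ema[column] = ema
    return latest_ema

# Made-up rows: 30 days of constant prices, so each EMA stays at the price
rows = [[100.0, 10.0, 1.0]] * 30
result = calculate_all_ema(rows)
print(result)
```

With the question's 24 rows and a window of 22, this loop would apply the recurrence to the last two prices only, so more history gives a more meaningful average.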
