Time Series Decomposition function in Python

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/20672236/

Time: 2020-08-18 20:59:50  Source: igfitidea

Tags: python, time-series

Asked by user3084006

Time series decomposition is a method that separates a time-series data set into three (or more) components. For example:

x(t) = s(t) + m(t) + e(t)

where

t is the time coordinate
x is the data
s is the seasonal component
e is the random error term
m is the trend
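
To make the additive model concrete, here is a minimal pure-Python sketch of a classical decomposition: the trend m is a centred moving average over one cycle, the seasonal component s is the mean detrended value at each position in the cycle, and e is whatever remains. The function name `classical_decompose`, the synthetic series, and the period of 12 are assumptions made for illustration; they are not from the question, and this is not how R's stl works.

```python
import math
import random


def classical_decompose(x, period):
    """Split x into (seasonal, trend, resid) so that x = s + m + e."""
    n = len(x)
    half = period // 2
    trend = [None] * n  # undefined near the edges of the series
    for t in range(half, n - half):
        window = x[t - half:t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # even period: the two end points get half weight
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period

    # mean detrended value at each position (phase) in the cycle
    phase_means = []
    for p in range(period):
        vals = [x[t] - trend[t] for t in range(p, n, period)
                if trend[t] is not None]
        phase_means.append(sum(vals) / len(vals))
    centre = sum(phase_means) / period  # force the seasonal part to sum to zero
    seasonal = [phase_means[t % period] - centre for t in range(n)]

    resid = [None if trend[t] is None else x[t] - seasonal[t] - trend[t]
             for t in range(n)]
    return seasonal, trend, resid


# Synthetic data: linear trend + yearly cycle + small noise (all assumed)
random.seed(0)
period = 12
x = [0.5 * t + 3 * math.sin(2 * math.pi * t / period) + random.gauss(0, 0.2)
     for t in range(72)]
seasonal, trend, resid = classical_decompose(x, period)
```

The first and last half-cycle of the trend and residual are left as None because the centred window does not fit there.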

In R I would use the functions decompose and stl. How would I do this in Python?

Answered by Matt

Have you been introduced to scipy yet? From what I've seen in a few PDFs/sites (here and here), it's doable. But without seeing a specific example it would be hard for someone to show you a code example. Scipy is awesome; I use it in my research and still haven't been let down by it.

Answered by cast42

You can call R functions from Python using rpy2. Install rpy2 using pip with: pip install rpy2. Then use this wrapper: https://gist.github.com/andreas-h/7808564 to call the STL functionality provided by R.

Answered by AN6U5

I've been having a similar issue and am trying to find the best path forward. Try moving your data into a pandas DataFrame and then call statsmodels' tsa.seasonal_decompose. See the following example:

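import statsmodels.api as sm

# weekly CO2 readings indexed by date
dta = sm.datasets.co2.load_pandas().data
# deal with missing values by interpolating. see issue
dta["co2"] = dta["co2"].interpolate()

# the seasonal period is inferred from the frequency of the date index
res = sm.tsa.seasonal_decompose(dta["co2"])
resplot = res.plot()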

Three plots produced from above input

You can then recover the individual components of the decomposition from:

res.resid
res.seasonal
res.trend

I hope this helps!

Answered by Jeff Tilton

I already answered this question here, but below is a quick function showing how to do this with rpy2. It lets you use R's robust seasonal-trend decomposition using loess (STL), but from Python!

import numpy as np
import pandas as pd
from rpy2.robjects import r, FloatVector


def decompose(series, frequency, s_window='periodic', log=False, **kwargs):
    '''
    Decompose a time series into seasonal, trend and irregular components
    using loess, acronym STL.
    https://www.rdocumentation.org/packages/stats/versions/3.4.3/topics/stl

    params:
        series: a time series

        frequency: the number of observations per "cycle"
                   (normally a year, but sometimes a week, a day or an hour)
                   https://robjhyndman.com/hyndsight/seasonal-periods/

        s_window: either the character string "periodic" or the span
                  (in lags) of the loess window for seasonal extraction,
                  which should be odd and at least 7, according to Cleveland
                  et al.

        log: boolean. take log of series

        **kwargs: See other params for stl at
            https://www.rdocumentation.org/packages/stats/versions/3.4.3/topics/stl
    '''
    df = pd.DataFrame()
    df['date'] = series.index
    if log:
        series = series.pipe(np.log)
    length = len(series)
    # build an R ts object; FloatVector makes the conversion to R explicit
    s = r.ts(FloatVector(series.values), frequency=frequency)
    # time.series is a matrix flattened column by column:
    # seasonal first, then trend, then remainder
    decomposed = list(r.stl(s, s_window, **kwargs).rx2('time.series'))
    df['observed'] = series.values
    df['trend'] = decomposed[length:2 * length]
    df['seasonal'] = decomposed[0:length]
    df['residuals'] = decomposed[2 * length:3 * length]
    return df

The above function assumes that your series has a datetime index. It returns a dataframe with the individual components that you can then graph with your favorite graphing library.

You can pass the parameters for stl seen here, but change any period to an underscore. For example, the positional argument in the above function is s_window, but in the above link it is s.window. Also, I found some of the above code in this repository.

Example data

Hopefully the code below works; honestly, I haven't tried it, since this was requested long after I answered the question.

import pandas as pd
import numpy as np

obs_per_cycle = 52  # weekly observations, yearly cycle
observations = obs_per_cycle * 3
# random noise around 5 plus a linear trend of 2 per step
data = [v + 2 * i for i, v in enumerate(np.random.normal(5, 1, observations))]
tidx = pd.date_range('2016-07-01', periods=observations, freq='W')
ts = pd.Series(data=data, index=tidx)
df = decompose(ts, frequency=obs_per_cycle, s_window='periodic')