Notice: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/36274447/

Date: 2020-09-14 00:57:23  Source: igfitidea

pd.rolling_mean becoming deprecated - alternatives for ndarrays

Tags: python, numpy, pandas, scipy, mean

Asked by saladi

It looks like pd.rolling_mean is becoming deprecated for ndarrays,

 pd.rolling_mean(x, window=2, center=False)

FutureWarning: pd.rolling_mean is deprecated for ndarrays and will be removed in a future version

but it seems to be the fastest way of doing this, according to this SO answer.

Are there now new ways of doing this directly with SciPy or NumPy that are as fast as pd.rolling_mean?

Answered by saladi

EDIT -- Unfortunately, it looks like the new way is not nearly as fast:

New version of Pandas:

In [1]: x = np.random.uniform(size=100)

In [2]: %timeit pd.rolling_mean(x, window=2)
1000 loops, best of 3: 240 μs per loop

In [3]: %timeit pd.Series(x).rolling(window=2).mean()
1000 loops, best of 3: 226 μs per loop

In [4]: pd.__version__
Out[4]: '0.18.0'

Old version:

In [1]: x = np.random.uniform(size=100)

In [2]: %timeit pd.rolling_mean(x,window=2)
100000 loops, best of 3: 12.4 μs per loop

In [3]: pd.__version__
Out[3]: u'0.17.1'

Answered by maxymoo

Looks like the new way is via methods on the DataFrame.rolling class (I guess you're meant to think of it sort of like a groupby): http://pandas.pydata.org/pandas-docs/version/0.18.0/whatsnew.html

e.g.

x.rolling(window=2).mean()
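
The call above only works when x is already a pandas Series or DataFrame. For a plain ndarray, as in the question, a minimal sketch (the sample values and variable names are illustrative, not from the original answer) wraps the array first and unwraps the result afterwards:

```python
import numpy as np
import pandas as pd

x = np.array([1.0, 2.0, 4.0, 8.0])

# ndarrays have no .rolling method, so wrap the data in a Series first.
rolled = pd.Series(x).rolling(window=2).mean()

# .values gives back a plain ndarray. The first window - 1 entries are NaN
# because those windows are incomplete.
result = rolled.values
print(result)  # values: nan, 1.5, 3.0, 6.0
```

The wrapping step is likely part of the overhead seen in the timings for small arrays.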

Answered by Pruce Uchiha

Try this:

x.rolling(window=2, center=False).mean()

Answered by moi

I suggest scipy.ndimage.filters.uniform_filter1d, as in my answer to the linked question. It is also much faster for large arrays:

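import numpy as np
import pandas as pd  # needed for the pd.rolling_mean comparison below
from scipy.ndimage import uniform_filter1d  # same function as in the older scipy.ndimage.filters namespace

N = 1000  # window size
x = np.random.random(100000)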

%timeit pd.rolling_mean(x, window=N)
__main__:257: FutureWarning: pd.rolling_mean is deprecated for ndarrays and will be removed in a future version
The slowest run took 84.55 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 7.37 ms per loop

%timeit uniform_filter1d(x, size=N)
10000 loops, best of 3: 190 μs per loop
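
Note that uniform_filter1d centers its window and pads the edges (mode='reflect' by default), so its output is not element-for-element identical to pd.rolling_mean, which uses a trailing window and gives NaN for incomplete windows. A minimal sketch, assuming only the means over full windows are wanted, shifts the window with the origin parameter and trims the padded tail. The helper name rolling_mean_valid is hypothetical:

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

def rolling_mean_valid(x, window):
    """Means of all full windows of x (hypothetical helper name)."""
    x = np.asarray(x, dtype=float)
    # origin=-(window // 2) makes output[i] the mean of x[i : i + window];
    # the last window - 1 outputs rely on edge padding, so drop them.
    shifted = uniform_filter1d(x, size=window, mode="constant",
                               origin=-(window // 2))
    return shifted[: len(x) - window + 1]

x = np.random.random(1000)
N = 10
fast = rolling_mean_valid(x, N)

# Cross-check against a direct convolution over full windows.
reference = np.convolve(x, np.ones(N) / N, mode="valid")
print(np.allclose(fast, reference))
```

The trimmed result corresponds to the non-NaN part of a trailing rolling mean, up to floating-point rounding.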

Answered by heltonbiker

If your dimensions are homogeneous, you could try to implement an n-dimensional form of the Summed Area Table used for two-dimensional images:

A summed area table is a data structure and algorithm for quickly and efficiently generating the sum of values in a rectangular subset of a grid.

Then, in this order, you could:

  1. Create the summed area table ("integral") of your array;
  2. Iterate to get the (quite cheap) sum of an n-dimensional kernel at a given position;
  3. Divide by the size of the n-dimensional volume of the kernel.
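
In one dimension the summed area table reduces to a cumulative sum, so the three steps above can be sketched as follows. This is an illustrative sketch for the 1-D case only, not the answer author's code, and the function name is hypothetical:

```python
import numpy as np

def rolling_mean_cumsum(x, window):
    """Means of all full windows of x, via a 1-D summed area table."""
    x = np.asarray(x, dtype=float)
    # Step 1: the "integral" of the array, with a leading zero so that
    # integral[k] holds the sum of x[:k].
    integral = np.concatenate(([0.0], np.cumsum(x)))
    # Step 2: each window sum is the difference of two table entries,
    # computed for all positions at once instead of in a Python loop.
    window_sums = integral[window:] - integral[:-window]
    # Step 3: divide by the window size.
    return window_sums / window

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(rolling_mean_cumsum(x, window=2))  # [1.5 2.5 3.5 4.5]
```

One caveat with this approach: for very long arrays the cumulative sum can accumulate floating-point rounding error, so the results may differ slightly from a direct per-window mean.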

Unfortunately, I cannot say whether this is efficient or not, but by the given premise, it should be.