pandas pd.rolling_mean is deprecated - alternatives for ndarrays

Question

Asked by saladi

It looks like pd.rolling_mean is becoming deprecated for ndarrays:

 pd.rolling_mean(x, window=2, center=False)
FutureWarning: pd.rolling_mean is deprecated for ndarrays and will be removed in a future version

but it seems to be the fastest way of doing this, according to this SO answer.

Are there now new ways of doing this directly with SciPy or NumPy that are as fast as pd.rolling_mean?

Answer 1

Answered by saladi

EDIT -- Unfortunately, it looks like the new way is not nearly as fast:

New version of Pandas:

In [1]: x = np.random.uniform(size=100)

In [2]: %timeit pd.rolling_mean(x, window=2)
1000 loops, best of 3: 240 μs per loop

In [3]: %timeit pd.Series(x).rolling(window=2).mean()
1000 loops, best of 3: 226 μs per loop

In [4]: pd.__version__
Out[4]: '0.18.0'

Old version:

In [1]: x = np.random.uniform(size=100)

In [2]: %timeit pd.rolling_mean(x,window=2)
100000 loops, best of 3: 12.4 μs per loop

In [3]: pd.__version__
Out[3]: u'0.17.1'

Answer 2

Answered by maxymoo

It looks like the new way is via methods on the DataFrame.rolling class (I guess you're meant to think of it sort of like a groupby): http://pandas.pydata.org/pandas-docs/version/0.18.0/whatsnew.html

e.g.

x.rolling(window=2).mean()
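
Note that the x in the question is a NumPy ndarray, which has no .rolling method, so the call above presumably assumes x is already a Series or DataFrame. A minimal sketch of the ndarray case, assuming a pandas version that provides to_numpy() (0.24 or later; on older versions .values plays the same role):

```python
import numpy as np
import pandas as pd

x = np.random.uniform(size=100)

# Wrap the ndarray in a Series to get access to .rolling(), then
# convert the result back to an ndarray.
rolled = pd.Series(x).rolling(window=2).mean().to_numpy()

# The first (window - 1) entries are NaN because the window is not yet full.
print(rolled[:3])
```

The result has the same length as x, with each value stored at the last index of its window.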

Answer 3

Answered by Pruce Uchiha

Try this:

x.rolling(window=2, center=False).mean()

Answer 4

Answered by moi

I suggest scipy.ndimage.filters.uniform_filter1d, as in my answer to the linked question. It is also much faster for large arrays:

import numpy as np
import pandas as pd  # needed for the pd.rolling_mean timing below
from scipy.ndimage.filters import uniform_filter1d
N = 1000
x = np.random.random(100000)

%timeit pd.rolling_mean(x, window=N)
__main__:257: FutureWarning: pd.rolling_mean is deprecated for ndarrays and will be removed in a future version
The slowest run took 84.55 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 7.37 ms per loop

%timeit uniform_filter1d(x, size=N)
10000 loops, best of 3: 190 μs per loop
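
One caveat when swapping one call for the other: the two functions timed above do not return the same values. The pandas rolling mean uses a trailing window and yields NaN until the window is full, whereas uniform_filter1d centers the window and pads the edges (mode='reflect' by default). The sketch below shows one way to recover the trailing, NaN-padded layout. The helper name trailing_mean is hypothetical, and the import uses scipy.ndimage directly because the scipy.ndimage.filters namespace is deprecated in recent SciPy releases.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d


def trailing_mean(x, window):
    """Hypothetical helper: trailing rolling mean, NaN where the window is incomplete."""
    x = np.asarray(x, dtype=float)
    if not 1 <= window <= len(x):
        raise ValueError("window must be between 1 and len(x)")
    filtered = uniform_filter1d(x, size=window)
    # With the default origin, the window at position i covers
    # x[i - left : i + right + 1].
    left = window // 2
    right = window - left - 1
    out = np.full(len(x), np.nan)
    # Keep only positions whose window lies fully inside x, and store
    # each mean at the last index of its window.
    out[window - 1:] = filtered[left:len(x) - right]
    return out
```

For example, trailing_mean(np.arange(10.0), 3) starts with two NaN values followed by the means of each full three-element window.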

Answer 5

Answered by heltonbiker

If your dimensions are homogeneous, you could try to implement an n-dimensional form of the Summed Area Table used for bidimensional images:

A summed area table is a data structure and algorithm for quickly and efficiently generating the sum of values in a rectangular subset of a grid.

Then, in this order, you could:

Create the summed area table ("integral") of your array;
Iterate to get the (quite cheap) sum of an n-dimensional kernel at a given position;
Divide by the size of the n-dimensional volume of the kernel.
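
In one dimension the summed area table reduces to a cumulative sum, so the three steps above can be sketched as follows. This is only an illustration for the 1-D case, not the n-dimensional form; the function name rolling_mean_cumsum is hypothetical, and the iteration in the second step is replaced by a vectorized difference.

```python
import numpy as np


def rolling_mean_cumsum(x, window):
    """Hypothetical helper: means of all full windows of x (1-D case)."""
    x = np.asarray(x, dtype=float)
    # Step 1: the "integral" of the array, with a leading zero so that
    # integral[j] - integral[i] is the sum of x[i:j].
    integral = np.concatenate(([0.0], np.cumsum(x)))
    # Step 2: each window sum is the difference of two table entries.
    window_sums = integral[window:] - integral[:-window]
    # Step 3: divide by the size of the kernel.
    return window_sums / window


print(rolling_mean_cumsum([1, 2, 3, 4, 5], window=2))
```

The result has len(x) - window + 1 entries, one per full window. For long arrays the cumulative sum can accumulate floating-point error, so this may be slightly less accurate than summing each window directly.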

Unfortunately, I don't know whether this is efficient in practice, but by the given premise it should be.
