pandas rolling difference

Question

Asked by WBM

Does anyone know an efficient function or method, similar to pandas.rolling_mean, that calculates the rolling difference of an array?

This is my closest solution:

roll_diff = pd.Series(values).diff(periods=1)

However, it only calculates a single-step rolling difference. Ideally the step size would be configurable (i.e. the difference between the current time step and the last n steps).

I've also written this, but it is quite slow for larger arrays:

def roll_diff(values,step):
    diff = []
    for i in np.arange(step, len(values)-1):
        pers_window = np.arange(i-1,i-step-1,-1)
        diff.append(np.abs(values[i] - np.mean(values[pers_window])))
    diff = np.pad(diff, (0, step+1), 'constant')
    return diff
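
The loop above compares each value with the mean of the previous `step` values. One possible vectorized version is sketched below, assuming that is the intended calculation. The name `roll_diff_vectorized` and the sample data are hypothetical. Unlike the loop, it keeps each result aligned with its own index, returns NaN for the first `step` positions instead of zero padding at the end, and includes the last element.

```python
import numpy as np
import pandas as pd

def roll_diff_vectorized(values, step):
    """Return |values[i] - mean(values[i-step:i])| for i >= step, NaN before that."""
    s = pd.Series(values, dtype=float)
    # shift(1) drops the current value from the window, so the rolling
    # mean covers only the previous `step` values.
    prev_mean = s.shift(1).rolling(window=step).mean()
    return (s - prev_mean).abs()

# Hypothetical sample data
values = np.array([1, 3, 6, 1, -5, 6, 4, 1, 6])
step = 3
vectorized = roll_diff_vectorized(values, step)
```

Because the work is done by pandas' rolling machinery rather than a Python loop, this should scale better on large arrays, though it is worth timing on your own data.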

Answer 1

Answered by Pierluigi

What about:

import pandas

x = pandas.DataFrame({
    'x_1': [0, 1, 2, 3, 0, 1, 2, 500, ],},
    index=[0, 1, 2, 3, 4, 5, 6, 7])

x['x_1'].rolling(window=2).apply(lambda x: x.iloc[1] - x.iloc[0])

In general, you can replace the lambda function with your own function. Note that in this case the first item will be NaN.

Update

Defining the following:

n_steps = 2
def my_fun(x):
    return x.iloc[-1] - x.iloc[0]

x['x_1'].rolling(window=n_steps).apply(my_fun)

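This computes the difference between the last and first values of each window. Note that a window of n_steps rows spans n_steps - 1 steps, so the values compared are n_steps - 1 rows apart.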
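
For a plain lagged difference, `Series.diff` itself accepts a `periods` value larger than 1, which avoids calling a Python function per window. The sketch below uses the sample data from this answer and a hypothetical lag `n`. It checks that `diff(periods=n)` matches the rolling approach with a window of `n + 1` rows.

```python
import pandas as pd

s = pd.Series([0, 1, 2, 3, 0, 1, 2, 500])
n = 3  # hypothetical lag, in rows

# Rolling approach: a window of n + 1 rows spans n steps
via_rolling = s.rolling(window=n + 1).apply(lambda w: w.iloc[-1] - w.iloc[0])

# Built-in lagged difference: s[i] - s[i - n], NaN for the first n rows
via_diff = s.diff(periods=n)
```

Both results have NaN in the first `n` positions and the same values afterwards.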

Answer 2

Answered by Dan

You can do the same thing as in https://stackoverflow.com/a/48345749/1011724 if you work directly on the underlying numpy array:

import numpy as np

# Convolving with [1, -1] yields rs[i] - rs[i-1] for interior elements
diff_kernel = np.array([1, -1])
np.convolve(rs, diff_kernel, 'same')

where rs is your pandas Series.
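
The kernel idea can be extended to an n-step difference by padding the kernel with zeros. This is a sketch, not part of the linked answer: the helper name `lagged_diff` and the sample array are assumptions. It uses `mode='valid'` so that only fully overlapping positions are returned and no boundary values need interpreting.

```python
import numpy as np

def lagged_diff(values, n):
    """Return values[i] - values[i - n] for i = n .. len(values) - 1."""
    # Kernel [1, 0, ..., 0, -1] of length n + 1
    kernel = np.zeros(n + 1)
    kernel[0] = 1
    kernel[-1] = -1
    # 'valid' keeps only positions where the kernel fully overlaps the data
    return np.convolve(values, kernel, mode='valid')

# Hypothetical sample data
arr = np.array([1, 3, 6, 1, -5, 6, 4, 1, 6])
lagged = lagged_diff(arr, 4)
```

The output has `len(values) - n` elements and is a float array, since the kernel is built with `np.zeros`.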

Answer 3

Answered by Manualmsdos

If you get KeyError: 0, try using iloc:

import pandas

x = pandas.DataFrame({
    'x_1': [0, 1, 2, 3, 0, 1, 2, 500, ],},
    index=[0, 1, 2, 3, 4, 5, 6, 7])

x['x_1'].rolling(window=2).apply(lambda x: x.iloc[1] - x.iloc[0])

Answer 4

Answered by jpp

This should work:

import numpy as np

x = np.array([1, 3, 6, 1, -5, 6, 4, 1, 6])

def running_diff(arr, N):
    return np.array([arr[i] - arr[i-N] for i in range(N, len(arr))])

running_diff(x, 4)  # array([-6,  3, -2,  0, 11])

For a given pd.Series, you will have to define what you want for the first few items. The example below just returns the initial series values.

s_roll_diff = np.hstack((s.values[:4], running_diff(s.values, 4)))

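This works because you can assign a np.array directly to a pd.DataFrame column, e.g. for a column s: df['s_roll_diff'] = np.hstack((df['s'].values[:4], running_diff(df['s'].values, 4))). Use bracket assignment here; df.s_roll_diff = ... would set an attribute on the object rather than create a new column.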
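
The list comprehension in running_diff can also be replaced by array slicing, which is usually faster on large inputs. The following is a minimal sketch under that assumption. The name `running_diff_vectorized` and the DataFrame `df` are hypothetical, and the sample values are the ones used above.

```python
import numpy as np
import pandas as pd

def running_diff_vectorized(arr, n):
    """Return arr[i] - arr[i - n] for i = n .. len(arr) - 1 (requires n >= 1)."""
    arr = np.asarray(arr)
    # arr[n:] holds the current values, arr[:-n] the values n steps earlier
    return arr[n:] - arr[:-n]

x = np.array([1, 3, 6, 1, -5, 6, 4, 1, 6])
result = running_diff_vectorized(x, 4)

# Keep the first 4 original values, then append the lagged differences
df = pd.DataFrame({'s': x})
df['s_roll_diff'] = np.hstack((df['s'].to_numpy()[:4],
                               running_diff_vectorized(df['s'].to_numpy(), 4)))
```

Padding with the original values follows the example above. Padding with NaN instead would match what `Series.diff` returns.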
