How to invoke pandas.rolling.apply with parameters from multiple columns?

Question

Asked by quarkpol

I've got a dataset:

    Open     High      Low    Close        
0  132.960  133.340  132.940  133.105
1  133.110  133.255  132.710  132.755
2  132.755  132.985  132.640  132.735 
3  132.730  132.790  132.575  132.685
4  132.685  132.785  132.625  132.755

I am trying to use the rolling.apply function on all rows, like this:

df['new_col']= df[['Open']].rolling(2).apply(AccumulativeSwingIndex(df['High'],df['Low'],df['Close']))

This raises an error.

Or:

df['new_col']=  df[['Open', 'High', 'Low', 'Close']].rolling(2).apply(AccumulativeSwingIndex)

This passes only the values from the 'Open' column to the function.

Can anybody help me?

Answer 1

Answered by piRSquared

Define your own `roll`

We can create a function that takes a window size argument `w` and any other keyword arguments. It builds a new DataFrame containing every window, on which we call `groupby` while passing the keyword arguments along via `kwargs`.

Note: I didn't have to use `stride_tricks.as_strided`, but it is succinct and, in my opinion, appropriate.

from numpy.lib.stride_tricks import as_strided as stride
import pandas as pd

def roll(df, w, **kwargs):
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides

    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))

    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })

    return rolled_df.groupby(level=0, **kwargs)

roll(df, 2).mean()

       Open      High       Low    Close
0  133.0350  133.2975  132.8250  132.930
1  132.9325  133.1200  132.6750  132.745
2  132.7425  132.8875  132.6075  132.710
3  132.7075  132.7875  132.6000  132.720

We can also use the pandas.DataFrame.pipe method to the same effect:

df.pipe(roll, w=2).mean()
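
To connect this back to the question, here is a minimal sketch of passing a function that reads several columns of each window. The `roll` helper is repeated so the snippet runs on its own; `window_range` is a hypothetical stand-in for `AccumulativeSwingIndex`, whose definition is not given in the question.

```python
import pandas as pd
from numpy.lib.stride_tricks import as_strided as stride

# The sample data from the question.
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def roll(df, w, **kwargs):
    # Same helper as above: stack every window of w rows, then group by window.
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    return rolled_df.groupby(level=0, **kwargs)

def window_range(window):
    # Hypothetical stand-in for AccumulativeSwingIndex: it receives the whole
    # window as a DataFrame, so it can read any of its columns.
    return window["High"].max() - window["Low"].min()

result = roll(df, 2).apply(window_range)
print(result)
```

Each window is labelled with the index of its first row, which is why the result has one row fewer than `df` when `w=2`.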

OLD ANSWER

Panel has been deprecated. See above for the updated answer.

See https://stackoverflow.com/a/37491779/2336654

Define our own roll:

import numpy as np
import pandas as pd  # needs an old pandas release that still ships pd.Panel

def roll(df, w, **kwargs):
    roll_array = np.dstack([df.values[i:i+w, :] for i in range(len(df.index) - w + 1)]).T
    panel = pd.Panel(roll_array, 
                     items=df.index[w-1:],
                     major_axis=df.columns,
                     minor_axis=pd.Index(range(w), name='roll'))
    return panel.to_frame().unstack().T.groupby(level=0, **kwargs)

You should then be able to:

roll(df, 2).apply(your_function)

Using mean

roll(df, 2).mean()

major      Open      High       Low    Close
1      133.0350  133.2975  132.8250  132.930
2      132.9325  133.1200  132.6750  132.745
3      132.7425  132.8875  132.6075  132.710
4      132.7075  132.7875  132.6000  132.720

f = lambda df: df.sum(1)

roll(df, 2, group_keys=False).apply(f)

   roll
1  0       532.345
   1       531.830
2  0       531.830
   1       531.115
3  0       531.115
   1       530.780
4  0       530.780
   1       530.850
dtype: float64

Answer 2

Answered by aliciawyy

As your rolling window is not too large, I think you can also put the windows in the same dataframe and then use the apply function to reduce them.

For example, with the dataset df as follows:

            Open    High        Low     Close
Date                
2017-11-07  258.97  259.3500    258.09  258.67
2017-11-08  258.47  259.2200    258.15  259.11
2017-11-09  257.73  258.3900    256.36  258.17
2017-11-10  257.73  258.2926    257.37  258.09
2017-11-13  257.31  258.5900    257.27  258.33

You can add the rolling data to this dataframe with:

window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    df_shifted = df.shift(i).copy()
    df_shifted.columns = ["{}-{}".format(s, i) for s in df.columns]
    df1 = df1.join(df_shifted)
df1

           Open-0   High-0      Low-0   Close-0 Open-1  High-1      Low-1   Close-1
Date                                
2017-11-07  258.97  259.3500    258.09  258.67  NaN     NaN         NaN     NaN
2017-11-08  258.47  259.2200    258.15  259.11  258.97  259.3500    258.09  258.67
2017-11-09  257.73  258.3900    256.36  258.17  258.47  259.2200    258.15  259.11
2017-11-10  257.73  258.2926    257.37  258.09  257.73  258.3900    256.36  258.17
2017-11-13  257.31  258.5900    257.27  258.33  257.73  258.2926    257.37  258.09

Then you can easily apply your function to each row, which now holds all the rolling data you need:

df1.apply(AccumulativeSwingIndex, axis=1)
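
As a rough sketch of what the row function could look like: it receives one row of `df1` and reads the suffixed columns by name. `true_range` is a hypothetical example, not the asker's `AccumulativeSwingIndex`, and the shifted frame is built here with `add_suffix` and `pd.concat` as a compact equivalent of the loop above.

```python
import pandas as pd

# The sample data from this answer.
df = pd.DataFrame(
    {
        "Open":  [258.97, 258.47, 257.73, 257.73, 257.31],
        "High":  [259.35, 259.22, 258.39, 258.2926, 258.59],
        "Low":   [258.09, 258.15, 256.36, 257.37, 257.27],
        "Close": [258.67, 259.11, 258.17, 258.09, 258.33],
    },
    index=pd.to_datetime(
        ["2017-11-07", "2017-11-08", "2017-11-09", "2017-11-10", "2017-11-13"]
    ).rename("Date"),
)

window = 2
# "High-0" is the current row, "Close-1" is the previous row, and so on.
df1 = pd.concat(
    [df.shift(i).add_suffix("-{}".format(i)) for i in range(window)], axis=1
)

def true_range(row):
    # Hypothetical row function: mixes current and previous-row columns.
    high = max(row["High-0"], row["Close-1"])
    low = min(row["Low-0"], row["Close-1"])
    return high - low

# The first (window - 1) rows contain NaN, so drop them before applying.
result = df1.dropna().apply(true_range, axis=1)
print(result)
```

Dropping the incomplete rows first keeps the function free of NaN handling; whether that is appropriate depends on what the real function should return for the first rows.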

Answer 3

Answered by Mike

If you're trying to apply the function to all rows of all columns:

df.rolling(size_of_your_window).apply(your_function_here)
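
Note that this form calls the function separately for each column, so the function only ever sees values from one column at a time. A small sketch using the data from the question, with an illustrative lambda that is not from the original answer:

```python
import pandas as pd

# The sample data from the question.
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

# The function runs once per column and per window; with raw=True it
# receives a 1-D NumPy array of values from that single column.
result = df.rolling(2).apply(lambda x: x.max() - x.min(), raw=True)
print(result)
```

The result has the same shape as `df`, with NaN in the first row because that window is incomplete. This works when each column can be processed independently, but it does not let one call combine 'High', 'Low' and 'Close'.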

Answer 4

Answered by SO44

Try this to pass multiple columns to apply:

df['new_column'] = df.apply(lambda x: your_function(x['High'],x['Low'],x['Close']), axis=1)
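
A runnable sketch of this pattern with the question's data follows. `typical_price` is a hypothetical function chosen for illustration. Keep in mind that this apply is row-wise: the function sees one row at a time, not a rolling window.

```python
import pandas as pd

# The sample data from the question.
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def typical_price(high, low, close):
    # Hypothetical example function taking three scalars from one row.
    return (high + low + close) / 3

df["new_column"] = df.apply(
    lambda x: typical_price(x["High"], x["Low"], x["Close"]), axis=1
)
print(df)
```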
