Note: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/38878917/

Date: 2020-09-14 01:47:06  Source: igfitidea

How to invoke pandas.rolling.apply with parameters from multiple columns?

python, pandas

Asked by quarkpol

I've got a dataset:

    Open     High      Low    Close        
0  132.960  133.340  132.940  133.105
1  133.110  133.255  132.710  132.755
2  132.755  132.985  132.640  132.735 
3  132.730  132.790  132.575  132.685
4  132.685  132.785  132.625  132.755

I'm trying to use rolling.apply to run a function over all rows, like this:

df['new_col']= df[['Open']].rolling(2).apply(AccumulativeSwingIndex(df['High'],df['Low'],df['Close']))
  • raises an error

or

df['new_col']=  df[['Open', 'High', 'Low', 'Close']].rolling(2).apply(AccumulativeSwingIndex)
  • passes only the values from the 'Open' column

Can anybody help me?
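
To make the second symptom concrete, here is a minimal sketch that records what rolling.apply actually hands to the callback. The spy function and the seen_shapes list are hypothetical names used only for illustration, and raw=True is assumed so that each window arrives as a NumPy array.

```python
import pandas as pd

# The sample data from the question
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

seen_shapes = []  # hypothetical helper: shapes of the windows passed in

def spy(window):
    # Record the shape of each window, then return a scalar as apply requires
    seen_shapes.append(window.shape)
    return window.sum()

# rolling.apply works column by column, so the callback only ever sees
# a 1-D window from a single column, never the four columns together
out = df.rolling(2).apply(spy, raw=True)

print(set(seen_shapes))
print(out)
```

Every recorded shape is one-dimensional, which is why a function that needs Open, High, Low and Close at the same time cannot be driven by rolling.apply in this form.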

Answered by piRSquared

Define your own roll

We can create a function that takes a window size argument w and any other keyword arguments. We use it to build a new DataFrame on which we call groupby, passing the keyword arguments along via kwargs.

Note: I didn't have to use stride_tricks.as_strided, but it's succinct and, in my opinion, appropriate.

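from numpy.lib.stride_tricks import as_strided as stride
import pandas as pd

def roll(df, w, **kwargs):
    # Underlying 2-D array, its shape (rows, columns) and its byte strides
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides

    # 3-D strided view over v with shape (number of windows, w, columns)
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))

    # One small DataFrame per window, keyed by the label of the window's first row
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })

    # Group by the window key so any groupby method runs once per window
    return rolled_df.groupby(level=0, **kwargs)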

roll(df, 2).mean()

       Open      High       Low    Close
0  133.0350  133.2975  132.8250  132.930
1  132.9325  133.1200  132.6750  132.745
2  132.7425  132.8875  132.6075  132.710
3  132.7075  132.7875  132.6000  132.720

We can also use the pandas.DataFrame.pipe method to the same effect:

df.pipe(roll, w=2).mean()
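
Since roll returns a groupby object, each window reaches apply as a full DataFrame, so a function can read several columns at once. The sketch below repeats the roll definition so it runs on its own; window_range is a hypothetical stand-in for a function such as AccumulativeSwingIndex, whose definition is not shown in the question.

```python
import pandas as pd
from numpy.lib.stride_tricks import as_strided as stride

def roll(df, w, **kwargs):
    # Same roll as in the answer: one group per window of w rows
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    return rolled_df.groupby(level=0, **kwargs)

# The sample data from the question
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def window_range(window):
    # Hypothetical multi-column function: highest High minus lowest Low
    # across the rows of one window
    return window["High"].max() - window["Low"].min()

# One value per window, labelled by the first row of that window
result = roll(df, 2).apply(window_range)
print(result)
```

Note that the result is labelled by the first row of each window, as in the mean output above, so it has one entry fewer than df for a window of 2 and may need realigning before being assigned to a new column.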




OLD ANSWER

Panel has been deprecated. See above for the updated answer.

See https://stackoverflow.com/a/37491779/2336654

Define our own roll

import numpy as np
import pandas as pd

# pd.Panel is no longer available in recent pandas releases,
# so this version only runs on old ones.
def roll(df, w, **kwargs):
    roll_array = np.dstack([df.values[i:i+w, :] for i in range(len(df.index) - w + 1)]).T
    panel = pd.Panel(roll_array,
                     items=df.index[w-1:],
                     major_axis=df.columns,
                     minor_axis=pd.Index(range(w), name='roll'))
    return panel.to_frame().unstack().T.groupby(level=0, **kwargs)

You should be able to:

roll(df, 2).apply(your_function)

Using mean

roll(df, 2).mean()

major      Open      High       Low    Close
1      133.0350  133.2975  132.8250  132.930
2      132.9325  133.1200  132.6750  132.745
3      132.7425  132.8875  132.6075  132.710
4      132.7075  132.7875  132.6000  132.720


f = lambda df: df.sum(1)

roll(df, 2, group_keys=False).apply(f)

   roll
1  0       532.345
   1       531.830
2  0       531.830
   1       531.115
3  0       531.115
   1       530.780
4  0       530.780
   1       530.850
dtype: float64

Answered by aliciawyy

As your rolling window is not too large, I think you can also put the shifted rows in the same DataFrame and then use the apply function to reduce them.

For example, with the dataset df as follows:

            Open    High        Low     Close
Date                
2017-11-07  258.97  259.3500    258.09  258.67
2017-11-08  258.47  259.2200    258.15  259.11
2017-11-09  257.73  258.3900    256.36  258.17
2017-11-10  257.73  258.2926    257.37  258.09
2017-11-13  257.31  258.5900    257.27  258.33

You can add the rolling data to this DataFrame with:

window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    df_shifted = df.shift(i).copy()
    df_shifted.columns = ["{}-{}".format(s, i) for s in df.columns]
    df1 = df1.join(df_shifted)
df1

           Open-0   High-0      Low-0   Close-0 Open-1  High-1      Low-1   Close-1
Date                                
2017-11-07  258.97  259.3500    258.09  258.67  NaN     NaN         NaN     NaN
2017-11-08  258.47  259.2200    258.15  259.11  258.97  259.3500    258.09  258.67
2017-11-09  257.73  258.3900    256.36  258.17  258.47  259.2200    258.15  259.11
2017-11-10  257.73  258.2926    257.37  258.09  257.73  258.3900    256.36  258.17
2017-11-13  257.31  258.5900    257.27  258.33  257.73  258.2926    257.37  258.09

Then you can easily apply a function to each row, which now holds all the rolling data you need:

df1.apply(AccumulativeSwingIndex, axis=1)
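
As a rough end-to-end sketch of this approach, the snippet below rebuilds the sample data, widens it with the same shift-and-join loop, and applies a row function. Since AccumulativeSwingIndex is not defined in the question, close_change is a hypothetical placeholder that reads the current and previous Close from the suffixed columns.

```python
import pandas as pd

# The sample data from this answer
df = pd.DataFrame(
    {
        "Open":  [258.97, 258.47, 257.73, 257.73, 257.31],
        "High":  [259.35, 259.22, 258.39, 258.2926, 258.59],
        "Low":   [258.09, 258.15, 256.36, 257.37, 257.27],
        "Close": [258.67, 259.11, 258.17, 258.09, 258.33],
    },
    index=pd.DatetimeIndex(
        ["2017-11-07", "2017-11-08", "2017-11-09", "2017-11-10", "2017-11-13"],
        name="Date",
    ),
)

# Widen the frame: suffix -0 is the current row, -1 the previous row
window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    df_shifted = df.shift(i).copy()
    df_shifted.columns = ["{}-{}".format(s, i) for s in df.columns]
    df1 = df1.join(df_shifted)

def close_change(row):
    # Hypothetical placeholder for AccumulativeSwingIndex:
    # current Close minus previous Close
    return row["Close-0"] - row["Close-1"]

# The first row has no previous data, so its result is NaN
result = df1.apply(close_change, axis=1)
print(result)
```

The row function receives a Series whose labels carry the shift suffix, so a real implementation would pick out the values it needs by names such as "High-0" or "Low-1".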

Answered by Mike

If you're trying to apply the function to all rows of all columns:

df.rolling(size_of_your_window).apply(your_function_here)

Answered by SO44

Try this to pass multiple columns to apply:

df['new_column'] = df.apply(lambda x: your_function(x['High'],x['Low'],x['Close']), axis=1)
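
A small runnable sketch of this pattern follows, using the question's sample data. Here typical_price is a hypothetical function standing in for your_function. Keep in mind that this form works on one row at a time and involves no rolling window.

```python
import pandas as pd

# The sample data from the question
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def typical_price(high, low, close):
    # Hypothetical stand-in for your_function: mean of the three prices
    return (high + low + close) / 3

# axis=1 passes each row as a Series, so several columns can be read at once
df["new_column"] = df.apply(
    lambda x: typical_price(x["High"], x["Low"], x["Close"]), axis=1
)
print(df)
```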