How to invoke pandas.rolling.apply with parameters from multiple columns?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/38878917/
Asked by quarkpol
I've got a dataset:
Open High Low Close
0 132.960 133.340 132.940 133.105
1 133.110 133.255 132.710 132.755
2 132.755 132.985 132.640 132.735
3 132.730 132.790 132.575 132.685
4 132.685 132.785 132.625 132.755
I am trying to use the rolling.apply function on all rows, like this:
df['new_col']= df[['Open']].rolling(2).apply(AccumulativeSwingIndex(df['High'],df['Low'],df['Close']))
- this raises an error
or
df['new_col']= df[['Open', 'High', 'Low', 'Close']].rolling(2).apply(AccumulativeSwingIndex)
- this passes only the values from column 'Open'
Can anybody help me?
Answered by piRSquared
Define your own roll
We can create a function that takes a window size argument w and any other keyword arguments. We use this to build a new DataFrame on which we will call groupby, passing the keyword arguments along via kwargs.
Note: stride_tricks.as_strided is not strictly necessary here, but it is succinct and, in my opinion, appropriate.
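from numpy.lib.stride_tricks import as_strided as stride
import pandas as pd

def roll(df, w, **kwargs):
    # View the underlying 2-D array as overlapping windows of w rows each
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    # One small DataFrame per window, keyed by the window's first index label
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    # Group by the outer index level so that each group is one window
    return rolled_df.groupby(level=0, **kwargs)
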
roll(df, 2).mean()
Open High Low Close
0 133.0350 133.2975 132.8250 132.930
1 132.9325 133.1200 132.6750 132.745
2 132.7425 132.8875 132.6075 132.710
3 132.7075 132.7875 132.6000 132.720
We can also use the pandas.DataFrame.pipe method to the same effect:
df.pipe(roll, w=2).mean()
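As a minimal sketch of how the groups returned by roll can feed several columns into one function, the block below repeats the roll definition on the question's sample data and applies a hypothetical helper, window_range (not part of the original answer), which returns the highest High minus the lowest Low in each window.

```python
import pandas as pd
from numpy.lib.stride_tricks import as_strided as stride

# Sample data copied from the question
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def roll(df, w, **kwargs):
    # Same construction as in the answer: overlapping windows of w rows
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    return rolled_df.groupby(level=0, **kwargs)

def window_range(window_df):
    # Hypothetical example function: it sees every column of the window
    return window_df["High"].max() - window_df["Low"].min()

# One value per window, labelled by the window's first row
ranges = roll(df, 2).apply(window_range)
print(ranges)
```

Each call receives a small DataFrame holding one window, so any function that combines columns can be substituted for window_range.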
OLD ANSWER
Panel has been deprecated. See above for the updated answer.
See https://stackoverflow.com/a/37491779/2336654
Define our own roll
import numpy as np

def roll(df, w, **kwargs):
    # Only works on old pandas releases that still provide pd.Panel
    roll_array = np.dstack([df.values[i:i+w, :] for i in range(len(df.index) - w + 1)]).T
    panel = pd.Panel(roll_array,
                     items=df.index[w-1:],
                     major_axis=df.columns,
                     minor_axis=pd.Index(range(w), name='roll'))
    return panel.to_frame().unstack().T.groupby(level=0, **kwargs)

You should then be able to:
roll(df, 2).apply(your_function)
Using mean:
roll(df, 2).mean()
major Open High Low Close
1 133.0350 133.2975 132.8250 132.930
2 132.9325 133.1200 132.6750 132.745
3 132.7425 132.8875 132.6075 132.710
4 132.7075 132.7875 132.6000 132.720
f = lambda df: df.sum(1)
roll(df, 2, group_keys=False).apply(f)
roll
1 0 532.345
1 531.830
2 0 531.830
1 531.115
3 0 531.115
1 530.780
4 0 530.780
1 530.850
dtype: float64
Answered by aliciawyy
As your rolling window is not too large, I think you can also put the shifted rows in the same DataFrame and then use the apply function to reduce them.
For example, with the dataset df as follows:
Open High Low Close
Date
2017-11-07 258.97 259.3500 258.09 258.67
2017-11-08 258.47 259.2200 258.15 259.11
2017-11-09 257.73 258.3900 256.36 258.17
2017-11-10 257.73 258.2926 257.37 258.09
2017-11-13 257.31 258.5900 257.27 258.33
You can add the rolling data to this DataFrame with:
window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    # Row t of df.shift(i) holds the values from i rows earlier
    df_shifted = df.shift(i).copy()
    df_shifted.columns = ["{}-{}".format(s, i) for s in df.columns]
    df1 = df1.join(df_shifted)
df1
Open-0 High-0 Low-0 Close-0 Open-1 High-1 Low-1 Close-1
Date
2017-11-07 258.97 259.3500 258.09 258.67 NaN NaN NaN NaN
2017-11-08 258.47 259.2200 258.15 259.11 258.97 259.3500 258.09 258.67
2017-11-09 257.73 258.3900 256.36 258.17 258.47 259.2200 258.15 259.11
2017-11-10 257.73 258.2926 257.37 258.09 257.73 258.3900 256.36 258.17
2017-11-13 257.31 258.5900 257.27 258.33 257.73 258.2926 257.37 258.09
Each row now holds all the rolling data you need, so you can apply your function row by row with:
df1.apply(AccumulativeSwingIndex, axis=1)
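Since AccumulativeSwingIndex is not shown in the question, here is a sketch of the same pattern with a stand-in: a hypothetical true_range function that reads the current row's High-0 and Low-0 together with the previous row's Close-1. The function name and the choice of formula are assumptions made only for illustration.

```python
import pandas as pd

# Sample data copied from this answer
df = pd.DataFrame(
    {
        "Open":  [258.97, 258.47, 257.73, 257.73, 257.31],
        "High":  [259.35, 259.22, 258.39, 258.2926, 258.59],
        "Low":   [258.09, 258.15, 256.36, 257.37, 257.27],
        "Close": [258.67, 259.11, 258.17, 258.09, 258.33],
    },
    index=pd.DatetimeIndex(
        ["2017-11-07", "2017-11-08", "2017-11-09", "2017-11-10", "2017-11-13"],
        name="Date",
    ),
)

# Build the widened frame: columns suffixed -0 are the current row, -1 the previous one
window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    shifted = df.shift(i)
    shifted.columns = ["{}-{}".format(c, i) for c in df.columns]
    df1 = df1.join(shifted)

def true_range(row):
    # Hypothetical stand-in for AccumulativeSwingIndex
    prev_close = row["Close-1"]
    return max(row["High-0"], prev_close) - min(row["Low-0"], prev_close)

# Drop the first row, which has no previous values, then reduce row by row
result = df1.dropna().apply(true_range, axis=1)
print(result)
```

The function receives one Series per row, so it addresses values by the suffixed column names rather than by position within a window.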
Answered by Mike
If you're trying to apply the function to all rows of all columns:
df.rolling(size_of_your_window).apply(your_function_here)
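Note that this form calls the function once per column and per window, so the function never sees two columns at once. The sketch below illustrates this on the question's data with a hypothetical spread function, and then shows one commonly used workaround, assuming raw=False so that each window arrives as a Series carrying the original index labels; the helper names are illustrative only.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def spread(values):
    # With raw=True, values is a 1-D array taken from a single column
    return values.max() - values.min()

# Same shape as df: every column is processed independently
per_column = df.rolling(2).apply(spread, raw=True)

def high_low_range(close_window):
    # Use the window's index labels to look up the other columns in df
    rows = df.loc[close_window.index]
    return rows["High"].max() - rows["Low"].min()

# Roll over one column only, but read several columns inside the function
multi = df["Close"].rolling(2).apply(high_low_range, raw=False)
print(per_column)
print(multi)
```

The lookup through df.loc makes this slower than a vectorised expression, so it is best suited to small frames or to functions that cannot be vectorised.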
Answered by SO44
Try this to pass multiple columns to apply:
df['new_column'] = df.apply(lambda x: your_function(x['High'],x['Low'],x['Close']), axis=1)
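A small runnable sketch of this pattern on the question's data follows, with a hypothetical typical_price function standing in for your_function. Note that each call sees only the current row, so this passes several columns but does not by itself provide a rolling window.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Open":  [132.960, 133.110, 132.755, 132.730, 132.685],
    "High":  [133.340, 133.255, 132.985, 132.790, 132.785],
    "Low":   [132.940, 132.710, 132.640, 132.575, 132.625],
    "Close": [133.105, 132.755, 132.735, 132.685, 132.755],
})

def typical_price(high, low, close):
    # Hypothetical stand-in for your_function
    return (high + low + close) / 3.0

# axis=1 hands each row to the lambda as a Series indexed by column name
df["new_column"] = df.apply(
    lambda x: typical_price(x["High"], x["Low"], x["Close"]), axis=1
)
print(df)
```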