Notice: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/44854512/

Time: 2020-09-14 03:54:38  Source: igfitidea

How to calculate vwap (volume weighted average price) using groupby and apply?

Tags: python, pandas, lambda, pandas-groupby

Asked by cJc

I have read multiple posts similar to my question, but I still can't figure it out. I have a pandas df that looks like the following (for multiple days):

Out[1]: 
                     price  quantity
time                                
2016-06-08 09:00:22  32.30    1960.0
2016-06-08 09:00:22  32.30     142.0
2016-06-08 09:00:22  32.30    3857.0
2016-06-08 09:00:22  32.30    1000.0
2016-06-08 09:00:22  32.35     991.0
2016-06-08 09:00:22  32.30     447.0
...

To calculate the vwap I could do:

df['vwap'] = (np.cumsum(df.quantity * df.price) / np.cumsum(df.quantity))
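For anyone who wants to reproduce the one-liner above, here is a minimal self-contained sketch. It rebuilds the six sample rows from the printout (the construction of `df` is my assumption, since the question does not show how the frame was created) and uses the `Series.cumsum` method, which should give the same result as `np.cumsum` here.

```python
import pandas as pd

# Rebuild the six sample rows from the question's printout.
# How the original frame was created is not shown, so this is an assumption.
idx = pd.DatetimeIndex(["2016-06-08 09:00:22"] * 6, name="time")
df = pd.DataFrame(
    {
        "price": [32.30, 32.30, 32.30, 32.30, 32.35, 32.30],
        "quantity": [1960.0, 142.0, 3857.0, 1000.0, 991.0, 447.0],
    },
    index=idx,
)

# Running VWAP over the whole frame: cumulative traded value
# divided by cumulative volume. This does NOT restart per day.
df["vwap"] = (df["quantity"] * df["price"]).cumsum() / df["quantity"].cumsum()
print(df)
```

The first four rows all trade at 32.30, so the running VWAP only moves once the 32.35 trade arrives.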

However, I would like to start over every day (groupby), but I can't figure out how to make it work with a (lambda?) function.

df['vwap_day'] = df.groupby(df.index.date)['vwap'].apply(lambda ...

Speed is of the essence. I would appreciate any help :)

Answered by piRSquared

Option 0
plain vanilla approach

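def vwap(df):
    # Pull out NumPy arrays so the arithmetic below runs on plain arrays
    q = df.quantity.values
    p = df.price.values
    # Running traded value divided by running volume
    return df.assign(vwap=(p * q).cumsum() / q.cumsum())

# Apply per calendar day so the running sums restart each day
df = df.groupby(df.index.date, group_keys=False).apply(vwap)
df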

                     price  quantity       vwap
time                                           
2016-06-08 09:00:22  32.30    1960.0  32.300000
2016-06-08 09:00:22  32.30     142.0  32.300000
2016-06-08 09:00:22  32.30    3857.0  32.300000
2016-06-08 09:00:22  32.30    1000.0  32.300000
2016-06-08 09:00:22  32.35     991.0  32.306233
2016-06-08 09:00:22  32.30     447.0  32.305901

Option 1
Throwing in a little eval

# Add a price * quantity column, take per-day running sums, then divide
df = df.assign(
    vwap=df.eval(
        'wgtd = price * quantity', inplace=False
    ).groupby(df.index.date).cumsum().eval('wgtd / quantity')
)
df

                     price  quantity       vwap
time                                           
2016-06-08 09:00:22  32.30    1960.0  32.300000
2016-06-08 09:00:22  32.30     142.0  32.300000
2016-06-08 09:00:22  32.30    3857.0  32.300000
2016-06-08 09:00:22  32.30    1000.0  32.300000
2016-06-08 09:00:22  32.35     991.0  32.306233
2016-06-08 09:00:22  32.30     447.0  32.305901
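
The sample output above covers a single day, so it cannot show the daily reset. The following is a minimal sketch on a made-up two-day frame (the timestamps, prices and quantities are invented for illustration). It uses two grouped `cumsum` calls instead of `apply` or `eval`; this is a variant of the same idea, not the answerer's code, and I have not benchmarked it against the options above.

```python
import pandas as pd

# Hypothetical two-day sample; the values are invented for illustration.
idx = pd.DatetimeIndex(
    [
        "2016-06-08 09:00:22",
        "2016-06-08 09:00:25",
        "2016-06-09 09:00:01",
        "2016-06-09 09:00:03",
    ],
    name="time",
)
df = pd.DataFrame(
    {"price": [10.0, 20.0, 30.0, 40.0], "quantity": [1.0, 1.0, 1.0, 3.0]},
    index=idx,
)

# One datetime.date per row, used as the group key.
day = df.index.date

# Per-day running traded value and per-day running volume.
cum_value = (df["price"] * df["quantity"]).groupby(day).cumsum()
cum_qty = df["quantity"].groupby(day).cumsum()

# The ratio restarts at the first trade of each day.
df["vwap_day"] = cum_value / cum_qty
print(df)
```

On the second day the VWAP starts again from that day's first price (30.0) rather than carrying over the totals from the first day.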