Pandas groupby with pct_change

Source: Stack Overflow, http://stackoverflow.com/questions/40273251/ (mirrored by igfitidea, 2020-09-14 02:17:34). Content is licensed under CC BY-SA 4.0: you are free to use and share it, but you must keep the same license and attribute it to the original authors.

python, pandas, numpy

Asked by user3357979

I'm trying to find the period-over-period growth in Value for each unique group, grouped by Company, Group, and Date.

Company Group Date     Value
A       X     2015-01  1
A       X     2015-02  2
A       X     2015-03  1.5
A       XX    2015-01  1
A       XX    2015-02  1.5
A       XX    2015-03  0.75
A       XX    2015-04  1
B       Y     2015-01  1
B       Y     2015-02  1.5
B       Y     2015-03  2
B       Y     2015-04  3
B       YY    2015-01  2
B       YY    2015-02  2.5
B       YY    2015-03  3

I've tried:

df.groupby(['Date','Company','Group']).pct_change()

but this returns all NaN.

The result I'm looking for is:

Company Group Date     Value/People
A       X     2015-01  NaN
A       X     2015-02  1.0
A       X     2015-03  -0.25
A       XX    2015-01  NaN
A       XX    2015-02  0.5
A       XX    2015-03  -0.5
A       XX    2015-04  0.33
B       Y     2015-01  NaN
B       Y     2015-02  0.5
B       Y     2015-03  0.33
B       Y     2015-04  0.5
B       YY    2015-01  NaN
B       YY    2015-02  0.25
B       YY    2015-03  0.2

Answered by piRSquared

You want to get your date into the row index and the company/group pairs into the columns:

d1 = df.set_index(['Date', 'Company', 'Group']).Value.unstack(['Company', 'Group'])
d1

Then use pct_change:

d1.pct_change()
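
For anyone who wants to run the two steps above end to end, here is a minimal sketch. It assumes the sample data from the question, typed in by hand; the names `rows` and `wide_pct` are mine, not from the answer.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    ("A", "X", "2015-01", 1.0), ("A", "X", "2015-02", 2.0),
    ("A", "X", "2015-03", 1.5),
    ("A", "XX", "2015-01", 1.0), ("A", "XX", "2015-02", 1.5),
    ("A", "XX", "2015-03", 0.75), ("A", "XX", "2015-04", 1.0),
    ("B", "Y", "2015-01", 1.0), ("B", "Y", "2015-02", 1.5),
    ("B", "Y", "2015-03", 2.0), ("B", "Y", "2015-04", 3.0),
    ("B", "YY", "2015-01", 2.0), ("B", "YY", "2015-02", 2.5),
    ("B", "YY", "2015-03", 3.0),
]
df = pd.DataFrame(rows, columns=["Company", "Group", "Date", "Value"])

# One row per Date, one column per (Company, Group) pair.
d1 = df.set_index(["Date", "Company", "Group"])["Value"].unstack(["Company", "Group"])

# pct_change compares each row with the row above it, column by column,
# so every (Company, Group) series is handled independently.
wide_pct = d1.pct_change()
print(wide_pct.round(2))
```

The first row is NaN for every column because there is no earlier period to compare against. Pairs that have no 2015-04 observation show up as missing in `d1`, and how `pct_change` treats those trailing gaps may depend on the pandas version.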

Or, with groupby:

df['pct'] = df.sort_values('Date').groupby(['Company', 'Group']).Value.pct_change()
df
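
A self-contained sketch of the groupby route, again assuming the question's sample data. It also shows why the attempt in the question came back all NaN: with Date among the grouping keys every group holds a single row, so there is never a previous value. The name `per_row` is only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    ("A", "X", "2015-01", 1.0), ("A", "X", "2015-02", 2.0),
    ("A", "X", "2015-03", 1.5),
    ("A", "XX", "2015-01", 1.0), ("A", "XX", "2015-02", 1.5),
    ("A", "XX", "2015-03", 0.75), ("A", "XX", "2015-04", 1.0),
    ("B", "Y", "2015-01", 1.0), ("B", "Y", "2015-02", 1.5),
    ("B", "Y", "2015-03", 2.0), ("B", "Y", "2015-04", 3.0),
    ("B", "YY", "2015-01", 2.0), ("B", "YY", "2015-02", 2.5),
    ("B", "YY", "2015-03", 3.0),
]
df = pd.DataFrame(rows, columns=["Company", "Group", "Date", "Value"])

# Grouping by Date as well leaves exactly one row in each group,
# so pct_change has nothing to compare against and returns NaN everywhere.
per_row = df.groupby(["Date", "Company", "Group"])["Value"].pct_change()

# Grouping by Company and Group only keeps each time series together.
# The result keeps the original row labels, so the assignment aligns.
df["pct"] = (
    df.sort_values("Date")
      .groupby(["Company", "Group"])["Value"]
      .pct_change()
)
print(df.round(2))
```

On the pandas versions I know of from 0.24 onward this should match the table the question asks for; older releases may behave differently.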

Answered by SimonR

I'm not sure the groupby method works as intended, at least as of Pandas 0.23.4.

df['pct'] = df.sort_values('Date').groupby(['Company', 'Group']).Value.pct_change()

This produces the following, which is incorrect for the purposes of the question:

[Image: Incorrect Outcome]
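
One way to check whether a given pandas install respects group boundaries is to compute the change by hand with a grouped shift and compare. This is only a sketch under the assumption that grouped `shift` is reliable on your version; `manual_pct`, `builtin_pct`, and `respects_groups` are illustrative names.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
rows = [
    ("A", "X", "2015-01", 1.0), ("A", "X", "2015-02", 2.0),
    ("A", "X", "2015-03", 1.5),
    ("A", "XX", "2015-01", 1.0), ("A", "XX", "2015-02", 1.5),
    ("A", "XX", "2015-03", 0.75), ("A", "XX", "2015-04", 1.0),
    ("B", "Y", "2015-01", 1.0), ("B", "Y", "2015-02", 1.5),
    ("B", "Y", "2015-03", 2.0), ("B", "Y", "2015-04", 3.0),
    ("B", "YY", "2015-01", 2.0), ("B", "YY", "2015-02", 2.5),
    ("B", "YY", "2015-03", 3.0),
]
df = pd.DataFrame(rows, columns=["Company", "Group", "Date", "Value"])

keys = ["Company", "Group"]
ordered = df.sort_values(keys + ["Date"])

# Previous value within the same (Company, Group); NaN on each group's first row.
prev = ordered.groupby(keys)["Value"].shift(1)
manual_pct = ordered["Value"] / prev - 1

# What the built-in grouped pct_change returns on this install.
builtin_pct = ordered.groupby(keys)["Value"].pct_change()

# True when the built-in result agrees with the by-hand calculation.
respects_groups = bool(np.allclose(manual_pct, builtin_pct, equal_nan=True))
print(respects_groups)
```

If this prints False, the by-hand `manual_pct` series can be assigned to a column instead of the built-in result.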

The Index+Stack method (set_index, then unstack) still works as intended, but you need an additional stack and merge to get the result back into the original form requested.

d1 = df.set_index(['Date', 'Company', 'Group']).Value.unstack(['Company', 'Group'])
d1 = d1.pct_change().stack([0,1]).reset_index()
df = df.merge(d1, on=['Company', 'Group', 'Date'], how='left')
df.rename(columns={0: 'pct'}, inplace=True)
df
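
Here is a runnable variant of the same idea, assuming the question's sample data. Instead of renaming column `0` after the merge, this sketch names the stacked values `pct` up front and stacks by level name; both are my substitutions, and `wide`, `long_pct`, and `result` are illustrative names.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    ("A", "X", "2015-01", 1.0), ("A", "X", "2015-02", 2.0),
    ("A", "X", "2015-03", 1.5),
    ("A", "XX", "2015-01", 1.0), ("A", "XX", "2015-02", 1.5),
    ("A", "XX", "2015-03", 0.75), ("A", "XX", "2015-04", 1.0),
    ("B", "Y", "2015-01", 1.0), ("B", "Y", "2015-02", 1.5),
    ("B", "Y", "2015-03", 2.0), ("B", "Y", "2015-04", 3.0),
    ("B", "YY", "2015-01", 2.0), ("B", "YY", "2015-02", 2.5),
    ("B", "YY", "2015-03", 3.0),
]
df = pd.DataFrame(rows, columns=["Company", "Group", "Date", "Value"])

# Wide layout: dates down the rows, (Company, Group) pairs across the columns.
wide = df.set_index(["Date", "Company", "Group"])["Value"].unstack(["Company", "Group"])

# Stack both column levels back into rows and name the values "pct".
long_pct = (
    wide.pct_change()
        .stack(["Company", "Group"])
        .rename("pct")
        .reset_index()
)

# A left merge keeps the original rows and their order; rows with no
# computed change (the first period of each pair) end up as NaN.
result = df.merge(long_pct, on=["Company", "Group", "Date"], how="left")
print(result.round(2))
```

Whether `stack` drops the NaN rows itself depends on the pandas version, but the left merge gives the same final table either way.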

[Image: Correct Outcome]