Pandas dataframe group by multiple columns

Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/46408971/

Date: 2020-09-14 04:31:50  Source: igfitidea


python pandas

Asked by DougKruger

Given a dataframe with two datetime columns A and B and a numeric column C, how can I group by the month of both A and B and compute sum(C)? For example:

In [1]: df
Out[1]: 

      A           B            C  
0  2013-01-01  2013-01-01  0.282863 
1  2013-01-02  2013-01-01  0.173215 
2  2013-02-03  2013-02-04  2.104569 
3  2013-02-09  2013-04-15  0.706771 
4  2013-03-05  2013-08-01  0.567020 
5  2013-03-06  2013-04-01  0.113648

Answered by YOBEN_S

Use groupby:

df.groupby([df.A.dt.month,df.B.dt.month]).C.sum()

Out[954]: 
A  B
1  1    0.456078
2  2    2.104569
   4    0.706771
3  4    0.113648
   8    0.567020
Name: C, dtype: float64
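For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the sample frame from the question and runs the same groupby. Note that grouping on dt.month uses the month number only, so the same month from different years would land in one group; that is harmless here because all dates are in 2013.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": pd.to_datetime(["2013-01-01", "2013-01-02", "2013-02-03",
                         "2013-02-09", "2013-03-05", "2013-03-06"]),
    "B": pd.to_datetime(["2013-01-01", "2013-01-01", "2013-02-04",
                         "2013-04-15", "2013-08-01", "2013-04-01"]),
    "C": [0.282863, 0.173215, 2.104569, 0.706771, 0.567020, 0.113648],
})

# Group by the month number of A and of B, then sum C within each pair
monthly = df.groupby([df["A"].dt.month, df["B"].dt.month])["C"].sum()
print(monthly)
```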

Note: this requires A and B to have a datetime dtype. If they do not, convert them before calling groupby:

df.A=pd.to_datetime(df.A)
df.B=pd.to_datetime(df.B) 
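If you are not sure whether the columns are already datetimes, one option is to check the dtype first and convert only when needed. This is a sketch; the frame name raw and its two rows are made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical frame where the dates arrived as plain strings
raw = pd.DataFrame({
    "A": ["2013-01-01", "2013-02-09"],
    "B": ["2013-01-01", "2013-04-15"],
    "C": [0.282863, 0.706771],
})

# Convert only the columns that are not already datetime
for col in ("A", "B"):
    if not is_datetime64_any_dtype(raw[col]):
        raw[col] = pd.to_datetime(raw[col])

print(raw.dtypes)
```

Without the conversion, the .dt accessor raises an AttributeError on string columns.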

Answered by Yale Newman

I recently read about pd.Grouper, which makes grouping by dates very easy.

df.A=pd.to_datetime(df.A)
df.B=pd.to_datetime(df.B)

df.groupby([pd.Grouper(key='A', freq='M'), pd.Grouper(key='B', freq='M')])['C'].sum()

The number of options this opens up makes it worth looking into:

Source: http://pbpython.com/pandas-grouper-agg.html

Different date aliases: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
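A runnable sketch of the pd.Grouper approach on the question's data follows. Two things differ from the dt.month answer: the groups are calendar months (year and month together), and the index labels are timestamps rather than month numbers. The "MS" (month start) alias is my substitution, not the answer's, chosen because recent pandas versions deprecate the bare "M" alias in favor of "ME".

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": pd.to_datetime(["2013-01-01", "2013-01-02", "2013-02-03",
                         "2013-02-09", "2013-03-05", "2013-03-06"]),
    "B": pd.to_datetime(["2013-01-01", "2013-01-01", "2013-02-04",
                         "2013-04-15", "2013-08-01", "2013-04-01"]),
    "C": [0.282863, 0.173215, 2.104569, 0.706771, 0.567020, 0.113648],
})

# One Grouper per datetime column; "MS" labels each bin by the month's first day
by_month = df.groupby(
    [pd.Grouper(key="A", freq="MS"), pd.Grouper(key="B", freq="MS")]
)["C"].sum()
print(by_month)
```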

Answered by A.Kot

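# Extract the month number of each date into helper columns
df['month_A'] = pd.to_datetime(df.A).dt.month
df['month_B'] = pd.to_datetime(df.B).dt.month

# Select C before summing; the datetime columns A and B cannot be summed
df.groupby(['month_A', 'month_B'])['C'].sum()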
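If you would rather not add helper columns to df, a variation is to rename the month series inline and flatten the result with reset_index. This is only a sketch; the names month_A and month_B are carried over from the answer above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": pd.to_datetime(["2013-01-01", "2013-01-02", "2013-02-03",
                         "2013-02-09", "2013-03-05", "2013-03-06"]),
    "B": pd.to_datetime(["2013-01-01", "2013-01-01", "2013-02-04",
                         "2013-04-15", "2013-08-01", "2013-04-01"]),
    "C": [0.282863, 0.173215, 2.104569, 0.706771, 0.567020, 0.113648],
})

# Rename the key series so the output columns get readable names,
# then turn the MultiIndex back into ordinary columns
flat = (
    df.groupby([df["A"].dt.month.rename("month_A"),
                df["B"].dt.month.rename("month_B")])["C"]
      .sum()
      .reset_index()
)
print(flat)
```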

Answered by Rockbar

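If you combine the groupby with transform as follows, you get rows back with their original values in the A and B columns. Note that the filter keeps only rows whose C equals the sum of their group, which in practice means rows that are alone in their (month of A, month of B) group: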

idsum = df.groupby([df.A.dt.month, df.B.dt.month])["C"].transform("sum") == df["C"]
df = df[idsum]
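To see what transform is doing here, it can help to store the group sum as its own column first. In this sketch the column name C_group_sum is made up for illustration; every original row is kept, and the mask shows which rows the filter above would retain.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": pd.to_datetime(["2013-01-01", "2013-01-02", "2013-02-03",
                         "2013-02-09", "2013-03-05", "2013-03-06"]),
    "B": pd.to_datetime(["2013-01-01", "2013-01-01", "2013-02-04",
                         "2013-04-15", "2013-08-01", "2013-04-01"]),
    "C": [0.282863, 0.173215, 2.104569, 0.706771, 0.567020, 0.113648],
})

# transform("sum") broadcasts each group's total back onto its rows
df["C_group_sum"] = (
    df.groupby([df["A"].dt.month, df["B"].dt.month])["C"].transform("sum")
)

# Rows whose own C already equals the group total (single-row groups here)
mask = df["C_group_sum"] == df["C"]
print(df)
print(df[mask])
```

With the sample data, the two January rows share one group, so the mask drops them and keeps the other four rows.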