Python, summarize daily data in a pandas dataframe to monthly and quarterly
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address, and attribute it to the original authors (not me): StackOverFlow
Original address: http://stackoverflow.com/questions/40554396/
Asked by Windtalker
I have already loaded my data into a pandas dataframe.
Example:
Date Price
2012/12/02 141.25
2012/12/05 132.64
2012/12/06 132.11
2012/12/21 141.64
2012/12/25 143.19
2012/12/31 139.66
2013/01/05 145.11
2013/01/06 145.99
2013/01/07 145.97
2013/01/11 145.11
2013/01/12 145.99
2013/01/24 145.97
2013/02/23 145.11
2013/03/24 145.99
2013/03/28 145.97
2013/04/28 145.97
2013/05/24 145.97
2013/06/23 145.11
2013/07/24 145.99
2013/08/28 145.97
2013/09/28 145.97
There are just two columns: one is the date and the other is the price.
How can I group or resample the data, starting from 2013, into monthly and quarterly dataframes?
Monthly:
Date Price
2013/01/01 Monthly total
2013/02/01 Monthly total
2013/03/01 Monthly total
2013/04/01 Monthly total
2013/05/01 Monthly total
2013/06/01 Monthly total
2013/07/01 Monthly total
2013/08/01 Monthly total
2013/09/01 Monthly total
Quarterly:
Date Price
2013/01/01 Quarterly total
2013/04/01 Quarterly total
2013/07/01 Quarterly total
Please note that the monthly and quarterly data need to start from the first day of the month, but in the original dataframe the rows for the first day of the month are missing, and the number of valid daily rows in each month can vary. Also, the original dataframe has data from 2012 to 2013; I only need monthly and quarterly data from the beginning of 2013.
I tried something like:
result1 = df.groupby([lambda x: x.year, lambda x: x.month], axis=1).sum()
but it does not work.
Thank you!
Answered by Boud
First convert your Date column into a datetime index:
df.Date = pd.to_datetime(df.Date)
df.set_index('Date', inplace=True)
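As a self-contained illustration of this step, here is a minimal sketch that rebuilds a few rows of the question's sample and converts the Date column into a datetime index. The inline text, the read_csv call, and the explicit format string are assumptions for illustration; the answer itself only shows the two lines above.

```python
import io

import pandas as pd

# A few rows from the question's sample, whitespace-separated (illustrative subset).
raw = """Date Price
2012/12/31 139.66
2013/01/05 145.11
2013/01/06 145.99
2013/02/23 145.11
"""

df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Parse the text dates with an explicit format (assumed from the sample),
# then use them as the index so that time-based resampling is available.
df["Date"] = pd.to_datetime(df["Date"], format="%Y/%m/%d")
df = df.set_index("Date")

print(type(df.index).__name__)
```

Assigning the result of set_index back to df, instead of using inplace=True, is only a style choice here; both leave df indexed by date.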
Then use resample. The list of offset aliases is in the pandas documentation. To resample to the beginning of each month, use MS, and use QS for the beginning of each quarter:
df.resample('QS').sum()
Out[46]:
Price
Date
2012-10-01 830.49
2013-01-01 1311.21
2013-04-01 437.05
2013-07-01 437.93
df.resample('MS').sum()
Out[47]:
Price
Date
2012-12-01 830.49
2013-01-01 874.14
2013-02-01 145.11
2013-03-01 291.96
2013-04-01 145.97
2013-05-01 145.97
2013-06-01 145.11
2013-07-01 145.99
2013-08-01 145.97
2013-09-01 145.97
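The question asks for totals starting from 2013 only, while the output above still includes the 2012 rows. One possible way to handle that, offered as a sketch and not as part of the original answer, is to slice the datetime index before resampling. The variable names (df_2013, monthly, quarterly) are illustrative, and the data is the question's sample rebuilt inline so the snippet runs on its own.

```python
import pandas as pd

# The question's sample data, rebuilt inline.
dates = [
    "2012/12/02", "2012/12/05", "2012/12/06", "2012/12/21", "2012/12/25",
    "2012/12/31", "2013/01/05", "2013/01/06", "2013/01/07", "2013/01/11",
    "2013/01/12", "2013/01/24", "2013/02/23", "2013/03/24", "2013/03/28",
    "2013/04/28", "2013/05/24", "2013/06/23", "2013/07/24", "2013/08/28",
    "2013/09/28",
]
prices = [
    141.25, 132.64, 132.11, 141.64, 143.19,
    139.66, 145.11, 145.99, 145.97, 145.11,
    145.99, 145.97, 145.11, 145.99, 145.97,
    145.97, 145.97, 145.11, 145.99, 145.97,
    145.97,
]
df = pd.DataFrame(
    {"Price": prices},
    index=pd.to_datetime(dates, format="%Y/%m/%d"),
)
df.index.name = "Date"

# Partial-string slicing on a sorted DatetimeIndex keeps rows from 2013-01-01 onward.
df_2013 = df.loc["2013":]

# MS and QS label each bin with the first day of the month or quarter,
# even when no row exists for that exact day.
monthly = df_2013.resample("MS").sum()
quarterly = df_2013.resample("QS").sum()

print(monthly)
print(quarterly)
```

Slicing first means the 2012-12-01 and 2012-10-01 bins never appear in the results. Filtering after resampling, for example monthly.loc["2013":] on the unfiltered result, should give the same totals for this sample.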