Aggregating time series in Python with pandas

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/29248280/

Time: 2020-09-13 23:05:50  Source: igfitidea

Aggregate time series in python

Tags: python, pandas, time-series, timeserieschart

Asked by PK10

How do we aggregate a time series at hourly or per-minute granularity? Given a time series like the following, I want the values aggregated by hour. Does pandas support this, or is there a neat way to do it in Python?

timestamp, value
2012-04-30T22:25:31+00:00, 1
2012-04-30T22:25:43+00:00, 1
2012-04-30T22:29:04+00:00, 2
2012-04-30T22:35:09+00:00, 4
2012-04-30T22:39:28+00:00, 1
2012-04-30T22:47:54+00:00, 8
2012-04-30T22:50:49+00:00, 9
2012-04-30T22:51:57+00:00, 1
2012-04-30T22:54:50+00:00, 1
2012-04-30T22:57:22+00:00, 0
2012-04-30T22:58:38+00:00, 7
2012-04-30T23:05:21+00:00, 1
2012-04-30T23:08:56+00:00, 1

I also tried to make sure I have the correct data types in my data frame by calling:

  print(data_frame.dtypes)

and I get the following output:

ts     datetime64[ns]
val             int64

When I call groupby on the data frame:

grouped = data_frame.groupby(lambda x: x.minute)

I get the following error:

grouped = data_frame.groupby(lambda x: x.minute)
AttributeError: 'int' object has no attribute 'minute'

Answered by grechut

See the DataFrame.resample method: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.resample.html. There you can specify how the values are aggregated; in your case, sum.

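# The how= keyword was deprecated in pandas 0.18 and later removed; call the aggregation as a method instead.
data_frame.resample("1Min").sum()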
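
As a rough end-to-end sketch of the resample approach, the snippet below rebuilds the question's sample data and sums it per hour and per minute. It assumes the timestamps have to be moved into the index first, since resample works on a datetime-like index by default. The column names come from the CSV header in the question, the variable names `indexed`, `hourly` and `per_minute` are only illustrative, and the aggregation is written in the method style (`.sum()`) used by current pandas versions.

```python
import io

import pandas as pd

# Sample rows copied from the question.
csv_text = """timestamp,value
2012-04-30T22:25:31+00:00,1
2012-04-30T22:25:43+00:00,1
2012-04-30T22:29:04+00:00,2
2012-04-30T22:35:09+00:00,4
2012-04-30T22:39:28+00:00,1
2012-04-30T22:47:54+00:00,8
2012-04-30T22:50:49+00:00,9
2012-04-30T22:51:57+00:00,1
2012-04-30T22:54:50+00:00,1
2012-04-30T22:57:22+00:00,0
2012-04-30T22:58:38+00:00,7
2012-04-30T23:05:21+00:00,1
2012-04-30T23:08:56+00:00,1
"""

data_frame = pd.read_csv(io.StringIO(csv_text))

# Parse the ISO 8601 strings into timezone-aware (UTC) timestamps.
data_frame["timestamp"] = pd.to_datetime(data_frame["timestamp"], utc=True)

# Move the timestamps into the index so resample can bucket on them.
indexed = data_frame.set_index("timestamp")

# One row per hour, values summed within each hour.
hourly = indexed.resample("60min").sum()

# One row per minute; minutes without any data show up with a sum of 0.
per_minute = indexed.resample("1min").sum()

print(hourly)
```

With this data the hourly result should have two rows, one for the 22:00 bucket and one for the 23:00 bucket. Other aggregations such as `.mean()` or `.count()` can be substituted for `.sum()` in the same way.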

http://pandas.pydata.org/pandas-docs/dev/timeseries.html#up-and-downsampling
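
On the AttributeError in the question: when groupby is given a function, pandas calls it on each index label, and the frame's default index holds plain integers, which explains the 'int' object message. Note also that `x.minute` would be the minute of the hour (0 to 59), not a time bucket. If you prefer groupby over resample, one possible sketch is to group on the timestamp column itself. The column names `ts` and `val` follow the dtypes output in the question, the four rows are a shortened sample, and the variable names are illustrative.

```python
import pandas as pd

# Shortened sample; column names follow the dtypes shown in the question.
data_frame = pd.DataFrame({
    "ts": pd.to_datetime([
        "2012-04-30 22:25:31",
        "2012-04-30 22:25:43",
        "2012-04-30 22:58:38",
        "2012-04-30 23:05:21",
    ]),
    "val": [1, 1, 7, 1],
})

# Option 1: round each timestamp down to the start of its hour and group on that.
by_hour = data_frame.groupby(data_frame["ts"].dt.floor("60min"))["val"].sum()

# Option 2: let pd.Grouper do the bucketing on the ts column.
by_hour_grouper = data_frame.groupby(pd.Grouper(key="ts", freq="60min"))["val"].sum()

print(by_hour)
```

Both options leave the original integer index in place, so no set_index call is needed. Changing the frequency string to "1min" gives per-minute buckets.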
