Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/28990256/

Time: 2020-08-19 03:58:51  Source: igfitidea

python pandas time series year extraction

Tags: python, parsing, datetime, pandas, dataframe

Asked by user3861925

I have a DF containing timestamps:

0     2005-08-31 16:39:40
1     2005-12-28 16:00:34
2     2005-10-21 17:52:10
3     2014-01-28 12:23:15
4     2014-01-28 12:23:15
5     2011-02-04 18:32:34
6     2011-02-04 18:32:34
7     2011-02-04 18:32:34

I would like to extract the year from each timestamp, creating an additional column in the DF that would look like this:

0     2005-08-31 16:39:40 2005
1     2005-12-28 16:00:34 2005
2     2005-10-21 17:52:10 2005
3     2014-01-28 12:23:15 2014
4     2014-01-28 12:23:15 2014
5     2011-02-04 18:32:34 2011
6     2011-02-04 18:32:34 2011
7     2011-02-04 18:32:34 2011

Obviously I can go over all DF entries and take the first 4 characters of each date, but that is very slow. I wonder if there is a fast, Pythonic way to do this. I saw that it's possible to convert the column to datetime format with DF = pd.to_datetime(DF,'%Y-%m-%d %H:%M:%S'), but when I then try to apply datetime.datetime.year(DF), it doesn't work. I will also need to parse the timestamps into months, year-month combinations, and so on. Help please. Thanks.

Accepted answer by EdChum

There is no need to apply a function to each row; there is a new datetime accessor, dt, that you can use to access the year property:

In [35]:

df1['year'] = df1['timestamp'].dt.year
df1
Out[35]:
            timestamp  year
0 2005-08-31 16:39:40  2005
1 2005-12-28 16:00:34  2005
2 2005-10-21 17:52:10  2005
3 2014-01-28 12:23:15  2014
4 2014-01-28 12:23:15  2014
5 2011-02-04 18:32:34  2011
6 2011-02-04 18:32:34  2011
7 2011-02-04 18:32:34  2011

If your timestamps are str, you can convert them to datetime64 using pd.to_datetime:

df['timestamp'] = pd.to_datetime(df['timestamp'])
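
As a rough end-to-end sketch of the conversion above, the snippet below builds a small frame from string timestamps, converts the column, and then reads the year through dt. The frame contents and the explicit format string are assumptions chosen to match the sample data in the question; format is passed by keyword here, and it can usually be omitted for ISO-like strings.

```python
import pandas as pd

# Hypothetical sample frame; the values mirror the timestamps in the question.
df = pd.DataFrame(
    {"timestamp": ["2005-08-31 16:39:40", "2014-01-28 12:23:15", "2011-02-04 18:32:34"]}
)

# Convert the string column to datetime64; format is given by keyword.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d %H:%M:%S")

# Vectorised year extraction through the dt accessor.
df["year"] = df["timestamp"].dt.year
print(df)
```

If some strings might not match the format, pd.to_datetime also accepts errors="coerce", which turns unparseable values into NaT instead of raising.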

You can access the months and other attributes using dt in the same way as above.
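
For the months and year-month combinations mentioned in the question, a minimal sketch might look like the following. The names ts and parts are illustrative only, and the two year-month columns show two possible representations (a monthly Period and a plain string), not a recommendation from the original answer.

```python
import pandas as pd

# Hypothetical series of timestamps, already parsed to datetime64.
ts = pd.to_datetime(pd.Series(["2005-08-31 16:39:40", "2014-01-28 12:23:15"]))

parts = pd.DataFrame(
    {
        "timestamp": ts,
        "year": ts.dt.year,                         # integer year
        "month": ts.dt.month,                       # integer month, 1-12
        "year_month": ts.dt.to_period("M"),         # monthly Period values
        "year_month_str": ts.dt.strftime("%Y-%m"),  # year-month as a string
    }
)
print(parts)
```

The Period column keeps date semantics (it sorts and groups chronologically), while the string column is convenient for labels or export.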

For versions prior to 0.15.0, you can do the following:

df1['year'] = df1['timestamp'].apply(lambda x: x.year)