Original source: http://stackoverflow.com/questions/28990256/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
python pandas time series year extraction
Asked by user3861925
I have a DF containing timestamps:
0 2005-08-31 16:39:40
1 2005-12-28 16:00:34
2 2005-10-21 17:52:10
3 2014-01-28 12:23:15
4 2014-01-28 12:23:15
5 2011-02-04 18:32:34
6 2011-02-04 18:32:34
7 2011-02-04 18:32:34
I would like to extract the year from each timestamp, creating an additional column in the DF that would look like:
0 2005-08-31 16:39:40 2005
1 2005-12-28 16:00:34 2005
2 2005-10-21 17:52:10 2005
3 2014-01-28 12:23:15 2014
4 2014-01-28 12:23:15 2014
5 2011-02-04 18:32:34 2011
6 2011-02-04 18:32:34 2011
7 2011-02-04 18:32:34 2011
Obviously I can go over all DF entries, slicing off the first 4 characters of each date, but that is very slow. I wonder if there is a fast, Pythonic way to do this. I saw that it's possible to convert the column to the datetime format with DF = pd.to_datetime(DF,'%Y-%m-%d %H:%M:%S'), but when I then try to apply datetime.datetime.year(DF) it doesn't work. I will also need to parse the timestamps into months, year-month combinations and so on. Help please. Thanks.
Accepted answer by EdChum
No need to apply a function to each row; there is a new datetime accessor, dt, that you can call to access the year property:
In [35]:
df1['year'] = df1['timestamp'].dt.year
df1
Out[35]:
timestamp year
0 2005-08-31 16:39:40 2005
1 2005-12-28 16:00:34 2005
2 2005-10-21 17:52:10 2005
3 2014-01-28 12:23:15 2014
4 2014-01-28 12:23:15 2014
5 2011-02-04 18:32:34 2011
6 2011-02-04 18:32:34 2011
7 2011-02-04 18:32:34 2011
If your timestamps are str then you can convert them to datetime64 using pd.to_datetime:
df['timestamp'] = pd.to_datetime(df['timestamp'])
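For illustration, here is a minimal end-to-end sketch of the conversion followed by the year extraction. The small frame is an assumption, built from a few of the question's sample values. The explicit format is optional and is passed by keyword here, rather than positionally as in the question's attempt.

```python
import pandas as pd

# Illustrative frame: timestamps stored as plain strings (sample values from the question).
df = pd.DataFrame({
    "timestamp": [
        "2005-08-31 16:39:40",
        "2014-01-28 12:23:15",
        "2011-02-04 18:32:34",
    ]
})

# Convert the string column to datetime64; the format keyword is optional
# but makes the expected layout explicit.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d %H:%M:%S")

# With a datetime64 column, the .dt accessor exposes the year for every row at once.
df["year"] = df["timestamp"].dt.year
print(df)
```

Once the column is datetime64, no per-row Python function is needed to pull out date components.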
You can access the months and other attributes using dt like the above.
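Since the question also asks about months and year-month combinations, here is a hedged sketch of how those might be derived with the same accessor. The frame and the new column names (month, year_month, year_month_str) are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Illustrative frame with an already-converted datetime64 column.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2005-08-31 16:39:40",
        "2014-01-28 12:23:15",
        "2011-02-04 18:32:34",
    ])
})

# Month number (1-12) for each row.
df["month"] = df["timestamp"].dt.month

# Year-month combination as a monthly Period, handy for grouping.
df["year_month"] = df["timestamp"].dt.to_period("M")

# The same combination as a plain string such as "2005-08".
df["year_month_str"] = df["timestamp"].dt.strftime("%Y-%m")
print(df)
```

The Period column keeps date semantics (it sorts and groups by calendar month), while the string column is only a label; which one fits depends on what is done with it afterwards.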
For versions prior to 0.15.0 you can perform the following:
df1['year'] = df1['timestamp'].apply(lambda x: x.year)