Change object type to datetime64[ns] - pandas

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/19764230/

Time: 2020-09-13 21:18:13  Source: igfitidea

Change object type into datetime64[ns] - pandas

python time pandas dataframe

Asked by Nilani Algiriyage

I'm analyzing web server log files, which contain timestamps in the following format:

02/Apr/2013:23:55:00 +0530

I'm converting this column to the pandas datetime type:

df['Time'] = pd.to_datetime(df['Time'])

But the column still has the object dtype:

print(df.dtypes)

Time    object

Why is it not changing to datetime64[ns]?

NumPy version:

In [2]: np.__version__
Out[2]: '1.8.0'

Accepted answer by alko

The following answer depends on your Python version.

Pandas' to_datetime can't recognize your custom datetime format; you should provide it explicitly:

>>> import pandas as pd
>>> from datetime import datetime
>>> df = pd.DataFrame({'Time':['02/Apr/2013:23:55:00 +0530']},index=['tst'])
>>> from functools import partial
>>> to_datetime_fmt = partial(pd.to_datetime, format='%d/%b/%Y:%H:%M:%S %z')

and apply this custom converter:

>>> df['Time'] = df['Time'].apply(to_datetime_fmt)
>>> df.dtypes
Time    datetime64[ns]
dtype: object
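
A minimal sketch of the same conversion without apply, assuming a reasonably recent pandas release in which pd.to_datetime accepts format and utc for a whole column. With utc=True the parsed values are normalized to UTC, so the column comes back timezone-aware rather than as plain datetime64[ns]. The one-row df simply mirrors the sample used above.

```python
import pandas as pd

# One-row sample mirroring the log timestamp from the question.
df = pd.DataFrame({'Time': ['02/Apr/2013:23:55:00 +0530']}, index=['tst'])

# Parse the whole column at once; utc=True converts every value to UTC.
df['Time'] = pd.to_datetime(df['Time'], format='%d/%b/%Y:%H:%M:%S %z', utc=True)

print(df['Time'])
print(df.dtypes)
```

Passing the column directly is usually faster than calling the converter once per row, and it keeps the offset information instead of silently discarding it.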

Note, however, that the %z directive is supported only from Python 3.2 onward; in earlier versions %z is unsupported, and you have to apply the offset manually with a timedelta.

>>> from datetime import timedelta
>>> df = pd.DataFrame({'Time':['02/Apr/2013:23:55:00 +0530']},index=['tst'])

Split the time into a datetime and an offset:

>>> def strptime_with_offset(string, format='%d/%b/%Y:%H:%M:%S'):
...    base_dt = datetime.strptime(string[:-6], format)  # local wall-clock time
...    sign = -1 if string[-5] == '-' else 1  # sign of the UTC offset
...    delta = timedelta(hours=int(string[-4:-2]), minutes=int(string[-2:]))
...    return base_dt - sign * delta  # UTC = local time minus the offset
...

and apply this conversion function:

>>> df['Time'] = df['Time'].apply(strptime_with_offset)
>>> df['Time']
tst   2013-04-02 18:25:00
Name: Time, dtype: datetime64[ns]
>>> df.dtypes
Time    datetime64[ns]
dtype: object
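
For comparison, a sketch that uses only the standard library on Python 3.2 or later, where %z is available to datetime.strptime. The helper name parse_log_time is made up for illustration. It returns the timestamp converted to UTC, so a value with a +0530 offset ends up 5 hours 30 minutes earlier than its local wall-clock time.

```python
from datetime import datetime, timezone

def parse_log_time(text, fmt='%d/%b/%Y:%H:%M:%S %z'):
    """Parse a log timestamp with a numeric UTC offset and convert it to UTC."""
    aware = datetime.strptime(text, fmt)   # timezone-aware datetime
    return aware.astimezone(timezone.utc)  # same instant, expressed in UTC

utc_time = parse_log_time('02/Apr/2013:23:55:00 +0530')
print(utc_time.isoformat())  # 2013-04-02T18:25:00+00:00
```

A function like this can be passed to df['Time'].apply in the same way as the converters above.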

Answer by Nilani Algiriyage

Apart from alko's approach, this code also worked fine:

from dateutil import parser

def parse(x):
    date, hh, mm, ss = x.split(':')
    dd, mo, yyyy = date.split('/')
    return parser.parse("%s %s %s %s:%s:%s" % (yyyy,mo,dd,hh,mm,ss))

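# Strip the first character and the last 7. This assumes bracketed raw log
# values such as '[02/Apr/2013:23:55:00 +0530]', so the UTC offset is dropped.
df['Time'] = df['Time'].apply(lambda x: x[1:-7])
# Convert each cleaned string with the parse() helper defined above.
df['Time'] = df['Time'].apply(parse)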
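
A hedged alternative sketch of the same idea without a row-wise parser. It assumes the raw column holds bracketed values such as '[02/Apr/2013:23:55:00 +0530]', which is inferred from the [1:-7] slice rather than stated in the answer. The slice is done with the vectorized .str accessor and the result is passed to pd.to_datetime with an explicit format. Like the slice above, this discards the UTC offset and yields naive local times.

```python
import pandas as pd

# Hypothetical raw values as they might appear in an access log.
df = pd.DataFrame({'Time': ['[02/Apr/2013:23:55:00 +0530]']})

# Drop the leading '[' and the trailing ' +0530]' (7 characters).
cleaned = df['Time'].str[1:-7]

# Parse the remaining 'dd/Mon/yyyy:HH:MM:SS' text; the offset is ignored.
df['Time'] = pd.to_datetime(cleaned, format='%d/%b/%Y:%H:%M:%S')

print(df['Time'])
```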