pandas: Convert date from Excel file to pandas
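Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use it, you must likewise follow the CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow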
Original URL: http://stackoverflow.com/questions/43023226/
Convert date from excel file to pandas
Asked by Arnold Klein
I'm importing an Excel file in which the 'Date' column contains dates written in different ways:
Date
13/03/2017
13/03/2017
13/03/2017
13/03/2017
10/3/17
10/3/17
9/3/17
9/3/17
9/3/17
9/3/17
Importing into pandas:
import pandas as pd

df = pd.read_excel('data_excel.xls')
df.Date = pd.to_datetime(df.Date)
results in:
Date
13/03/2017
64 13/03/2017
65 13/03/2017
66 13/03/2017
67 2017-10-03 00:00:00
68 2017-10-03 00:00:00
69 2017-09-03 00:00:00
70 2017-09-03 00:00:00
71 2017-09-03 00:00:00
72 2017-09-03 00:00:00
This means pandas did not parse the dates correctly; for example, the day and month are swapped in:
10/3/17 -> 2017-10-03
When I tried to specify the format:
df.Date = pd.to_datetime(df.Date, format='%d%m%Y')
I got the error:
ValueError: time data u'13/03/2017' does not match format '%d%m%Y' (match)
Question:
How do I properly import dates and times from the Excel file into pandas?
Answered by mechanical_meat
New answer:
Actually, pd.to_datetime has a dayfirst keyword argument that is useful here:
df.Date = pd.to_datetime(df.Date, dayfirst=True)
Result:
>>> df.Date
0 2017-03-13
1 2017-03-13
2 2017-03-13
3 2017-03-13
4 2017-03-10
5 2017-03-10
6 2017-03-09
7 2017-03-09
8 2017-03-09
9 2017-03-09
Name: Date, dtype: datetime64[ns]
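As a small self-contained check of the dayfirst behavior described above, here is a sketch that uses an in-memory Series in place of the Excel file. The sample values are copied from the question; the names raw and parsed are illustrative, not from the original answer. It parses each string separately, since how a mixed-format column is handled in a single to_datetime call may depend on the pandas version.

```python
import pandas as pd

# Sample values taken from the question's 'Date' column (stand-in for the Excel data)
raw = pd.Series(["13/03/2017", "10/3/17", "9/3/17"], name="Date")

# Parse each string on its own, telling pandas that the day comes first
parsed = raw.apply(lambda s: pd.to_datetime(s, dayfirst=True))

# Without dayfirst=True, an ambiguous string such as "10/3/17" is read month-first
month_first = pd.to_datetime("10/3/17")

print(parsed)
print(month_first)
```

Recent pandas versions (2.0 and later) also accept format='mixed' together with dayfirst=True for columns whose strings do not all share one layout; whether that is needed depends on the installed version.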
Old answer:
Use the third-party module dateutil, which can handle these kinds of variations. Its parser has a dayfirst keyword argument that is useful here:
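import dateutil.parser  # import the submodule explicitly
import pandas as pd

df = pd.read_excel('data_excel.xls')
# dayfirst=True makes '10/3/17' parse as 10 March 2017
df.Date = df.Date.apply(lambda x: dateutil.parser.parse(x, dayfirst=True))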
Result:
>>> df.Date
0 2017-03-13
1 2017-03-13
2 2017-03-13
3 2017-03-13
4 2017-03-10
5 2017-03-10
6 2017-03-09
7 2017-03-09
8 2017-03-09
9 2017-03-09
Name: Date, dtype: datetime64[ns]
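For completeness, the ValueError in the question comes from the format string itself: '%d%m%Y' contains no '/' separators, so it cannot match '13/03/2017'. A single explicit format also cannot cover both four-digit and two-digit years. The following is a minimal sketch, not part of the original answer, that tries two day-first formats in turn using only the standard library; the names FORMATS and parse_day_first are hypothetical.

```python
from datetime import datetime

# Day-first layouts seen in the question: four-digit year, then two-digit year
FORMATS = ("%d/%m/%Y", "%d/%m/%y")

def parse_day_first(text):
    """Try each known day-first layout in turn; raise if none matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised date string: {text!r}")

samples = ["13/03/2017", "10/3/17", "9/3/17"]
parsed = [parse_day_first(s) for s in samples]

for original, value in zip(samples, parsed):
    print(original, "->", value.date())
```

Assuming the column holds strings like these, the same function could be applied with df.Date.apply(parse_day_first). Unlike dayfirst=True, this approach rejects any string that matches neither layout instead of guessing.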