Parse_dates in Python Pandas
Original source: http://stackoverflow.com/questions/23797491/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Parse_dates in Pandas
Asked by user3576212
The following code fails to parse the date column of my CSV file into dates.
data = pd.read_csv('c:/data.csv', parse_dates=True, keep_date_col=True)
or
data = pd.read_csv('c:/data.csv', parse_dates=[0])
The data looks like this:
date value
30MAR1990 140000
30JUN1990 30000
30SEP1990 120000
30DEC1990 34555
What did I do wrong? Please help!
Thanks.
Accepted answer by Andy Hayden
This is a non-standard format, so the default parser does not recognize it. You can pass your own parser:
In [11]: import datetime as dt
In [12]: dt.datetime.strptime('30MAR1990', '%d%b%Y')
Out[12]: datetime.datetime(1990, 3, 30, 0, 0)
In [13]: parser = lambda date: dt.datetime.strptime(date, '%d%b%Y')
In [14]: pd.read_csv(StringIO(s), parse_dates=[0], date_parser=parser)
Out[14]:
date value
0 1990-03-30 140000
1 1990-06-30 30000
2 1990-09-30 120000
3 1990-12-30 34555
Another option is to use to_datetime after you've read in the strings:
df['date'] = pd.to_datetime(df['date'], format='%d%b%Y')
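As a minimal, self-contained sketch of the to_datetime approach: the snippet below reads the sample data from an in-memory string instead of a file. The name csv_text and the use of io.StringIO are assumptions made for illustration; in practice you would pass your file path to read_csv.

```python
from io import StringIO

import pandas as pd

# Sample data in the same layout as the question (held in memory for the demo).
csv_text = """date,value
30MAR1990,140000
30JUN1990,30000
30SEP1990,120000
30DEC1990,34555
"""

# Read the date column as plain strings first.
df = pd.read_csv(StringIO(csv_text))

# Convert afterwards with an explicit format:
# %d = day of month, %b = abbreviated month name, %Y = four-digit year.
# Note that %b matching depends on the locale's month abbreviations.
df['date'] = pd.to_datetime(df['date'], format='%d%b%Y')

print(df.dtypes)
```

Passing an explicit format avoids any guessing by the parser and makes the expected layout of the column visible in the code.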
Answer by TomAugspurger
You can use the date_parser argument to read_csv.
In [62]: from io import StringIO
In [63]: s = """date,value
30MAR1990,140000
30JUN1990,30000
30SEP1990,120000
30DEC1990,34555
"""
In [65]: import datetime
date_parser expects a function that will be called on an array of strings. Here, func calls datetime.datetime.strptime on each string. See the datetime module in the Python docs for more on the format codes.
In [66]: func = lambda dates: [datetime.datetime.strptime(x, '%d%b%Y') for x in dates]
In [68]: pd.read_csv(StringIO(s), parse_dates=['date'], date_parser=func)
Out[68]:
date value
0 1990-03-30 140000
1 1990-06-30 30000
2 1990-09-30 120000
3 1990-12-30 34555
[4 rows x 2 columns]
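To make the element-wise parsing concrete, here is a small sketch that uses only the standard library and mirrors what func does for each string. The helper name parse_dates_list is hypothetical and chosen for illustration; it is not part of pandas.

```python
import datetime


def parse_dates_list(strings):
    """Parse each 'DDMONYYYY' string (e.g. '30MAR1990') into a datetime.

    %d = day of month, %b = abbreviated month name, %Y = four-digit year.
    """
    return [datetime.datetime.strptime(s, '%d%b%Y') for s in strings]


parsed = parse_dates_list(['30MAR1990', '30JUN1990', '30SEP1990', '30DEC1990'])
print(parsed[0])  # 1990-03-30 00:00:00
```

Trying the format string on a few sample values this way is a quick check before handing a parser to read_csv.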