Read csv with dd.mm.yyyy in Python and Pandas
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/30833133/
Asked by RogerWilco77
I am reading a csv file with German date format. It seems to have worked fine in this post:
Picking dates from an imported CSV with pandas/python
However, in my case the date is not recognized as such. I could not find any malformed string in the test file.
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib import style
from pandas import DataFrame
style.use('ggplot')
df = pd.read_csv('testdata.csv', dayfirst=True, parse_dates=True)
df[:5]


This results in:


So the column with the dates is not recognized as such. What am I doing wrong here? Or is this date format simply not compatible?
- OSX 10.10.3
- Anaconda conda 3.13.0
- Python 3.4.3-0
- iPython notebook 3.1.0
Answered by unutbu
If you use parse_dates=True, then read_csv tries to parse the index as a date.
Therefore, you would also need to declare the first column as the index with index_col=[0]:
In [216]: pd.read_csv('testdata.csv', dayfirst=True, parse_dates=True, index_col=[0])
Out[216]:
            morgens  mittags  abends
Datum
2015-03-16      382      452     202
2015-03-17      288      467     192
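To see the difference concretely, here is a minimal sketch that reads the same data from an in-memory string instead of testdata.csv. The comma separator and the exact file contents are assumptions reconstructed from the output above.

```python
import io

import pandas as pd

# Assumed contents of testdata.csv (comma-separated; reconstructed from the output above).
CSV_TEXT = """Datum,morgens,mittags,abends
16.03.2015,382,452,202
17.03.2015,288,467,192
"""

# parse_dates=True alone: only the index is considered for date parsing,
# so the Datum column is left as plain strings.
raw = pd.read_csv(io.StringIO(CSV_TEXT), dayfirst=True, parse_dates=True)
print(raw.dtypes)

# With index_col=[0], Datum becomes the index and is parsed as dates.
indexed = pd.read_csv(
    io.StringIO(CSV_TEXT), dayfirst=True, parse_dates=True, index_col=[0]
)
print(type(indexed.index))
```

Checking `dtypes` (or the type of the index) right after reading is a quick way to confirm whether the dates were actually parsed.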
Alternatively, if you don't want the Datum column to be an index, you could use
parse_dates=[0] to explicitly tell read_csv to parse the first column as dates:
In [217]: pd.read_csv('testdata.csv', dayfirst=True, parse_dates=[0])
Out[217]:
       Datum  morgens  mittags  abends
0 2015-03-16      382      452     202
1 2015-03-17      288      467     192
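As a small illustration of why dayfirst=True matters for this format, the sketch below parses a column that contains an ambiguous date. The sample rows are made up for the example, and the comma separator is an assumption.

```python
import io

import pandas as pd

# Made-up sample data; 03.04.2015 is ambiguous without dayfirst (3 April vs. 4 March).
CSV_TEXT = """Datum,morgens
16.03.2015,382
03.04.2015,301
"""

# parse_dates=["Datum"] selects the column by name; parse_dates=[0] selects it by position.
df = pd.read_csv(io.StringIO(CSV_TEXT), dayfirst=True, parse_dates=["Datum"])

print(df.dtypes)             # Datum should now be a datetime column
print(df["Datum"].dt.month)  # months extracted from the parsed dates
```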
Under the hood, read_csv uses dateutil.parser.parse to parse date strings:
In [218]: import dateutil.parser as DP
In [221]: DP.parse('16.03.2015', dayfirst=True)
Out[221]: datetime.datetime(2015, 3, 16, 0, 0)
Since dateutil.parser has no trouble parsing date strings in DD.MM.YYYY format, you don't have to declare a custom date parser here.
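The effect of dayfirst on dateutil can be checked directly. This sketch compares an unambiguous string with an ambiguous one; the second date is a made-up example, not taken from the test file.

```python
from dateutil import parser

# 16 cannot be a month, so this string parses the same way with or without dayfirst.
unambiguous = parser.parse("16.03.2015")

# An ambiguous string is read month-first by default...
month_first = parser.parse("03.04.2015")

# ...and day-first when asked.
day_first = parser.parse("03.04.2015", dayfirst=True)

print(unambiguous, month_first, day_first)
```

So dayfirst only changes the result for strings where both leading numbers could be a month.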
Answered by Ophir Yoktan
Use the date_parser parameter of read_csv to pass a custom date parsing function (a lambda that wraps strptime with the relevant date format).
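A minimal sketch of such a function follows. Note that date_parser is deprecated as of pandas 2.0 (in favour of date_format), so the sketch passes the same strptime-based function through the converters argument instead; that is a swapped-in mechanism, not what the answer describes. The name parse_german_date, the sample data, and the comma separator are assumptions made for the example.

```python
import io
from datetime import datetime

import pandas as pd


def parse_german_date(text):
    """Parse a dd.mm.yyyy string; raises ValueError for any other layout."""
    return datetime.strptime(text, "%d.%m.%Y")


# Made-up sample data standing in for testdata.csv.
CSV_TEXT = """Datum,morgens
16.03.2015,382
17.03.2015,288
"""

# converters applies the function to every value of the named column while reading.
df = pd.read_csv(io.StringIO(CSV_TEXT), converters={"Datum": parse_german_date})
first_date = df["Datum"].iloc[0]
print(first_date)
```

Unlike the dateutil-based guessing, a strptime wrapper is strict: any value that is not in dd.mm.yyyy form raises an error instead of being silently reinterpreted.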
Answered by Aleksandr
Maybe this will help:
from datetime import datetime as dt
# Convert each dd.mm.yyyy string in the Datum column to a datetime object.
dtm = lambda x: dt.strptime(str(x), "%d.%m.%Y")
df["Datum"] = df["Datum"].apply(dtm)
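The same conversion can also be done without a Python-level loop. The sketch below uses pd.to_datetime with an explicit format, which is an alternative to the apply call above rather than part of the original answer; the small DataFrame is made up for illustration.

```python
import pandas as pd

# Made-up frame with the Datum column still stored as dd.mm.yyyy strings.
df = pd.DataFrame({"Datum": ["16.03.2015", "17.03.2015"], "morgens": [382, 288]})

# An explicit format avoids any day/month guessing and converts the whole column at once.
df["Datum"] = pd.to_datetime(df["Datum"], format="%d.%m.%Y")

print(df.dtypes)
```

With the default error handling, a string that does not match the format raises an error, which helps catch malformed rows early.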

