Reading a csv with dd.mm.yyyy in Python and Pandas

Question

Asked by RogerWilco77

I am reading a csv file with German date format. It seems to have worked fine in this post:

Picking dates from an imported CSV with pandas/python

However, it seems like in my case the date is not recognized as such. I could not find any wrong string in the test file.

import pandas as pd
import numpy as np


%matplotlib inline
import matplotlib.pyplot as plt

from matplotlib import style
from pandas import DataFrame

style.use('ggplot')

df = pd.read_csv('testdata.csv', dayfirst=True, parse_dates=True)
df[:5]

table

This results in:

screenshot

So the column with the dates is not recognized as such. What am I doing wrong here? Or is this date format simply not compatible?

OSX 10.10.3
Anaconda conda 3.13.0
Python 3.4.3-0
iPython notebook 3.1.0

Answer 1

Answered by unutbu

If you use parse_dates=True, then read_csv tries to parse the index as a date. Therefore, you would also need to declare the first column as the index with index_col=[0]:

In [216]: pd.read_csv('testdata.csv', dayfirst=True, parse_dates=True, index_col=[0])
Out[216]: 
            morgens  mittags  abends
Datum                               
2015-03-16      382      452     202
2015-03-17      288      467     192

Alternatively, if you don't want the Datum column to be an index, you could use parse_dates=[0] to explicitly tell read_csv to parse the first column as dates:

In [217]: pd.read_csv('testdata.csv', dayfirst=True, parse_dates=[0])
Out[217]: 
       Datum  morgens  mittags  abends
0 2015-03-16      382      452     202
1 2015-03-17      288      467     192
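
Both variants above can be tried without the original file. The following is a minimal sketch, assuming testdata.csv is comma-separated with the four columns shown in the output; the inline csv_text is a hypothetical stand-in, since the real file is not shown in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for testdata.csv; the real file's layout is an assumption.
csv_text = """Datum,morgens,mittags,abends
16.03.2015,382,452,202
17.03.2015,288,467,192
"""

# Variant 1: parse the first column as dates and keep it as a regular column.
as_column = pd.read_csv(io.StringIO(csv_text), dayfirst=True, parse_dates=[0])

# Variant 2: declare the first column as the index; parse_dates=True then parses the index.
as_index = pd.read_csv(
    io.StringIO(csv_text), dayfirst=True, parse_dates=True, index_col=[0]
)

print(as_column.dtypes)  # the Datum column should show a datetime64 dtype
print(as_index.index)    # the index should be a DatetimeIndex
```

If the Datum column still comes back as object dtype with a real file, the separator or a stray non-date value in that column would be the first things to check.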

Under the hood, read_csv uses dateutil.parser.parse to parse date strings:

In [218]: import dateutil.parser as DP

In [221]: DP.parse('16.03.2015', dayfirst=True)
Out[221]: datetime.datetime(2015, 3, 16, 0, 0)

Since dateutil.parser has no trouble parsing date strings in DD.MM.YYYY format, you don't have to declare a custom date parser here.
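
To illustrate why dayfirst matters for this format, here is a small sketch using dateutil directly. The ambiguous date string is a made-up example and does not come from the question's file.

```python
from datetime import datetime

import dateutil.parser as DP

# A made-up date where both numbers could be a month: 3 April or 4 March?
ambiguous = "03.04.2015"

day_first = DP.parse(ambiguous, dayfirst=True)  # read as day.month.year
month_first = DP.parse(ambiguous)               # default reads month first

# With a day above 12 there is only one valid reading, even without dayfirst.
unambiguous = DP.parse("16.03.2015")

print(day_first, month_first, unambiguous)
```

So a file that only contains days above 12 may appear to parse correctly without dayfirst=True, while rows from the first twelve days of a month would have day and month swapped.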

Answer 2

Answered by Ophir Yoktan

Use the date_parser parameter of read_csv to pass a custom date parsing function (a lambda that wraps strptime with the relevant date format).

pandas.read_csv

Answer 3

Answered by Aleksandr

Maybe this will help:

    from datetime import datetime as dt
    dtm = lambda x: dt.strptime(str(x), "%d.%m.%Y")
    df["Datum"] = df["Datum"].apply(dtm)
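
The row-by-row strptime call can also be written as a single pd.to_datetime call with an explicit format. The sketch below swaps in that technique and uses errors="coerce" so that strings not matching the format become NaT instead of raising, which may help locate a bad value in the file. The sample data, including the third, deliberately malformed entry, is hypothetical.

```python
import pandas as pd

# Hypothetical data; the last entry deliberately does not match dd.mm.yyyy.
df = pd.DataFrame({"Datum": ["16.03.2015", "17.03.2015", "2015-03-18"]})

# Parse the whole column at once; non-matching strings become NaT.
parsed = pd.to_datetime(df["Datum"], format="%d.%m.%Y", errors="coerce")

# Rows whose original string could not be parsed with the given format.
bad_rows = df[parsed.isna()]
print(bad_rows)

# Replace the string column with the parsed dates.
df["Datum"] = parsed
```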
