pandas.to_datetime inconsistent time string format

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/15929861/

Tags: python, datetime, pandas

Asked by random.me

I am attempting to convert the index of a pandas.DataFrame from string format to a datetime index, using pandas.to_datetime().

Import pandas:

In [1]: import pandas as pd

In [2]: pd.__version__
Out[2]: '0.10.1'

Create an example DataFrame:

In [3]: d = {'data' : pd.Series([1.,2.], index=['26/12/2012', '10/01/2013'])}

In [4]: df=pd.DataFrame(d)

Look at the index. Note that the date format is day/month/year:

In [5]: df.index
Out[5]: Index([26/12/2012, 10/01/2013], dtype=object)

Convert index to datetime:

In [6]: pd.to_datetime(df.index)
Out[6]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-12-26 00:00:00, 2013-10-01 00:00:00]
Length: 2, Freq: None, Timezone: None

Already at this stage, you can see that the two entries have been parsed differently. The first is fine, but the second has its month and day swapped.

This is what I want to write, but avoiding the inconsistent formatting of date strings:

In [7]: df.set_index(pd.to_datetime(df.index))
Out[7]: 
data
2012-12-26   1
2013-10-01   2

I guess the first entry is correct because the function 'knows' there aren't 26 months, and so does not choose the default month/day/year format.
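
To make that guess concrete, here is a minimal sketch using only the standard library. It is not pandas's actual implementation; the helper name parse_month_first_with_fallback is hypothetical and only mimics the behaviour described above (try month/day/year first, fall back to day/month/year when that is impossible).

```python
from datetime import datetime


def parse_month_first_with_fallback(text):
    """Hypothetical helper: try month/day/year, then day/month/year."""
    try:
        # "26/12/2012" fails here because there is no month 26.
        return datetime.strptime(text, "%m/%d/%Y")
    except ValueError:
        # Only strings that cannot be month-first end up parsed day-first.
        return datetime.strptime(text, "%d/%m/%Y")


parsed = [parse_month_first_with_fallback(s) for s in ["26/12/2012", "10/01/2013"]]
for value in parsed:
    print(value.date())
```

The second string is a valid month-first date, so it never reaches the fallback, which would explain why only one of the two entries comes out as intended.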

Is there another/better way to do this? Can I pass the format into the to_datetime() function?

Thank you.

EDIT:

I have found a way to do this, without pandas.to_datetime:

from datetime import datetime as dt  # datetime is a class, so "import datetime.datetime" fails
date_string_list = df.index.tolist()
# Parse every string explicitly as day/month/year
datetime_list = [dt.strptime(s, '%d/%m/%Y') for s in date_string_list]
df.index = datetime_list

but it's a bit messy. Any improvements welcome.

Accepted answer by Andy Hayden

There is a (hidden?) dayfirst argument to to_datetime:

In [23]: pd.to_datetime(df.index, dayfirst=True)
Out[23]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-12-26 00:00:00, 2013-01-10 00:00:00]
Length: 2, Freq: None, Timezone: None
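
As a rough sketch of how this could be applied to the DataFrame from the question (assuming a reasonably recent pandas is installed, where the printed repr will look different from the 0.10.1 output above):

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({"data": [1.0, 2.0]}, index=["26/12/2012", "10/01/2013"])

# dayfirst=True asks the parser to read ambiguous strings as day/month/year.
df.index = pd.to_datetime(df.index, dayfirst=True)
print(df)
```

Assigning to df.index replaces the string index in place; df.set_index(...) as in the question would return a new DataFrame instead.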

In pandas 0.11 (onwards) you'll be able to use the format argument:

In [24]: pd.to_datetime(df.index, format='%d/%m/%Y')
Out[24]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-12-26 00:00:00, 2013-01-10 00:00:00]
Length: 2, Freq: None, Timezone: None
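
A short sketch of the same idea written the way the question wanted it, with set_index. It assumes a pandas version that supports format (0.11 or later per the answer), and the variable names converted and mismatch_rejected are only illustrative. With an explicit format nothing is guessed, and a string in a different layout should be rejected rather than silently reinterpreted.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({"data": [1.0, 2.0]}, index=["26/12/2012", "10/01/2013"])

# Every index string must match day/month/year exactly.
converted = df.set_index(pd.to_datetime(df.index, format="%d/%m/%Y"))
print(converted)

# A string in another layout does not match the format and raises ValueError
# (with the default error handling) instead of being parsed some other way.
try:
    pd.to_datetime(["2013-01-10"], format="%d/%m/%Y")
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```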