to_datetime Value Error: at least that [year, month, day] must be specified Pandas

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/39992411/

Date: 2020-09-14 02:11:12  Source: igfitidea

python pandas csv datetime

Asked by Jed

I am reading from two different CSVs, each with date values in its columns. After read_csv, I want to convert the data to datetime with the to_datetime method. The date formats in the two CSVs differ slightly, and although the differences are specified in the to_datetime format argument, one converts fine while the other raises the following ValueError.

ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing

The first, dte.head():

0  10/14/2016  10/17/2016  10/19/2016    8/9/2016  10/17/2016   7/20/2016
1   7/15/2016   7/18/2016   7/20/2016    6/7/2016   7/18/2016   4/19/2016
2   4/15/2016   4/14/2016   4/18/2016   3/15/2016   4/18/2016   1/14/2016
3   1/15/2016   1/19/2016   1/19/2016  10/19/2015   1/19/2016  10/13/2015
4  10/15/2015  10/14/2015  10/19/2015   7/23/2015  10/14/2015   7/15/2015

This DataFrame converts fine using the following code:

dte = pd.to_datetime(dte, infer_datetime_format=True)

or

dte = pd.to_datetime(dte[x], format='%m/%d/%Y')

The second, dtd.head():

0   2004-01-02 2004-01-02  2004-01-09 2004-01-16  2004-01-23  2004-01-30
1   2004-01-05 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
2   2004-01-06 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
3   2004-01-07 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06
4   2004-01-08 2004-01-09  2004-01-16 2004-01-23  2004-01-30  2004-02-06

This CSV does not convert using either:

dtd = pd.to_datetime(dtd, infer_datetime_format=True)

or

dtd = pd.to_datetime(dtd, format='%Y-%m-%d')

It raises the ValueError above. Interestingly, however, passing parse_dates and infer_datetime_format as arguments to the read_csv method works fine. What is going on here?
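
For reference, the read_csv route mentioned above might look like the following minimal sketch. The CSV text and the column names a and b are made up for illustration, and infer_datetime_format is left out because newer pandas versions deprecate it.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the second file.
csv_text = "a,b\n2004-01-02,2004-01-09\n2004-01-05,2004-01-16\n"

# parse_dates lists the columns read_csv should convert while loading.
dtd = pd.read_csv(io.StringIO(csv_text), parse_dates=["a", "b"])

print(dtd.dtypes)
```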

Accepted answer by piRSquared

You can stack / pd.to_datetime / unstack:

pd.to_datetime(dte.stack()).unstack()

Explanation:
pd.to_datetime works on a string, list, or pd.Series. dte is a pd.DataFrame, which is why you are having issues: when given a DataFrame, pd.to_datetime tries to assemble dates from columns named year, month, and day, hence the error message. dte.stack() produces a pd.Series where all rows are stacked on top of each other. Because this stacked form is a pd.Series, a vectorized pd.to_datetime can work on it. The subsequent unstack simply reverses the initial stack to restore the original form of dte.
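
The explanation above can be checked with a small sketch. The frames df and parts below are made-up stand-ins, not the asker's data: the first shows the failure on a plain frame of date strings and the stack/unstack workaround, and the second shows the year/month/day assembly that pd.to_datetime attempts on a DataFrame.

```python
import pandas as pd

# Hypothetical frame of date strings, standing in for dte/dtd.
df = pd.DataFrame({"a": ["2004-01-02", "2004-01-05"],
                   "b": ["2004-01-09", "2004-01-16"]})

# Passing the whole frame asks pandas to assemble dates from
# year/month/day columns, which this frame does not have.
try:
    pd.to_datetime(df)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Stacking gives a single Series, which converts; unstack restores the shape.
converted = pd.to_datetime(df.stack()).unstack()
print(converted)

# A frame that does have year/month/day columns is assembled row by row.
parts = pd.DataFrame({"year": [2004], "month": [1], "day": [2]})
assembled = pd.to_datetime(parts)
print(assembled)
```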

Answer by jezrael

For me, using apply with the to_datetime function works:

print (dtd)
            1           2           3           4           5           6
0                                                                        
0  2004-01-02  2004-01-02  2004-01-09  2004-01-16  2004-01-23  2004-01-30
1  2004-01-05  2004-01-09  2004-01-16  2004-01-23  2004-01-30  2004-02-06
2  2004-01-06  2004-01-09  2004-01-16  2004-01-23  2004-01-30  2004-02-06
3  2004-01-07  2004-01-09  2004-01-16  2004-01-23  2004-01-30  2004-02-06
4  2004-01-08  2004-01-09  2004-01-16  2004-01-23  2004-01-30  2004-02-06


dtd = dtd.apply(pd.to_datetime)

print (dtd)
           1          2          3          4          5          6
0                                                                  
0 2004-01-02 2004-01-02 2004-01-09 2004-01-16 2004-01-23 2004-01-30
1 2004-01-05 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
2 2004-01-06 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
3 2004-01-07 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
4 2004-01-08 2004-01-09 2004-01-16 2004-01-23 2004-01-30 2004-02-06
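
As a rough illustration of why this works: apply hands each column to pd.to_datetime as a pd.Series, so it behaves like an explicit loop over the columns. The frame below is a made-up two-column stand-in for dtd.

```python
import pandas as pd

# Hypothetical stand-in for dtd: every column holds date strings.
df = pd.DataFrame({"a": ["2004-01-02", "2004-01-05"],
                   "b": ["2004-01-09", "2004-01-16"]})

# apply calls pd.to_datetime once per column, passing a Series each time.
by_apply = df.apply(pd.to_datetime)

# The same conversion written as an explicit loop over the columns.
by_loop = df.copy()
for col in by_loop.columns:
    by_loop[col] = pd.to_datetime(by_loop[col])

print(by_apply.dtypes)
print(by_apply.equals(by_loop))
```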

Answer by Guilherme Fernandes Lopes

It works for me:

dtd.apply(lambda x: pd.to_datetime(x, errors='coerce', format='%Y-%m-%d'))

This way you can pass keyword arguments to the function, as above (errors and format). See more: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_datetime.html

Answer by rishi jain

I would just like to add errors='coerce', which turns any unparseable or NULL values you might have into NaT instead of raising an error:

dtd = dtd.apply(pd.to_datetime, errors='coerce')
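
A small sketch of the effect, using a made-up frame with one bad string and one missing value. DataFrame.apply forwards extra keyword arguments to the function it calls, so errors and format reach pd.to_datetime for every column.

```python
import pandas as pd

# Hypothetical data: one unparseable string and one missing value.
df = pd.DataFrame({"a": ["2004-01-02", "not a date"],
                   "b": ["2004-01-09", None]})

# Keyword arguments after the function are forwarded to pd.to_datetime.
out = df.apply(pd.to_datetime, errors="coerce", format="%Y-%m-%d")

# Values that cannot be parsed come back as NaT rather than raising.
print(out)
print(out.isna())
```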