pandas TypeError: Unrecognized value type: <class 'str'>
Note: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/50554107/
Asked by Luke
I am currently trying to convert a Pandas column into a datetime column so that I can work out the differences between three date columns (1. date of hotel search, 2. date of arrival, 3. date of departure).
Here is a sample of how the data looks:
>>> print(df2)
    date   Arrive   Depart
20180516
20180516
20180518  6172018  6242018
20180515
20180519
20180517
20180515  6052018  6062018
20180517  8132018  8162018
20180515  7112018  7152018
20180517  7272018  8012018
The Arrive and Depart columns are strings.
I tried to convert df2['Arrive'] by using:
df2['Arrive'] = pd.to_datetime(df2['Arrive'])
However, this throws an error:
TypeError: Unrecognized value type: <class 'str'>
I went through many articles but couldn't quite find what was going wrong or how to fix it.
Answered by jezrael
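Add the parameters errors='coerce' and format='%m%d%Y' to to_datetime. The format tells pandas that the strings are month, day, and four-digit year with no separators, and errors='coerce' turns values that cannot be parsed (such as the empty entries) into NaT instead of raising: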
df2['Arrive'] = pd.to_datetime(df2['Arrive'], errors='coerce', format='%m%d%Y')
print(df2)
       date     Arrive   Depart
0  20180516        NaT      NaN
1  20180516        NaT      NaN
2  20180518 2018-06-17  6242018
3  20180515        NaT      NaN
4  20180519        NaT      NaN
5  20180517        NaT      NaN
6  20180515 2018-06-05  6062018
7  20180517 2018-08-13  8162018
8  20180515 2018-07-11  7152018
9  20180517 2018-07-27  8012018
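
Since the original goal was to work out differences between the three date columns, the same conversion can be applied to every column before subtracting. The following is only a minimal sketch under stated assumptions: the small DataFrame is a hypothetical stand-in for the question's data, the date column is assumed to hold zero-padded YYYYMMDD strings, and the result column names nights and lead_days are made up for illustration.

```python
import pandas as pd

# Hypothetical sample mirroring a few rows of the question's data;
# missing Arrive/Depart values are represented as None (an assumption).
df2 = pd.DataFrame({
    "date": ["20180516", "20180518", "20180515", "20180517"],
    "Arrive": [None, "6172018", "6052018", "8132018"],
    "Depart": [None, "6242018", "6062018", "8162018"],
})

# The search date is assumed to be zero-padded year-month-day.
df2["date"] = pd.to_datetime(df2["date"], format="%Y%m%d")

# Arrive and Depart are month-day-year without separators; missing or
# unparseable entries become NaT instead of raising an error.
for col in ["Arrive", "Depart"]:
    df2[col] = pd.to_datetime(df2[col], format="%m%d%Y", errors="coerce")

# Subtracting datetime columns yields timedeltas; .dt.days extracts whole days.
# Rows with NaT produce NaN, so these columns come out as floats.
df2["nights"] = (df2["Depart"] - df2["Arrive"]).dt.days      # hypothetical name
df2["lead_days"] = (df2["Arrive"] - df2["date"]).dt.days     # hypothetical name

print(df2)
```

Note that unpadded strings can be ambiguous: a value such as 1112018 could mean January 11 or November 1, so it may be worth checking how the source data was generated before relying on the parsed dates.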