Original question: http://stackoverflow.com/questions/51898826/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Converting object column in pandas dataframe to datetime
Asked by Matt Houston
I have an object column in a pandas dataframe in the format dd/mm/yyyy, that I want to convert with to_datetime.
I tried to convert it to datetime using the below:
df['Time stamp'] = pd.to_datetime(df['Time stamp'], format= '%d/%m/%Y')
I get the following errors:
TypeError: Unrecognized value type: <class 'str'>
ValueError: unconverted data remains:
Does this mean that there is a blank row somewhere? I have checked the original CSV and I cannot see one.
Accepted answer by ALollz
It means you have an extra space. Though pd.to_datetime is very good at parsing dates normally without any format specified, when you actually specify a format, it has to match EXACTLY.
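To check whether stray whitespace really is the cause, a minimal sketch along these lines lists the values that change when stripped. The sample frame is made up for illustration (it reuses the 'Time stamp' column name from the question); has_whitespace and suspect_values are hypothetical names.

```python
import pandas as pd

# Hypothetical sample data; the second value has a trailing space.
df = pd.DataFrame({'Time stamp': ['01/02/1988', '01/02/1988 ']})

col = df['Time stamp']

# True wherever stripping changes the string, i.e. the value has
# leading or trailing whitespace.
has_whitespace = col != col.str.strip()

# repr() makes the otherwise invisible space visible when printed.
suspect_values = [repr(v) for v in col[has_whitespace]]
print(suspect_values)
```

Printing the repr of each value is a simple way to spot spaces that are easy to miss when looking at the CSV in a spreadsheet or text editor.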
You can likely solve your issue by adding .str.strip() to remove the extra whitespace before converting.
import pandas as pd
df['Time stamp'] = pd.to_datetime(df['Time stamp'].str.strip(), format='%d/%m/%Y')
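If stripping alone does not fix every row, one possible follow-up is to pass errors='coerce' so that unparseable values become NaT instead of raising, and then inspect those rows. This is only a sketch: the sample data, including the 'not a date' entry, is invented for illustration, and cleaned, parsed and bad_rows are hypothetical names.

```python
import pandas as pd

# Hypothetical sample data: one clean value, one padded with spaces,
# and one that is not a date at all.
df = pd.DataFrame({'Time stamp': ['01/02/1988', ' 15/03/1990 ', 'not a date']})

# Strip surrounding whitespace first, as suggested above.
cleaned = df['Time stamp'].str.strip()

# errors='coerce' turns values that do not match the format into NaT
# instead of raising a ValueError.
parsed = pd.to_datetime(cleaned, format='%d/%m/%Y', errors='coerce')

# Original values that still could not be parsed after stripping.
bad_rows = df.loc[parsed.isna(), 'Time stamp']
print(bad_rows)
```

Note that coercing silently hides bad data if the NaT rows are never checked, so it is probably best used as a diagnostic step rather than a permanent fix.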
Alternatively, you can take advantage of its ability to parse various formats of dates by using the dayfirst=True argument
df['Time stamp'] = pd.to_datetime(df['Time stamp'], dayfirst=True)
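To see why dayfirst=True matters for dd/mm/yyyy data, the following small sketch compares the default month-first interpretation with the day-first one. The series s and its values are made up for illustration and contain no stray whitespace.

```python
import pandas as pd

# Hypothetical, whitespace-free sample values in dd/mm/yyyy form.
s = pd.Series(['01/02/1988', '03/04/1990'])

# Without dayfirst, ambiguous dates are read month first:
# '01/02/1988' is interpreted as January 2nd.
month_first = pd.to_datetime(s)

# With dayfirst=True, the same string is read as February 1st.
day_first = pd.to_datetime(s, dayfirst=True)

print(month_first.iloc[0], day_first.iloc[0])
```

Both calls succeed, which is why a dd/mm/yyyy column parsed without dayfirst=True (or an explicit format) can end up with silently swapped days and months.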
Example:
import pandas as pd
df = pd.DataFrame({'Time stamp': ['01/02/1988', '01/02/1988 ']})
pd.to_datetime(df['Time stamp'], format= '%d/%m/%Y')
ValueError: unconverted data remains:
pd.to_datetime(df['Time stamp'].str.strip(), format='%d/%m/%Y')
#0 1988-02-01
#1 1988-02-01
#Name: Time stamp, dtype: datetime64[ns]
pd.to_datetime(df['Time stamp'], dayfirst=True)
#0 1988-02-01
#1 1988-02-01
#Name: Time stamp, dtype: datetime64[ns]