Original question: http://stackoverflow.com/questions/46594631/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas read_sql DataTypes
Asked by MattR
I have to compare two data sources to see whether the same record matches across all rows. One data source is an Excel file; the other is a SQL table. I tried using DataFrame.equals(), as I have in the past.
However, the comparison fails because of pesky datatype issues. Even though the data looks the same, the datatypes make excel_df.loc[excel_df['ID'] == 1].equals(sql_df.loc[sql_df['ID'] == 1]) return False. Here is an example of the datatypes from pd.read_excel():
COLUMN ID int64
ANOTHER Id float64
SOME Date datetime64[ns]
Another Date datetime64[ns]
The same columns from pd.read_sql():
COLUMN ID float64
ANOTHER Id float64
SOME Date object
Another Date object
I could try using the converters argument of pd.read_excel() to match SQL, or do df['Column_Name'] = df['Column_Name'].astype(dtype_here) for each column. But I am dealing with a lot of columns. Is there an easier way to check for values across all columns?
Checking pd.read_sql(), there is nothing like converters, but I'm looking for something like:
df = pd.read_sql("Select * From Foo", con, dtypes = ({Column_name: str,
Column_name2:int}))
Answered by Igor Raush
How about
excel_df = pd.read_excel(...)
sql_df = pd.read_sql(...)
# attempt to cast all columns of excel_df to the types of sql_df
excel_df.astype(sql_df.dtypes.to_dict()).equals(sql_df)
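The idea above can be tried without a database. The following is only a sketch under assumptions: the two small frames are made-up stand-ins for the Excel and SQL results, and the cast runs in the opposite direction (SQL side to the Excel dtypes), because casting datetime64 values to object would not turn them into the same objects the SQL side holds. The date column is parsed explicitly with pd.to_datetime before the general astype call.

```python
import pandas as pd

# Hypothetical stand-ins for the frames returned by read_excel / read_sql.
excel_df = pd.DataFrame({
    "COLUMN ID": [1, 2],
    "SOME Date": pd.to_datetime(["2017-10-01", "2017-10-02"]),
})
sql_df = pd.DataFrame({
    "COLUMN ID": [1.0, 2.0],                    # float instead of int
    "SOME Date": ["2017-10-01", "2017-10-02"],  # strings instead of datetimes
})

# equals() also compares dtypes, so the raw frames do not match.
print(excel_df.equals(sql_df))

# Parse the date column, then cast every column to the Excel-side dtype.
sql_aligned = sql_df.copy()
sql_aligned["SOME Date"] = pd.to_datetime(sql_aligned["SOME Date"])
sql_aligned = sql_aligned.astype(excel_df.dtypes.to_dict())

print(excel_df.equals(sql_aligned))
```

Whichever side is cast, astype raises if a value cannot be converted (for example a NaN going to int64), so real data may need cleaning first.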
Answered by Mikhail Venkov
If you are seeing the "Object" dtype, it means that pandas couldn't interpret some of the rows as dates, so it cast the whole column as Object instead (which is basically string).
Look at the documentation for the dtype, converters and parse_dates arguments: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
You can also check the dayfirst argument to parse the dates correctly.
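pd.read_sql() itself also accepts parse_dates, either as a list of column names or as a dict mapping a column name to keyword arguments for pd.to_datetime(). Here is a small sketch, assuming an in-memory SQLite table; the table name foo, the column some_date and the day-first text dates are invented for illustration only.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table with day-first dates stored as text.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE foo (id INTEGER, some_date TEXT)")
con.executemany(
    "INSERT INTO foo VALUES (?, ?)",
    [(1, "01/10/2017"), (2, "02/10/2017")],
)
con.commit()

# Without parse_dates the text column is not a datetime column.
raw = pd.read_sql("SELECT * FROM foo", con)

# With parse_dates the column is converted while reading; the inner dict
# is passed to pd.to_datetime(), so dayfirst can be set per column.
parsed = pd.read_sql(
    "SELECT * FROM foo",
    con,
    parse_dates={"some_date": {"dayfirst": True}},
)
con.close()

print(raw.dtypes)
print(parsed.dtypes)
```

Parsing on read keeps the date handling next to the query, so the later comparison only has to deal with the numeric columns.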