Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/34383000/

Date: 2020-08-19 14:54:47  Source: igfitidea

pandas to_sql all columns as nvarchar

python pandas sqlalchemy

Asked by flyingmeatball

I have a pandas DataFrame that is created dynamically, so its column names vary. I'm trying to push it to SQL Server, but I don't want the columns to be created with the default datatype "text". (Can anyone explain why this is the default? Wouldn't it make sense to use a more common datatype?)

Does anyone know how I can specify a datatype for all columns?

column_errors.to_sql('load_errors', push_conn, if_exists='append', index=False, dtype=...)  # dtype: one data type for all columns?

The dtype argument takes a dict, and since I don't know what the columns will be, it is hard to set them all to 'sqlalchemy.types.NVARCHAR'.

This is what I would like to do:

column_errors.to_sql('load_errors',push_conn, if_exists = 'append', index = False, dtype = 'sqlalchemy.types.NVARCHAR')

Any help/understanding of how best to specify all column types would be much appreciated!

Accepted answer by joris

You can create this dict dynamically if you do not know the column names in advance:

from sqlalchemy.types import NVARCHAR

# Iterating over a DataFrame yields its column names
df.to_sql(..., dtype={col_name: NVARCHAR for col_name in df})

Note that you have to pass the sqlalchemy type object itself (or an instance, to specify parameters, such as NVARCHAR(length=10)) and not a string as in your example.
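
As a minimal end-to-end sketch of the approach above: the sample frame, its column names, and the in-memory SQLite engine are assumptions chosen only so the snippet runs without a SQL Server instance; in the question the target would be push_conn.

```python
import pandas as pd
from sqlalchemy import create_engine, inspect
from sqlalchemy.types import NVARCHAR

# Hypothetical stand-in for a dynamically built frame whose columns are not known up front.
df = pd.DataFrame({"source_file": ["a.csv", "b.csv"],
                   "message": ["bad date", "missing id"]})

# Assumption: in-memory SQLite, used only so the sketch is self-contained.
engine = create_engine("sqlite://")

# One NVARCHAR(255) instance per column; iterating a DataFrame yields its column names.
dtype_map = {col_name: NVARCHAR(length=255) for col_name in df}
df.to_sql("load_errors", engine, if_exists="append", index=False, dtype=dtype_map)

# Reflect the table to see which column types were declared.
for column in inspect(engine).get_columns("load_errors"):
    print(column["name"], column["type"])
```

Recent pandas documentation also describes dtype as "dict or scalar", with a scalar applied to all columns, so passing dtype=NVARCHAR(length=255) directly may work on your version; check the to_sql documentation for the release you have installed.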

Answer by Parfait

To use dtype, pass a dictionary keyed by data frame column name, with the corresponding sqlalchemy type as each value. Change the keys to your actual data frame column names:

import sqlalchemy
import pandas as pd
...

column_errors.to_sql('load_errors', push_conn,
                      if_exists='append',
                      index=False,
                      dtype={'datefld': sqlalchemy.DateTime(),
                             'intfld': sqlalchemy.types.INTEGER(),
                             'strfld': sqlalchemy.types.NVARCHAR(length=255),
                             'floatfld': sqlalchemy.types.Float(precision=3, asdecimal=True),
                             'booleanfld': sqlalchemy.types.Boolean})

You may even be able to create this dtype dictionary dynamically if you do not know the column names or types beforehand:

def sqlcol(dfparam):
    """Map each column of dfparam to a sqlalchemy type based on its pandas dtype."""
    dtypedict = {}
    for i, j in zip(dfparam.columns, dfparam.dtypes):
        # str(j) is the dtype name, e.g. "object", "datetime64[ns]", "float64", "int64"
        if "object" in str(j):
            dtypedict.update({i: sqlalchemy.types.NVARCHAR(length=255)})

        if "datetime" in str(j):
            dtypedict.update({i: sqlalchemy.types.DateTime()})

        if "float" in str(j):
            dtypedict.update({i: sqlalchemy.types.Float(precision=3, asdecimal=True)})

        if "int" in str(j):
            dtypedict.update({i: sqlalchemy.types.INT()})

    return dtypedict

# Build the mapping from the same frame that is written to the database
outputdict = sqlcol(column_errors)
column_errors.to_sql('load_errors',
                     push_conn,
                     if_exists='append',
                     index=False,
                     dtype=outputdict)
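
A possible variant of the same idea, sketched with the dtype checks in pandas.api.types instead of substring matching on the dtype name. The helper name infer_sql_dtypes, its string_length parameter, the sample frame, and the choice to fall back to NVARCHAR for anything unrecognised are all assumptions for illustration, not part of the answer above.

```python
import pandas as pd
import sqlalchemy
from pandas.api import types as ptypes


def infer_sql_dtypes(frame, string_length=255):
    """Hypothetical helper: map each column of `frame` to a SQLAlchemy type."""
    mapping = {}
    for name, dtype in frame.dtypes.items():
        if ptypes.is_bool_dtype(dtype):
            mapping[name] = sqlalchemy.types.Boolean()
        elif ptypes.is_integer_dtype(dtype):
            mapping[name] = sqlalchemy.types.INTEGER()
        elif ptypes.is_float_dtype(dtype):
            mapping[name] = sqlalchemy.types.Float()
        elif ptypes.is_datetime64_any_dtype(dtype):
            mapping[name] = sqlalchemy.types.DateTime()
        else:
            # Assumption: strings and anything unrecognised fall back to NVARCHAR.
            mapping[name] = sqlalchemy.types.NVARCHAR(length=string_length)
    return mapping


# Hypothetical sample frame reusing the column names from the explicit dict above.
sample = pd.DataFrame({
    "strfld": ["a", "b"],
    "intfld": [1, 2],
    "floatfld": [1.5, 2.5],
    "datefld": pd.to_datetime(["2020-01-01", "2020-01-02"]),
    "booleanfld": [True, False],
})
dtype_map = infer_sql_dtypes(sample)
print({name: type(sql_type).__name__ for name, sql_type in dtype_map.items()})
```

The resulting dictionary can be passed as dtype=dtype_map to to_sql in the same way as outputdict.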