pandas: insert data into a SQL Server table using pymssql
Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/47929592/
Insert Data to SQL Server Table using pymssql
Asked by Krishnang K Dalal
I am trying to write the data frame into the SQL Server table. My code:
import pymssql

conn = pymssql.connect(host="Dev02", database="DEVDb")
cur = conn.cursor()
query = "INSERT INTO dbo.SCORE_TABLE VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
cur.executemany(query, df_sql)
conn.commit()
cur.close()
conn.close()
The shape of df_sql is (5860, 20), i.e. the number of columns in the data frame is the same as the number of columns in the SQL Server table. Still, I am getting the following error:
ValueError: more placeholders in sql than params available
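A likely cause (my reading, not stated in the question): iterating over a DataFrame, which is what executemany does with df_sql, yields the column labels rather than the rows, so each "parameter set" is a single string instead of 20 values. A small illustration of the difference:

import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
print(list(df))                                      # ['a', 'b'] -- column labels, not rows
print(list(df.itertuples(index=False, name=None)))   # [(1, 2), (3, 4)] -- actual row tuples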
UPDATED BELOW
As per one of the comments, I tried using turbodbc as below:
import turbodbc

conn = turbodbc.connect(driver="{SQL Server}", server="Dev02", Database="DEVDb")
conn.use_async_io = True
cur = conn.cursor()
query = "INSERT INTO dbo.STG_CONTACTABILITY_SCORE VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
cur.executemany(query, df_sql.values)
cur.commit()
cur.close()
conn.close()
I am getting the following error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I don't get it. What is wrong here? I look at df_sql.values and I don't find anything wrong.
The first row of the ndarray is as below:
[nan 'DUSTIN HOPKINS' 'SOUTHEAST MISSOURI STATE UNIVERSITY' 13.0
'5736512217' None None 'Monday' '8:00AM' '9:00AM' 'Summer' None None None
None '2017-12-22 10:39:30.626331' 'Completed' None '1-11KUFFZ'
'Central Time Zone']
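One way to sidestep this ambiguity error (a guess based on the traceback, not a confirmed fix) is to hand turbodbc plain Python rows instead of a NumPy object ndarray:

# Convert the object ndarray into plain Python tuples so turbodbc does not end up
# evaluating a whole NumPy row in a boolean context.
rows = [tuple(row) for row in df_sql.itertuples(index=False, name=None)]
cur.executemany(query, rows)
conn.commit()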
Accepted answer by virtualdvid
I think you just need to specify each column name, and don't forget that the table must have an id field to hold the data frame index:
conn = pymssql.connect(host="Dev02", database="DEVDb")
cur = conn.cursor()
query = """INSERT INTO dbo.SCORE_TABLE(index, column1, column2, ..., column20)
VALUES (?, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
%s, %s, %s, %s, %s, %s)"""
cur.executemany(query, df_sql)
conn.commit()
cur.close()
conn.close()
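One caveat on the snippet above (my addition, not part of the original answer): df_sql itself still needs to be turned into row tuples before executemany, and the index has to be exposed as an ordinary column. A minimal sketch, assuming the index should feed the first placeholder:

# reset_index() turns the data frame index into an ordinary first column so it
# lines up with the first placeholder; itertuples(...) yields plain row tuples.
records = [tuple(row) for row in df_sql.reset_index().itertuples(index=False, name=None)]
cur.executemany(query, records)
conn.commit()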
OK, I have been using pandas, and I exported the last data frame to CSV like:
df.to_csv('new_file_name.csv', sep=',', encoding='utf-8')
Then I just used pyodbc and a BULK INSERT Transact-SQL statement like:
import pyodbc
conn = pyodbc.connect(DRIVER='{SQL Server}', Server='server_name', Database='Database_name', trusted_connection='yes')
cur = conn.cursor()
cur.execute("""BULK INSERT table_name
FROM 'C:\Users\folders path\new_file_name.csv'
WITH
(
CODEPAGE = 'ACP',
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)""")
conn.commit()
cur.close()
conn.close()
It took only a second to load 15314 rows into SQL Server. I hope this gives you an idea.
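One caveat worth adding (my note, not part of the original answer): BULK INSERT is executed by the SQL Server service itself, so the CSV path must be readable from the server machine, not just from the client running Python. A quick sanity check, run before closing the cursor, might look like:

# Hypothetical check: confirm how many rows landed in the table after the bulk load.
cur.execute("SELECT COUNT(*) FROM table_name")
print(cur.fetchone()[0])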
Answered by Parfait
Possibly executemany treats each row in the ndarray from your df.values call as one item, since there are no comma separators between values. Hence, the placeholders outnumber the actual bound values and you receive the mismatch error.
Consider converting the array to a tuple of tuples (or a list of lists / tuple of lists / list of tuples) and then passing that object into executemany:
query = "INTO dbo.SCORE_TABLE VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
sql_data = tuple(map(tuple, df.values))
cur.executemany(query, sql_data)
conn.commit()
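Since the first row shown in the question contains both nan and None, it may also help (an extra step, not part of the original answer) to replace NaN with None before building the tuples, so those cells are inserted as SQL NULLs:

import pandas as pd

# Cast to object first so the NaN -> None replacement sticks, then build the
# tuple of tuples exactly as above.
df_clean = df.astype(object).where(pd.notnull(df), None)
sql_data = tuple(map(tuple, df_clean.values))
cur.executemany(query, sql_data)
conn.commit()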
Answered by MaxU
If I understand correctly, you want to use the DataFrame.to_sql() method:
df_sql.to_sql('dbo.SCORE_TABLE', conn, index=False, if_exists='append')
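A caveat on this (my addition, not in the original answer): for SQL Server, to_sql generally needs an SQLAlchemy connectable rather than a raw pymssql connection, and the schema is passed separately from the table name. A sketch, assuming the same server and database as in the question:

from sqlalchemy import create_engine

# The connection URL is an assumption based on the host/database used earlier.
engine = create_engine("mssql+pymssql://Dev02/DEVDb")
df_sql.to_sql("SCORE_TABLE", engine, schema="dbo", index=False, if_exists="append")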