Import CSV file into SQL Server using Python

Disclaimer: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you reuse it, you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow

Original question: http://stackoverflow.com/questions/39899088/
Asked by Cesar
I am having trouble uploading a CSV file into a table in MS SQL Server. The CSV file has 25 columns, and its header has the same names as the SQL table, which also has 25 columns. When I run the script it throws the error:
params arg (<class 'list'>) can be only a tuple or a dictionary
What is the best way to import this data into MS SQL? Both the CSV and SQL table have the exact same column names.
Here is the code:
import csv
import pymssql

conn = pymssql.connect(
    server="xx.xxx.xx.90",
    port=2433,
    user='SQLAdmin',
    password='xxxxxxxx',
    database='NasrWeb'
)

cursor = conn.cursor()
customer_data = csv.reader('cleanNVG.csv')  # 25 columns with same header as SQL

for row in customer_data:
    cursor.execute('INSERT INTO zzzOracle_Extract([Customer Name]\
        ,[Customer #]\
        ,[Account Name]\
        ,[Identifying Address Flag]\
        ,[Address1]\
        ,[Address2]\
        ,[Address3]\
        ,[Address4]\
        ,[City]\
        ,[County]\
        ,[State]\
        ,[Postal Code]\
        ,[Country]\
        ,[Category ]\
        ,[Class]\
        ,[Reference]\
        ,[Party Status]\
        ,[Address Status]\
        ,[Site Status]\
        ,[Ship To or Bill To]\
        ,[Default Warehouse]\
        ,[Default Order Type]\
        ,[Default Shipping Method]\
        ,[Optifacts Customer Number]\
        ,[Salesperson])'
        'VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,)', row)

conn.commit()
cursor.close()
print("Done")
conn.close()
This is what the first few rows of the CSV file look like:
Answered by citynorman
Try d6tstack, which has fast pandas-to-SQL functionality because it uses native DB import commands. It works for Postgres and MySQL; MS SQL support is experimental. Comment or raise an issue if it doesn't work.
import pandas as pd
import d6tstack.utils

df = pd.read_csv('cleanNVG.csv')
uri_mssql = 'mssql+pymssql://usr:pwd@localhost/db'
d6tstack.utils.pd_to_mssql(df, uri_mssql, 'table', 'schema')  # experimental
It is also useful for importing multiple CSVs with data-schema changes, and/or for preprocessing with pandas before writing to the database; see further down in the examples notebook:
import glob
import d6tstack.combine_csv

# apply_fun is a user-supplied pandas preprocessing function
d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'),
    apply_after_read=apply_fun).to_mssql_combine(uri_mssql, 'table')
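As a point of comparison, plain pandas can also write the frame itself via DataFrame.to_sql. A minimal sketch, using an in-memory SQLite database and a hypothetical two-row frame so it stays self-contained (for MS SQL you would instead pass a SQLAlchemy engine built from the mssql+pymssql URI above):

```python
import sqlite3
import pandas as pd

# Hypothetical two-row frame standing in for pd.read_csv('cleanNVG.csv');
# an in-memory SQLite connection stands in for the MS SQL engine.
df = pd.DataFrame({'Customer Name': ['Acme', 'Globex'],
                   'City': ['Cairo', 'Giza']})

conn = sqlite3.connect(':memory:')
df.to_sql('zzzOracle_Extract', conn, index=False, if_exists='append')
print(conn.execute('SELECT COUNT(*) FROM zzzOracle_Extract').fetchone()[0])  # 2
```

This routes every row through individual INSERT statements, which is exactly the slowness d6tstack's native import commands are meant to avoid, but it needs no extra dependency beyond pandas.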
Answered by Gord Thompson
You are using csv.reader incorrectly. The first argument to .reader is not the path to the CSV file; it is

[an] object which supports the iterator protocol and returns a string each time its __next__() method is called — file objects and list objects are both suitable.
Hence, according to the example in the documentation, you should be doing something like this:
import csv

with open('cleanNVG.csv', newline='') as csvfile:
    customer_data = csv.reader(csvfile)
    for row in customer_data:
        cursor.execute(sql, tuple(row))  # sql is the INSERT statement; pymssql wants a tuple, not a list
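To see both fixes in one place (a file object for csv.reader, and a tuple for the params argument), here is a self-contained sketch using a hypothetical two-column version of the question's 25-column INSERT, with an in-memory file standing in for cleanNVG.csv:

```python
import csv
import io

# Hypothetical two-column cut of the real zzzOracle_Extract INSERT statement.
sql = 'INSERT INTO zzzOracle_Extract([Customer Name],[Customer #]) VALUES (%s,%s)'

csvfile = io.StringIO('Customer Name,Customer #\nAcme,1001\n')
customer_data = csv.reader(csvfile)   # an open file object, not a path string
next(customer_data)                   # skip the header row
rows = [tuple(row) for row in customer_data]  # tuples, as pymssql requires
print(rows)  # [('Acme', '1001')]
```

Each tuple in rows is what would be passed as the second argument to cursor.execute, which resolves the "params arg can be only a tuple or a dictionary" error from the question.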
Answered by Gord Thompson
Check the data types on the table, and the sizes of each field as well. If it is varchar(10) and your data is 20 characters long, it will throw an error.
Also,
Consider building the query dynamically to ensure the number of placeholders matches your table and CSV file format. Then it's just a matter of ensuring your table and CSV file are correct, instead of checking that you typed enough ? placeholders in your code.
The following example assumes
CSV file contains column names in the first line
Connection is already built
File name is test.csv
Table name is MyTable
Python 3
...
with open('test.csv', 'r') as f:
    reader = csv.reader(f)
    columns = next(reader)
    query = 'insert into MyTable({0}) values ({1})'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))
    cursor = connection.cursor()
    for data in reader:
        cursor.execute(query, data)
    cursor.commit()
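To make the string manipulation concrete, here is what the two format() calls above produce for a hypothetical three-column header:

```python
# Expanding the dynamic query for a hypothetical 3-column CSV header.
columns = ['Name', 'City', 'State']
query = 'insert into MyTable({0}) values ({1})'
query = query.format(','.join(columns), ','.join('?' * len(columns)))
print(query)  # insert into MyTable(Name,City,State) values (?,?,?)
```

Because the placeholder list is derived from len(columns), the query always matches the file's width, which is the point of building it dynamically.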
If column names are not included in the file:
...
with open('test.csv', 'r') as f:
    reader = csv.reader(f)
    data = next(reader)
    query = 'insert into dbo.Test values ({0})'
    query = query.format(','.join('?' * len(data)))
    cursor = connection.cursor()
    cursor.execute(query, data)
    for data in reader:
        cursor.execute(query, data)
    cursor.commit()
Basically, though, your code looks fine. Here is a generic sample.
cur = cnxn.cursor()                             # get the cursor
csv_data = csv.reader(open('Samplefile.csv'))   # read the csv
for rows in csv_data:                           # iterate through csv
    cur.execute("INSERT INTO MyTable(Col1,Col2,Col3,Col4) VALUES (?,?,?,?)", rows)
cnxn.commit()
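If performance matters for a large file, the same row-by-row loop can usually be collapsed into a single executemany() call. A sketch against in-memory SQLite so it stays runnable here (pyodbc cursors expose the same method and the same ? placeholder style):

```python
import csv
import io
import sqlite3

# sqlite3 stands in for the pyodbc connection purely to keep this self-contained.
cnxn = sqlite3.connect(':memory:')
cur = cnxn.cursor()
cur.execute('CREATE TABLE MyTable (Col1 TEXT, Col2 TEXT)')

# In-memory file standing in for Samplefile.csv.
csv_data = csv.reader(io.StringIO('a,1\nb,2\n'))
cur.executemany('INSERT INTO MyTable(Col1,Col2) VALUES (?,?)', csv_data)
cnxn.commit()
print(cur.execute('SELECT COUNT(*) FROM MyTable').fetchone()[0])  # 2
```

executemany() lets the driver prepare the statement once and reuse it for every row, instead of round-tripping one execute() per row.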