Importing data from an Excel file into SQL Server using Python

Notice: this page is a side-by-side Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/51268991/

Date: 2020-08-19 19:48:08  Source: igfitidea

Importing data from an excel file using python into SQL Server

Tags: python, python-3.x, pyodbc, xlrd

Asked by mathlover

I have found other questions describing an error similar to mine, but I have not been able to work out a fix from their answers. I am trying to import an Excel file into SQL Server using Python. This is the code I wrote:

import pandas as pd
import numpy as np
import pandas.io.sql
import pyodbc
import xlrd

server = "won't disclose private info"
db = 'private info'
conn = pyodbc.connect('DRIVER={SQL Server};SERVER=' + server + ';DATABASE=' + 
db + ';Trusted_Connection=yes')

cursor = conn.cursor()
book = xlrd.open_workbook("Daily Flash.xlsx")
sheet = book.sheet_by_name("Sheet1")

query1 = """CREATE TABLE [LEAF].[MK] ([LEAF][Lease_Number] varchar(255), 
[LEAF][Start_Date] varchar(255), [LEAF][Report_Status] varchar(255), [LEAF] 
[Status_Date] varchar(255), [LEAF][Current_Status] varchar(255), [LEAF] 
[Sales_Rep] varchar(255), [LEAF][Customer_Name] varchar(255),[LEAF] 
[Total_Finance] varchar(255),
[LEAF][Rate_Class] varchar(255) ,[LEAF][Supplier_Name] varchar(255) ,[LEAF] 
[DecisionStatus] varchar(255))"""


query = """INSERT INTO [LEAF].[MK] (Lease_Number, Start_Date, Report_Status, 
Status_Date, Current_Status, Sales_Rep, Customer_Name,Total_Finance,
Rate_Class,Supplier_Name,DecisionStatus) VALUES (%s, %s, %s, %s, %s, %s, %s, 
%s, %s, %s, %s)"""

for r in range(1, sheet.nrows):
    Lease_Number  = sheet.cell(r,0).value
    Start_Date    = sheet.cell(r,1).value
    Report_Status = sheet.cell(r,2).value
    Status_Date   = sheet.cell(r,3).value
    Current_Status= sheet.cell(r,4).value
    Sales_Rep     = sheet.cell(r,5).value
    Customer_Name = sheet.cell(r,6).value
    Total_Financed= sheet.cell(r,7).value
    Rate_Class    = sheet.cell(r,8).value
    Supplier_Name = sheet.cell(r,9).value
    DecisionStatus= sheet.cell(r,10).value


    values = (Lease_Number, Start_Date, Report_Status, Status_Date, 
    Current_Status, Sales_Rep, Customer_Name, Total_Financed, Rate_Class, 
    Supplier_Name, DecisionStatus)

    cursor.execute(query1)

    cursor.execute(query, values)


conn.commit()


conn.close()

The error message I get is:

ProgrammingError                          Traceback (most recent call last)
<ipython-input-24-c525ebf0af73> in <module>()
 16 
 17     # Execute sql Query
 ---> 18     cursor.execute(query, values)
 19 
 20 # Commit the transaction

 ProgrammingError: ('The SQL contains 0 parameter markers, but 11 parameters 
 were supplied', 'HY000')

Can someone please explain the problem to me and how I can fix it? Thank you!

Update:

Following the comments below, I got that error message to go away. I also modified my query, because the table I am trying to insert values into had not been created yet, so I updated my code in an attempt to create it.

However, now I am getting the error message:

ProgrammingError: ('42000', '[42000] [Microsoft][ODBC SQL Server Driver][SQL 
Server]The specified schema name "dbo" either does not exist or you do not 
have permission to use it. (2760) (SQLExecDirectW)')

I tried changing that slightly by writing CREATE [HELLO][MK] instead of just CREATE MK but that tells me that MK is already in the database... What steps should I take next?

Accepted answer by Scratch'N'Purr

Based on the conversation we had in our chat, here are a few takeaways:

  1. After executing your CREATE TABLE query, make sure to commit it immediately, before running any subsequent INSERT queries.
  2. Use error catching for the case where the table already exists in the database. You asked whether the script would still run if you wanted to import more data into the table. As written, the answer is no, since Python will throw an exception at cursor.execute(query1).
  3. If you want to validate whether your insert operations were successful, you can do a simple record count check.
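
The first two takeaways can be sketched roughly as follows. This is only an illustration, not the code from the chat: it uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs without a server, and the table name mk and the helper ensure_table are made up for the example. With pyodbc, the exception to catch would be pyodbc.ProgrammingError rather than sqlite3.OperationalError.

```python
import sqlite3

CREATE_SQL = """
CREATE TABLE mk (
    Lease_Number varchar(255),
    Current_Status varchar(255)
)"""


def ensure_table(conn):
    """Create the table and commit right away.

    Returns True if the table was created, False if creation failed
    (for example because the table already exists).
    """
    try:
        conn.execute(CREATE_SQL)
        conn.commit()  # commit the CREATE TABLE before any INSERT runs
        return True
    except sqlite3.OperationalError:
        # sqlite3 reports "table mk already exists" here; pyodbc would
        # raise pyodbc.ProgrammingError in the same situation.
        return False


conn = sqlite3.connect(":memory:")
created_first = ensure_table(conn)  # first run: the table is created
created_again = ensure_table(conn)  # second run: error is caught, script continues
print(created_first, created_again)
```

Because the error is caught, the same script can be run again later to import more data into the existing table.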

EDIT: Yesterday, when I had @mkheifetz test my code, he caught a minor bug where the validation check would return False. The database already held existing records, so comparing the table's total row count against only the rows currently being imported would fail. To address the bug, I have modified the code again so that it compares the row count before and after the import.
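
A small sketch of why the original check failed and what the before/after comparison changes. It assumes sqlite3 as a runnable stand-in for SQL Server, and the table zzz with a single column and the sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zzz (Lease_Number varchar(255))")

# Pretend an earlier import already loaded two rows.
conn.executemany("INSERT INTO zzz (Lease_Number) VALUES (?)", [("A1",), ("A2",)])
conn.commit()

new_rows = [("B1",), ("B2",), ("B3",)]  # rows from the current file

# Record the row count before importing, then import and count again.
before = conn.execute("SELECT count(*) FROM zzz").fetchone()[0]
conn.executemany("INSERT INTO zzz (Lease_Number) VALUES (?)", new_rows)
conn.commit()
after = conn.execute("SELECT count(*) FROM zzz").fetchone()[0]

naive_check = after == len(new_rows)             # also counts the old rows
fixed_check = (after - before) == len(new_rows)  # counts only what was added
print(naive_check, fixed_check)
```

The naive comparison only holds when the table starts out empty; the difference between the two counts holds regardless of what was there before.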

Below is how I would modify your code:

import pandas as pd
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt

import pandas.io.sql
import pyodbc

import xlrd
server = 'XXXXX'
db = 'XXXXXdb'

# create Connection and Cursor objects
conn = pyodbc.connect('DRIVER={SQL Server};SERVER=' + server + ';DATABASE=' + db + ';Trusted_Connection=yes')
cursor = conn.cursor()

# read data
data = pd.read_excel('Flash Daily Apps through 070918.xls')

# rename columns
data = data.rename(columns={'Lease Number': 'Lease_Number',
                            'Start Date': 'Start_Date',
                            'Report Status': 'Report_Status',
                            'Status Date': 'Status_Date',
                            'Current Status': 'Current_Status',
                            'Sales Rep': 'Sales_Rep',
                            'Customer Name': 'Customer_Name',
                            'Total Financed': 'Total_Financed',
                            'Rate Class': 'Rate_Class',
                            'Supplier Name': 'Supplier_Name'})

# export
data.to_excel('Daily Flash.xlsx', index=False)

# Open the workbook and define the worksheet
book = xlrd.open_workbook("Daily Flash.xlsx")
sheet = book.sheet_by_name("Sheet1")

query1 = """
CREATE TABLE [LEAF].[ZZZ] (
    Lease_Number varchar(255),
    Start_Date varchar(255),
    Report_Status varchar(255),
    Status_Date varchar(255),
    Current_Status varchar(255),
    Sales_Rep varchar(255),
    Customer_Name varchar(255),
    Total_Finance varchar(255),
    Rate_Class varchar(255),
    Supplier_Name varchar(255),
    DecisionStatus varchar(255)
)"""

query = """
INSERT INTO [LEAF].[ZZZ] (
    Lease_Number,
    Start_Date,
    Report_Status,
    Status_Date,
    Current_Status,
    Sales_Rep,
    Customer_Name,
    Total_Finance,
    Rate_Class,
    Supplier_Name,
    DecisionStatus
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"""

# execute create table
try:
    cursor.execute(query1)
    conn.commit()
except pyodbc.ProgrammingError:
    pass

# grab existing row count in the database for validation later
cursor.execute("SELECT count(*) FROM LEAF.ZZZ")
before_import = cursor.fetchone()

for r in range(1, sheet.nrows):
    Lease_Number = sheet.cell(r,0).value
    Start_Date = sheet.cell(r,1).value
    Report_Status = sheet.cell(r,2).value
    Status_Date = sheet.cell(r,3).value
    Current_Status= sheet.cell(r,4).value
    Sales_Rep = sheet.cell(r,5).value
    Customer_Name = sheet.cell(r,6).value
    Total_Financed= sheet.cell(r,7).value
    Rate_Class = sheet.cell(r,8).value
    Supplier_Name = sheet.cell(r,9).value
    DecisionStatus= sheet.cell(r,10).value

    # Assign values from each row
    values = (Lease_Number, Start_Date, Report_Status, Status_Date, Current_Status,
              Sales_Rep, Customer_Name, Total_Financed, Rate_Class, Supplier_Name,
              DecisionStatus)

    # Execute sql Query
    cursor.execute(query, values)

# Commit the transaction
conn.commit()

# If you want to check if all rows are imported
cursor.execute("SELECT count(*) FROM LEAF.ZZZ")
result = cursor.fetchone()

print((result[0] - before_import[0]) == len(data.index))  # should be True

# Close the database connection
conn.close()
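
Note that the INSERT statement above uses ? placeholders where the question used %s. pyodbc follows the "qmark" parameter style, which is why the driver reported 0 parameter markers for the original query. A minimal sketch of the difference, again assuming the standard-library sqlite3 module (which also uses ? markers) as a runnable stand-in; the table t and its columns are made up, and the exception raised differs from the one pyodbc reports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a varchar(255), b varchar(255))")
values = ("x", "y")

# %s is not a parameter marker for this driver, so the statement is rejected.
try:
    conn.execute("INSERT INTO t (a, b) VALUES (%s, %s)", values)
    percent_s_worked = True
except sqlite3.Error:
    percent_s_worked = False

# ? is the marker the driver recognises: one marker per supplied value.
conn.execute("INSERT INTO t (a, b) VALUES (?, ?)", values)
conn.commit()
stored = conn.execute("SELECT a, b FROM t").fetchall()
print(percent_s_worked, stored)
```

Passing the values separately from the SQL text also means the driver handles quoting, so cell contents from the spreadsheet are never spliced into the statement by hand.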