vba 将 Excel 表数据传输到 SQL 2008R2 的最快方法

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/10708077/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-11 16:14:41  来源:igfitidea点击:

Fastest way to transfer Excel table data to SQL 2008R2

excelvbasql-server-2008importlarge-data

提问by cshenderson

Does anyone know the fastest way to get data from and Excel table (VBA Array) to a table on SQL 2008 withoutusing an external utility (i.e. bcp)? Keep in mind my datasets are usually 6500-15000 rows, and about 150-250 columns; and I end up transferring about 20-150 of them during an automated VBA batch script.

有谁知道在使用外部实用程序(即 bcp)的情况下将数据从 Excel 表(VBA 数组)获取到 SQL 2008 上的表的最快方法吗?请记住,我的数据集通常为 6500-15000 行,大约 150-250 列;我最终在自动 VBA 批处理脚本中传输了大约 20-150 个。

I have tried several methods for getting large amounts of data from an Excel table (VBA) to SQL 2008. I have listed those below:

我尝试了几种方法来从 Excel 表 (VBA) 获取大量数据到 SQL 2008。我在下面列出了这些:

Method 1. Pass table into VBA Array and send to stored procedure (ADO) -- Sending to SQL is SLOW

方法 1. 将表传递到 VBA 数组并发送到存储过程 (ADO) -- 发送到 SQL 很慢

Method 2. Create disconnected RecordSet load it, then sync. -- Sending to SQL VERY SLOW

方法 2. 创建断开连接的 RecordSet 加载它,然后同步。-- 发送到 SQL 非常慢

Method 3. Put table into VBA array, loop though the array and concatenate(using delimiters) then send to stored procedure. -- Sending to SQL SLOW, but faster than Method 1 or 2.

方法 3. 将表放入 VBA 数组中,遍历数组并连接(使用分隔符)然后发送到存储过程。-- 发送到 SQL 很慢,但比方法 1 或 2 快。

Method 4. Put table into VBA array, loop though the array and concatenate(using delimiters) then place each row with ADO recordset .addnew command. --Sending to SQL very FAST (about 20 times faster than methods 1-3), but now I will need to split that data using a separate procedure, which will add significant wait time.

方法 4. 将表放入 VBA 数组中,遍历数组并连接(使用分隔符),然后使用 ADO 记录集 .addnew 命令放置每一行。-- 发送到 SQL 的速度非常快(比方法 1-3 快大约 20 倍),但现在我需要使用单独的过程拆分该数据,这将增加大量的等待时间。

Method 5. Put table in VBA array, serialize into XML, send to stored procedure as VARCHAR and specify XML in stored procedure. --Sending to SQL INCREDIBLY SLOW (about 100 times slower than methods 1 or 2)

方法5.将表放入VBA数组,序列化成XML,作为VARCHAR发送到存储过程,并在存储过程中指定XML。-- 发送到 SQL 非常慢(比方法 1 或 2 慢大约 100 倍)

Anything I am missing?

我缺少什么吗?

采纳答案by Fink

There is no single fastest way, as it's dependent on a number of factors. Make sure the indexes in SQL are configured and optimized. Lots of indexes will kill insert/update performance since each insert will need to update the index. Make sure you only make one connection to the database, and do not open/close it during the operation. Run the update when the server is under minimal load. The only other method you haven't tried is to use a ADO Command object, and issue a direct INSERT statement. When using the 'AddNew' Method of the recordset object, be sure to issue only one 'UpdateBatch' Command at the end of the inserts. Short of that, the VBA can only run as fast as the SQL server accepting the inputs.

没有一种最快的方法,因为它取决于许多因素。确保 SQL 中的索引已配置和优化。大量索引会影响插入/更新性能,因为每次插入都需要更新索引。确保您只与数据库建立一个连接,并且在操作过程中不要打开/关闭它。当服务器处于最小负载时运行更新。您还没有尝试过的唯一其他方法是使用 ADO Command 对象,并发出直接 INSERT 语句。使用记录集对象的“AddNew”方法时,请确保在插入结束时仅发出一个“UpdateBatch”命令。除此之外,VBA 的运行速度只能与接受输入的 SQL 服务器一样快。

EDIT: Seems like you've tried everything. There is also what is known as 'Bulk-Logged' recovery mode in SQL Server, that reduces the overhead of writting so much to the transaction log. Might be something worth looking into. It can be troublesome since it requires fiddling with the database recovery model a bit, but it could be useful for you.

编辑:似乎你已经尝试了一切。在 SQL Server 中还有所谓的“大容量日志”恢复模式,它减少了写入事务日志的开销。可能是值得研究的东西。这可能很麻烦,因为它需要稍微摆弄数据库恢复模型,但它可能对您有用。

回答by SaiKiran Mandhala

The following code will transfer the thousands of data in just few seconds(2-3 sec).

以下代码将在几秒钟(2-3 秒)内传输数千个数据。

Dim sheet As Worksheet
    Set sheet = ThisWorkbook.Sheets("DataSheet")        

    Dim Con As Object
    Dim cmd As Object
    Dim ServerName As String
    Dim level As Long
    Dim arr As Variant
    Dim row As Long
    Dim rowCount As Long

    Set Con = CreateObject("ADODB.Connection")
    Set cmd = CreateObject("ADODB.Command")

    ServerName = "192.164.1.11" 

    'Creating a connection
    Con.ConnectionString = "Provider=SQLOLEDB;" & _
                                    "Data Source=" & ServerName & ";" & _
                                    "Initial Catalog=Adventure;" & _
                                    "UID=sa; PWD=123;"

    'Setting provider Name
     Con.Provider = "Microsoft.JET.OLEDB.12.0"

    'Opening connection
     Con.Open                

    cmd.CommandType = 1             ' adCmdText

    Dim Rst As Object
    Set Rst = CreateObject("ADODB.Recordset")
    Table = "EmployeeDetails" 'This should be same as the database table name.
    With Rst
        Set .ActiveConnection = Con
        .Source = "SELECT * FROM " & Table
        .CursorLocation = 3         ' adUseClient
        .LockType = 4               ' adLockBatchOptimistic
        .CursorType = 0             ' adOpenForwardOnly
        .Open

        Dim tableFields(200) As Integer
        Dim rangeFields(200) As Integer

        Dim exportFieldsCount As Integer
        exportFieldsCount = 0

        Dim col As Integer
        Dim index As Integer
        index = 1

        For col = 1 To .Fields.Count
            exportFieldsCount = exportFieldsCount + 1
            tableFields(exportFieldsCount) = col
            rangeFields(exportFieldsCount) = index
            index = index + 1
        Next

        If exportFieldsCount = 0 Then
            ExportRangeToSQL = 1
            GoTo ConnectionEnd
        End If            

        endRow = ThisWorkbook.Sheets("DataSheet").Range("A65536").End(xlUp).row 'LastRow with the data.
        arr = ThisWorkbook.Sheets("DataSheet").Range("A1:CE" & endRow).Value 'This range selection column count should be same as database table column count.

        rowCount = UBound(arr, 1)            

        Dim val As Variant

        For row = 1 To rowCount
            .AddNew
            For col = 1 To exportFieldsCount
                val = arr(row, rangeFields(col))
                    .Fields(tableFields(col - 1)) = val
            Next
        Next

        .UpdateBatch
    End With

    flag = True

    'Closing RecordSet.
     If Rst.State = 1 Then
       Rst.Close
    End If

   'Closing Connection Object.
    If Con.State = 1 Then
      Con.Close
    End If

'Setting empty for the RecordSet & Connection Objects
Set Rst = Nothing
Set Con = Nothing
End Sub

回答by FKSP

works pretty fine, on the other hand to improve speed we may still modify the query:

工作得很好,另一方面,为了提高速度,我们仍然可以修改查询:

Instead: Source = "SELECT * FROM " & Table

反而: Source = "SELECT * FROM " & Table

We can use: Source = "SELECT TOP 1 * FROM " & Table

我们可以用: Source = "SELECT TOP 1 * FROM " & Table

Here is we only need column names. So no need to maka a query for entire table, which is extending the process as long as new data imported.

这里我们只需要列名。因此无需对整个表进行查询,只要导入新数据,就可以扩展该过程。

回答by darlove

As far as I remember, you can create a linked server to the Excel file (as long as the server can find the path; it's best to put the file on the server's local disk) and then use SQL to retrieve data from it.

据我所知,您可以创建一个链接服务器到 Excel 文件(只要服务器可以找到路径;最好将文件放在服务器的本地磁盘上),然后使用 SQL 从中检索数据。

回答by Mind Driver

Having just tried a few methods, I came back to a relatively simple but speedy one. It's fast because it makes the SQL server do all the work, including an efficient execution plan.

刚刚尝试了几种方法后,我又回到了一种相对简单但快速的方法。它很快,因为它让 SQL 服务器完成所有工作,包括高效的执行计划。

I just build a long string containing a script of INSERT statements.

我只是构建了一个包含 INSERT 语句脚本的长字符串。

    Public Sub Upload()
        Const Tbl As String = "YourTbl"
        Dim InsertQuery As String, xlRow As Long, xlCol As Integer
        Dim DBconnection As New ADODB.Connection

        DBconnection.Open "Provider=SQLOLEDB.1;Password=MyPassword" & _
            ";Persist Security Info=false;User ID=MyUserID" & _
            ";Initial Catalog=MyDB;Data Source=MyServer"

        InsertQuery = ""
        xlRow = 2
        While Cells(xlRow, 1) <> ""
            InsertQuery = InsertQuery & "INSERT INTO " & Tbl & " VALUES('"

            For xlCol = 1 To 6 'Must match the table structure
                InsertQuery = InsertQuery & Replace(Cells(xlRow, xlCol), "'", "''") & "', '"  'Includes mitigation for apostrophes in the data
            Next xlCol
            InsertQuery = InsertQuery & Format(Now(), "M/D/YYYY") & "')" & vbCrLf 'The last column is a date stamp, either way, don't forget to close that parenthesis
            xlRow = xlRow + 1
        Wend

        DBconnection.Execute InsertQuery 'I'll leave any error trapping to you
        DBconnection.Close  'But do be tidy :-)
        Set DBconnection = Nothing
    End Sub

回答by mwolfe02

By far the fastest way to do this is via T-SQL's BULK INSERT.

到目前为止,最快的方法是通过 T-SQL 的BULK INSERT.

There are a few caveats.

有一些注意事项。

  • You will likely need to export your data to a csv first (you may be able to import directly from Excel; my experience is in going from Access .mdbs to SQL Server which requires the interim step to csv).
  • The SQL Server machine needs to have access to that csv (when you run the BULK INSERTcommand and specify a filename, remember that the filename will be resolved on the machine where SQL Server is running).
  • You may need to tweak the default FIELDTERMINATORand ROWTERMINATORvalues to match your CSV.
  • 您可能需要先将数据导出到 csv(您可以直接从 Excel 导入;我的经验是从 Access .mdbs 到 SQL Server,这需要过渡到 csv 的步骤)。
  • SQL Server 计算机需要有权访问该 csv(当您运行BULK INSERT命令并指定文件名时,请记住文件名将在运行 SQL Server 的计算机上解析)。
  • 您可能需要调整默认值FIELDTERMINATORROWTERMINATOR值以匹配您的 CSV。

It took some trial and error for me to get this set up initially, but the performance increase was phenomenal compared to every other technique I had tried.

最初我花了一些试验和错误来完成这个设置,但与我尝试过的所有其他技术相比,性能提升是惊人的。