vba 使用 ADO 将 Excel 电子表格导入数组的更快方法

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/26743347/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-12 05:07:47  来源:igfitidea点击:

Faster Way to Import Excel Spreadsheet to Array With ADO

excelvbaexcel-vbaexcel-2007ado

提问by 110SidedHexagon

I am trying to import and sort data from a large excel report into a new file using Excel 2007 VBA. I have come up with two methods so far for doing this:

我正在尝试使用 Excel 2007 VBA 将大型 Excel 报告中的数据导入和排序到新文件中。到目前为止,我已经提出了两种方法来做到这一点:

  1. Have Excel actually open the file (code below), gather all data into arrays and output the arrays onto new sheets in the same file and save/close it.

     Public Sub GetData()
    
         Dim FilePath As String
    
         FilePath = "D:\File_Test.xlsx"
         Workbooks.OpenText Filename:=FilePath, FieldInfo:=Array(Array(2, 2))
         ActiveWorkbook.Sheets(1).Select
    
     End Sub
    
  2. Use ADO to get all data out of the closed workbook, import the whole datasheet into an array (code below) and sort data from there and then output data into a new workbook and save/close that.

     Private Sub PopArray() 'Uses ADO to populate an array that will be used to sort data
         Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset
         Dim Getvalue, SourceRange, SourceFile, dbConnectionString  As String
    
         SourceFile = "D:\File_Test.xlsx"
         SourceRange = "B1:Z180000"
    
         dbConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
         "Data Source=" & SourceFile & ";" & _
         "Extended Properties=""Excel 12.0 Xml;HDR=No"";"
         Set dbConnection = New ADODB.Connection
         dbConnection.Open dbConnectionString 'open the database connection
    
         Set rs = dbConnection.Execute("SELECT * FROM [" & SourceRange & "]")
         Arr = rs.GetRows
    
         UpBound = UBound(Arr, 2)
         rs.Close
     End Sub
    
  1. 让 Excel 实际打开文件(下面的代码),将所有数据收集到数组中,并将数组输出到同一文件中的新工作表上,然后保存/关闭它。

     Public Sub GetData()
    
         Dim FilePath As String
    
         FilePath = "D:\File_Test.xlsx"
         Workbooks.OpenText Filename:=FilePath, FieldInfo:=Array(Array(2, 2))
         ActiveWorkbook.Sheets(1).Select
    
     End Sub
    
  2. 使用 ADO 从关闭的工作簿中获取所有数据,将整个数据表导入一个数组(下面的代码)并从那里对数据进行排序,然后将数据输出到一个新工作簿中并保存/关闭它。

     Private Sub PopArray() 'Uses ADO to populate an array that will be used to sort data
         Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset
         Dim Getvalue, SourceRange, SourceFile, dbConnectionString  As String
    
         SourceFile = "D:\File_Test.xlsx"
         SourceRange = "B1:Z180000"
    
         dbConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
         "Data Source=" & SourceFile & ";" & _
         "Extended Properties=""Excel 12.0 Xml;HDR=No"";"
         Set dbConnection = New ADODB.Connection
         dbConnection.Open dbConnectionString 'open the database connection
    
         Set rs = dbConnection.Execute("SELECT * FROM [" & SourceRange & "]")
         Arr = rs.GetRows
    
         UpBound = UBound(Arr, 2)
         rs.Close
     End Sub
    

The test file used has about 65000 records to sort through (about a third of what I will end up using it for). I was kind of disappointed when the ADO version only performed marginally better than the open worksheet (~44 seconds vs ~40 seconds run time). I was wondering if there is some way to improve the ADO import method (or a completely different method - ExecuteExcel4Macro maybe? - if there is one) that would boost my speed. The only thing I could think of was that I am using "B1:Z180000"as my SourceRangeas a maximum range that is then truncated by setting Arr = rs.GetRowsto accurately reflect the total number of records. If that is what is causing the slow down, I'm not sure how I would go about finding how many rows are in the sheet.

使用的测试文件有大约 65000 条记录需要整理(大约是我最终将使用它的三分之一)。当 ADO 版本的性能仅比打开的工作表好一点(~44 秒对~40 秒的运行时间)时,我有点失望。我想知道是否有某种方法可以改进 ADO 导入方法(或完全不同的方法 - ExecuteExcel4Macro 也许? - 如果有的话)可以提高我的速度。我唯一能想到的是,我将"B1:Z180000"SourceRange用作最大范围,然后通过设置Arr = rs.GetRows来截断以准确反映记录总数。如果这就是导致速度变慢的原因,我不确定我将如何查找工作表中有多少行。

Edit - I am using Range("A1:A" & i) = (Array) to insert data into the new worksheet.

编辑 - 我正在使用 Range("A1:A" & i) = (Array) 将数据插入到新工作表中。

回答by Ralph

This answer might not be what you are looking for but I still felt compelled to post it based on your side note [...] or a completely different method ]...].

这个答案可能不是你想要的,但我仍然觉得有必要根据你的旁注 [...] 或完全不同的方法来发布它]...]。

Here, I am working with files of 200MB (and more) each which are merely text files including delimiters. I do not load them into Excel anymore. I also had the problem that Excel was too slow and needs to load the entire file. Yet, Excel is very fast at opening these files using the Openmethod:

在这里,我正在处理 200MB(或更多)的文件,每个文件只是包含分隔符的文本文件。我不再将它们加载到 Excel 中。我也遇到了 Excel 太慢并且需要加载整个文件的问题。然而,Excel 使用以下Open方法打开这些文件的速度非常快:

Open strFileNameAndPath For Input Access Read Lock Read As #intPointer

In this case Excel is not loading the entire file but merely reading it line by line. So, Excel can already process the data (forward it) and then grab the next line of data. Like this Excel does not neet the memory to load 200MB.

在这种情况下,Excel 不会加载整个文件,而只是逐行读取。所以,Excel 已经可以处理数据(转发),然后抓取下一行数据。像这样 Excel 不需要加载 200MB 的内存。

With this method I am then loading the data in a locally installed SQL which transfers the data directly to our DWH (also SQL). To speed up the transfer using the above mething and getting the data fast into the SQL server I am transferring the data in chunks of 1000 rows each. The string variable in Excel can hold up to 2 billion characters. So, there is not problem there.

使用这种方法,我然后将数据加载到本地安装的 SQL 中,该 SQL 将数据直接传输到我们的 DWH(也是 SQL)。为了使用上述方法加快传输速度并使数据快速进入 SQL 服务器,我将数据以每块 1000 行的形式传输。Excel 中的字符串变量最多可容纳 20 亿个字符。所以,那里没有问题。

One might wonder why I am not simply using SSIS if I am already using a local installation of SQL. Yet, the problem is that I am not the one loading all these files anymore. Using Excel to generate this "import tool" allowed me to forward these tools to others, who are now uploading all these files for me. Giving all of them access to SSIS was not an option nor the possibility of using a destined network drive where one could place these files and SSIS would automatically load them (ever 10+ minutes or so).

有人可能想知道,如果我已经在使用本地安装的 SQL,为什么我不简单地使用 SSIS。然而,问题是我不再是加载所有这些文件的人了。使用 Excel 生成这个“导入工具”让我可以将这些工具转发给其他人,他们现在正在为我上传所有这些文件。让所有人都可以访问 SSIS 既不是一种选择,也不是使用目标网络驱动器的可能性,人们可以在其中放置这些文件,SSIS 会自动加载它们(每 10 分钟以上)。

In the end my code looks something like this.

最后我的代码看起来像这样。

Set conRCServer = New ADODB.Connection
conRCServer.ConnectionString = "PROVIDER=SQLOLEDB; " _
    & "DATA SOURCE=" & Ref.Range("C2").Value2 & ";" _
    & "INITIAL CATALOG=" & Ref.Range("C4").Value & ";" _
    & "Integrated Security=SSPI "
On Error GoTo SQL_ConnectionError
conRCServer.Open
On Error GoTo 0

'Save the name of the current file
strCurrentFile = ActiveWorkbook.Name

'Prepare a dialog box for the user to pick a file and show it
'   ...if no file has been selected then exit
'   ...otherwise parse the selection into it's path and the name of the file
Call Application.FileDialog(msoFileDialogOpen).Filters.Clear
Call Application.FileDialog(msoFileDialogOpen).Filters.Add("Extracts", "*.csv")
Application.FileDialog(msoFileDialogOpen).Title = "Select ONE Extract to import..."
Application.FileDialog(msoFileDialogOpen).AllowMultiSelect = False
intChoice = Application.FileDialog(msoFileDialogOpen).Show
If intChoice <> 0 Then
    strFileToPatch = Application.FileDialog(msoFileDialogOpen).SelectedItems(1)
Else
    Exit Sub
End If

'Open the Extract for import and close it afterwards
intPointer = FreeFile()
Open strFileNameAndPath For Input Access Read Lock Read As #intPointer

intCounter = 0
strSQL = vbNullString
Do Until EOF(intPointer)
    Line Input #intPointer, strLine
    If Left(strLine, 4) = """@@@" Then Exit Sub
    '*********************************************************************
    '** Starting a new SQL command
    '*********************************************************************
    If intCounter = 0 Then
        Set rstResult = New ADODB.Recordset
        strSQL = "set nocount on; "
        strSQL = strSQL & "insert into dbo.tblTMP "
        strSQL = strSQL & "values "
    End If
    '*********************************************************************
    '** Transcribe the current line into SQL
    '*********************************************************************
    varArray = Split(strLine, ",")
    strSQL = strSQL & " (" & varArray(0) & ", " & varArray(1) & ", N'" & varArray(2) & "', "
    strSQL = strSQL & " N'" & varArray(3) & "', N'" & varArray(4) & "', N'" & varArray(5) & "', "
    strSQL = strSQL & " N'" & varArray(6) & "', " & varArray(8) & ", N'" & varArray(9) & "', "
    strSQL = strSQL & " N'" & varArray(10) & "', N'" & varArray(11) & "', N'" & varArray(12) & "', "
    strSQL = strSQL & " N'" & varArray(13) & "', N'" & varArray(14) & "', N'" & varArray(15) & "' ), "
    '*********************************************************************
    '** Execute the SQL command in bulks of 1.000
    '*********************************************************************
    If intCounter >= 1000 Then
        strSQL = Mid(strSQL, 1, Len(strSQL) - 2)
        rstResult.ActiveConnection = conRCServer
        On Error GoTo SQL_StatementError
        rstResult.Open strSQL
        On Error GoTo 0
        If Not rstResult.EOF And Not rstResult.BOF Then
            strErrorMessage = "The server returned the following error message(s):" & Chr(10)
            While Not rstResult.EOF And Not rstResult.BOF
                strErrorMessage = Chr(10) & strErrorMessage & rstResult.Fields(0).Value
                rstResult.MoveNext
            Wend
            MsgBox strErrorMessage & Chr(10) & Chr(10) & "Aborting..."
            Exit Sub
        End If
    End If
    intCounter = intCounter + 1
Loop

Close intPointer

Set rstResult = Nothing

Exit Sub

SQL_ConnectionError:
Y = MsgBox("Couldn't connect to the server. Please make sure that you have a working internet connection. " & _
            "Do you want me to prepare an error-email?", 52, "Problems connecting to Server...")
If Y = 6 Then
    Set OutApp = CreateObject("Outlook.Application")
    Set OutMail = OutApp.CreateItem(0)
    With OutMail
        .To = Ref.Range("C7").Value2
        .CC = Ref.Range("C8").Value2
        .Subject = "Problems connecting to database '" & Ref.Range("C4").Value & "' on server '" & Ref.Range("C2").Value & "'"
        .HTMLBody = "<span style=""font-size:10px"">---Automatically generated Error-Email---" & _
                "</span><br><br>Error report from the file '" & _
                "<span style=""color:blue"">" & ActiveWorkbook.Name & _
                "</span>' located and saved on '<span style=""color:blue"">" & _
                ActiveWorkbook.Path & "</span>'.<br>" & _
                "Excel is not able to establish a connection to the server. Technical data to follow." & "<br><br>" & _
                "Computer Name:    <span style=""color:green;"">" & Environ("COMPUTERNAME") & "</span><br>" & _
                "Logged in as:     <span style=""color:green;"">" & Environ("USERDOMAIN") & "/" & Environ("USERNAME") & "</span><br>" & _
                "Domain Server:    <span style=""color:green;"">" & Environ("LOGONSERVER") & "</span><br>" & _
                "User DNS Domain:  <span style=""color:green;"">" & Environ("USERDNSDOMAIN") & "</span><br>" & _
                "Operating System: <span style=""color:green;"">" & Environ("OS") & "</span><br>" & _
                "Excel Version:    <span style=""color:green;"">" & Application.Version & "</span><br>" & _
                "<br><span style=""font-size:10px""><br>" & _
                "<br><br>---Automatically generated Error-Email---"
        .Display
    End With
    Set OutMail = Nothing
    Set OutApp = Nothing
End If
Exit Sub

SQL_StatementError:
Y = MsgBox("There seems to be a problem with the SQL Syntax in the programming. " & _
            "May I send an error-email to development team?", 52, "Problems with the coding...")
If Y = 6 Then
    Set OutApp = CreateObject("Outlook.Application")
    Set OutMail = OutApp.CreateItem(0)
    With OutMail
        .To = Ref.Range("C8").Value2
        '.CC = ""
        .Subject = "Problems with the SQL Syntax in file '" & ActiveWorkbook.Name & "'."
        .HTMLBody = "<span style=""font-size:10px"">" & _
                "---Automatically generated Error-Email---" & _
                "</span><br><br>" & _
                "Error report from the file '" & _
                "<span style=""color:blue"">" & _
                ActiveWorkbook.Name & _
                "</span>" & _
                "' located and saved on '" & _
                "<span style=""color:blue"">" & _
                ActiveWorkbook.Path & _
                "</span>" & _
                "'.<br>" & _
                "It seems that there is a problem with the SQL-Code within trying to upload an extract to the server." & _
                "SQL-Code causing the problems:" & _
                "<br><br><span style=""color:green;"">" & _
                strSQL & _
                "</span><br><br><span style=""font-size:10px"">" & _
                "---Automatically generated Error-Email---"
        .Display
    End With
    Set OutMail = Nothing
    Set OutApp = Nothing
End If
Exit Sub

End Sub

回答by Rafa Barragan

i think that @Mr. Mascaro is right the easiest way to past your data from a Recordsetinto a spreadsheet is:

我认为@Mr. Mascaro 是将数据从Recordset电子表格传递到电子表格的最简单方法是:

Private Sub PopArray()
    .....
    Set rs = dbConnection.Execute("SELECT * FROM [" & SourceRange & "]")  
    '' This is faster
    Range("A1").CopyFromRecordset rs
    ''Arr = rs.GetRows
End Sub

but if you still want to use Arraysyou could try this:

但如果你仍然想使用Arrays你可以试试这个:

Sub ArrayTest  

'' Array for Test
Dim aSingleArray As Variant  
Dim aMultiArray as Variant  

'' Set values 
aSingleArray = Array("A","B","C","D","E")  
aMultiArray = Array(aSingleArray, aSingleArray)

'' You can drop data from the Array using 'Resize'
'' Btw, your Array must be transpose to use this :P
Range("A1").Resize( _
            UBound(aMultiArray(0), 1) + 1, _  
            UBound(aMultiArray, 1) + 1) = Application.Transpose(aMultiArray)

End Sub