将 csv 文件加载到 VBA 数组而不是 Excel 工作表中
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/12259595/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Load csv file into a VBA array rather than Excel Sheet
提问by The_Barman
I am currently able to enter csv file data into Excel VBA by uploading the data via the code below then handling the table, surely not the best way as I am only interested in some of the data and delete the sheet after using the data:
我目前可以通过下面的代码上传数据然后处理表格,将 csv 文件数据输入到 Excel VBA 中,这肯定不是最好的方法,因为我只对某些数据感兴趣并在使用数据后删除表格:
Sub CSV_Import()
Dim ws As Worksheet, strFile As String
Set ws = ActiveSheet 'set to current worksheet name
strFile = Application.GetOpenFilename("Text Files (*.csv),*.csv", ,"Please select text file...")
With ws.QueryTables.Add(Connection:="TEXT;" & strFile, Destination:=ws.Range("A1"))
.TextFileParseType = xlDelimited
.TextFileCommaDelimiter = True
.Refresh
End With
End Sub
Is it possible to simply load the csv into a two dimensional variant array in VBA rather than going through the use of an excel worksheet?
是否可以简单地将 csv 加载到 VBA 中的二维变量数组中,而不是通过使用 excel 工作表?
采纳答案by The_Barman
OK, after looking into this, the solution I have arived at is to use ADODB (requires reference to ActiveX Data Objects, this loads the csv file into array without cycling the rows columns. Does require the data to be in good condition.
好的,在研究了这个之后,我想到的解决方案是使用 ADODB(需要引用 ActiveX 数据对象,这将 csv 文件加载到数组中而不循环行列。确实需要数据处于良好状态。
Sub LoadCSVtoArray()
strPath = ThisWorkbook.Path & "\"
Set cn = CreateObject("ADODB.Connection")
strcon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strPath & ";Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
cn.Open strcon
strSQL = "SELECT * FROM SAMPLE.csv;"
Dim rs As Recordset
Dim rsARR() As Variant
Set rs = cn.Execute(strSQL)
rsARR = WorksheetFunction.Transpose(rs.GetRows)
rs.Close
Set cn = Nothing
[a1].Resize(UBound(rsARR), UBound(Application.Transpose(rsARR))) = rsARR
End Sub
回答by Nigel Heffernan
Okay, looks like you need two things: stream the data from the file, and populate a 2-D array.
好的,看起来您需要两件事:从文件中流式传输数据,并填充一个二维数组。
I have a 'Join2d' and a 'Split2d' function lying around (I recall posting them in another reply on StackOverflow a while ago). Do look at the comments in the code, there are things you might need to know about efficient string-handling if you're handling large files.
我有一个 'Join2d' 和一个 'Split2d' 函数(我记得不久前在 StackOverflow 上的另一个回复中发布了它们)。请查看代码中的注释,如果您正在处理大文件,您可能需要了解一些有关有效字符串处理的信息。
However, it's not a complicated function to use: just paste the code if you're in a hurry.
但是,它的使用并不复杂:如果您有急事,只需粘贴代码即可。
Streaming the file is simple BUT we're making assumptions about the file format: are the lines in the file delimited by Carriage-Return characters or the Carriage-Return-and-Linefeed character pair? I'm assuming 'CR' rather than CRLF, but you need to check that.
流式传输文件很简单,但我们对文件格式进行了假设:文件中的行是由回车字符还是回车符和换行符字符对分隔的?我假设是“CR”而不是 CRLF,但您需要检查一下。
Another assumption about the format is that numeric data will appear as-is, and string or character data will be encapsulated in quote marks. This shouldbe true, but often isn't... And stripping out the quote marks adds a lot of processing - lots of allocating and deallocating strings - which you really don't want to be doing in a large array. I've short-cut the obvious cell-by-cell find-and-replace, but it's still an issue on large files.
关于格式的另一个假设是数字数据将按原样显示,字符串或字符数据将封装在引号中。这应该是真的,但通常不是......并且去掉引号会增加很多处理——大量的分配和解除分配字符串——你真的不想在一个大数组中做这些。我已经缩短了明显的逐个单元格查找和替换的方法,但它仍然是大文件的问题。
If your file has commas embedded in the string values, this code won't work: and don't try to code up a parser that picks out the encapsulated text and skips these embedded commas when splitting-up the rows of data into individual fields, because this intensive string-handling can't be optimised into a fast and reliable csv reader by VBA.
如果您的文件在字符串值中嵌入了逗号,则此代码将不起作用:并且不要尝试编写一个解析器来挑选出封装的文本并在将数据行拆分为单个字段时跳过这些嵌入的逗号,因为这种密集的字符串处理无法通过 VBA 优化为快速可靠的 csv 阅读器。
Anyway: here's the source code: watch out for line-breaks inserted by StackOverflow's textbox control:
无论如何:这是源代码:注意 StackOverflow 的文本框控件插入的换行符:
Running the code:
运行代码:
Note that you'll need a reference to the Microsoft Scripting Runtime (system32\scrrun32.dll)
请注意,您需要引用 Microsoft Scripting Runtime (system32\scrrun32.dll)
Private Sub test()
Dim arrX As Variant
arrX = ArrayFromCSVfile("MyFile.csv")
End Sub
Streaming a csv file.
流式传输 csv 文件。
Note that I'm assuming your file is in the temp folder: C:\Documents and Settings[$USERNAME]\Local Settings\Temp You'll need to use filesystem commands to copy the file into a local folder: it's always quicker than working across the network.
请注意,我假设您的文件位于临时文件夹中:C:\Documents and Settings[$USERNAME]\Local Settings\Temp 您需要使用文件系统命令将文件复制到本地文件夹中:它总是比跨网络工作。
Public Function ArrayFromCSVfile( _
strName As String, _
Optional RowDelimiter As String = vbCr, _
Optional FieldDelimiter = ",", _
Optional RemoveQuotes As Boolean = True _
) As Variant
' Load a file created by FileToArray into a 2-dimensional array
' The file name is specified by strName, and it is exected to exist
' in the user's temporary folder. This is a deliberate restriction:
' it's always faster to copy remote files to a local drive than to
' edit them across the network
' RemoveQuotes=TRUE strips out the double-quote marks (Char 34) that
' encapsulate strings in most csv files.
On Error Resume Next
Dim objFSO As Scripting.FileSystemObject
Dim arrData As Variant
Dim strFile As String
Dim strTemp As String
Set objFSO = New Scripting.FileSystemObject
strTemp = objFSO.GetSpecialFolder(Scripting.TemporaryFolder).ShortPath
strFile = objFSO.BuildPath(strTemp, strName)
If Not objFSO.FileExists(strFile) Then ' raise an error?
Exit Function
End If
Application.StatusBar = "Reading the file... (" & strName & ")"
If Not RemoveQuotes Then
arrData = Join2d(objFSO.OpenTextFile(strFile, ForReading).ReadAll, RowDelimiter, FieldDelimiter)
Application.StatusBar = "Reading the file... Done"
Else
' we have to do some allocation here...
strTemp = objFSO.OpenTextFile(strFile, ForReading).ReadAll
Application.StatusBar = "Reading the file... Done"
Application.StatusBar = "Parsing the file..."
strTemp = Replace$(strTemp, Chr(34) & RowDelimiter, RowDelimiter)
strTemp = Replace$(strTemp, RowDelimiter & Chr(34), RowDelimiter)
strTemp = Replace$(strTemp, Chr(34) & FieldDelimiter, FieldDelimiter)
strTemp = Replace$(strTemp, FieldDelimiter & Chr(34), FieldDelimiter)
If Right$(strTemp, Len(strTemp)) = Chr(34) Then
strTemp = Left$(strTemp, Len(strTemp) - 1)
End If
If Left$(strTemp, 1) = Chr(34) Then
strTemp = Right$(strTemp, Len(strTemp) - 1)
End If
Application.StatusBar = "Parsing the file... Done"
arrData = Split2d(strTemp, RowDelimiter, FieldDelimiter)
strTemp = ""
End If
Application.StatusBar = False
Set objFSO = Nothing
ArrayFromCSVfile = arrData
Erase arrData
End Function
Split2dCreates a 2-dimensional VBA array from a string:
Split2d从字符串创建一个二维 VBA 数组:
Public Function Split2d(ByRef strInput As String, _
Optional RowDelimiter As String = vbCr, _
Optional FieldDelimiter = vbTab, _
Optional CoerceLowerBound As Long = 0 _
) As Variant
' Split up a string into a 2-dimensional array.
' Works like VBA.Strings.Split, for a 2-dimensional array.
' Check your lower bounds on return: never assume that any array in
' VBA is zero-based, even if you've set Option Base 0
' If in doubt, coerce the lower bounds to 0 or 1 by setting
' CoerceLowerBound
' Note that the default delimiters are those inserted into the
' string returned by ADODB.Recordset.GetString
On Error Resume Next
' Coding note: we're not doing any string-handling in VBA.Strings -
' allocating, deallocating and (especially!) concatenating are SLOW.
' We're using the VBA Join & Split functions ONLY. The VBA Join,
' Split, & Replace functions are linked directly to fast (by VBA
' standards) functions in the native Windows code. Feel free to
' optimise further by declaring and using the Kernel string functions
' if you want to.
' ** THIS CODE IS IN THE PUBLIC DOMAIN **
' Nigel Heffernan Excellerando.Blogspot.com
Dim i As Long
Dim j As Long
Dim i_n As Long
Dim j_n As Long
Dim i_lBound As Long
Dim i_uBound As Long
Dim j_lBound As Long
Dim j_uBound As Long
Dim arrTemp1 As Variant
Dim arrTemp2 As Variant
arrTemp1 = Split(strInput, RowDelimiter)
i_lBound = LBound(arrTemp1)
i_uBound = UBound(arrTemp1)
If VBA.LenB(arrTemp1(i_uBound)) <= 0 Then
' clip out empty last row: a common artifact in data
'loaded from files with a terminating row delimiter
i_uBound = i_uBound - 1
End If
i = i_lBound
arrTemp2 = Split(arrTemp1(i), FieldDelimiter)
j_lBound = LBound(arrTemp2)
j_uBound = UBound(arrTemp2)
If VBA.LenB(arrTemp2(j_uBound)) <= 0 Then
' ! potential error: first row with an empty last field...
j_uBound = j_uBound - 1
End If
i_n = CoerceLowerBound - i_lBound
j_n = CoerceLowerBound - j_lBound
ReDim arrData(i_lBound + i_n To i_uBound + i_n, j_lBound + j_n To j_uBound + j_n)
' As we've got the first row already... populate it
' here, and start the main loop from lbound+1
For j = j_lBound To j_uBound
arrData(i_lBound + i_n, j + j_n) = arrTemp2(j)
Next j
For i = i_lBound + 1 To i_uBound Step 1
arrTemp2 = Split(arrTemp1(i), FieldDelimiter)
For j = j_lBound To j_uBound Step 1
arrData(i + i_n, j + j_n) = arrTemp2(j)
Next j
Erase arrTemp2
Next i
Erase arrTemp1
Application.StatusBar = False
Split2d = arrData
End Function
Join2DTurns a 2-dimensional VBA array to a string:
Join2D将二维 VBA 数组转换为字符串:
Public Function Join2d(ByRef InputArray As Variant, _
Optional RowDelimiter As String = vbCr, _
Optional FieldDelimiter = vbTab, _
Optional SkipBlankRows As Boolean = False _
) As String
' Join up a 2-dimensional array into a string. Works like the standard
' VBA.Strings.Join, for a 2-dimensional array.
' Note that the default delimiters are those inserted into the string
' returned by ADODB.Recordset.GetString
On Error Resume Next
' Coding note: we're not doing any string-handling in VBA.Strings -
' allocating, deallocating and (especially!) concatenating are SLOW.
' We're using the VBA Join & Split functions ONLY. The VBA Join,
' Split, & Replace functions are linked directly to fast (by VBA
' standards) functions in the native Windows code. Feel free to
' optimise further by declaring and using the Kernel string functions
' if you want to.
' ** THIS CODE IS IN THE PUBLIC DOMAIN **
' Nigel Heffernan Excellerando.Blogspot.com
Dim i As Long
Dim j As Long
Dim i_lBound As Long
Dim i_uBound As Long
Dim j_lBound As Long
Dim j_uBound As Long
Dim arrTemp1() As String
Dim arrTemp2() As String
Dim strBlankRow As String
i_lBound = LBound(InputArray, 1)
i_uBound = UBound(InputArray, 1)
j_lBound = LBound(InputArray, 2)
j_uBound = UBound(InputArray, 2)
ReDim arrTemp1(i_lBound To i_uBound)
ReDim arrTemp2(j_lBound To j_uBound)
For i = i_lBound To i_uBound
For j = j_lBound To j_uBound
arrTemp2(j) = InputArray(i, j)
Next j
arrTemp1(i) = Join(arrTemp2, FieldDelimiter)
Next i
If SkipBlankRows Then
If Len(FieldDelimiter) = 1 Then
strBlankRow = String(j_uBound - j_lBound, FieldDelimiter)
Else
For j = j_lBound To j_uBound
strBlankRow = strBlankRow & FieldDelimiter
Next j
End If
Join2d = Replace(Join(arrTemp1, RowDelimiter), strBlankRow, RowDelimiter, "")
i = Len(strBlankRow & RowDelimiter)
If Left(Join2d, i) = strBlankRow & RowDelimiter Then
Mid$(Join2d, 1, i) = ""
End If
Else
Join2d = Join(arrTemp1, RowDelimiter)
End If
Erase arrTemp1
End Function
Share and enjoy.
分享和享受。
回答by Siddharth Rout
Yes read it as a text file.
是的,将其作为文本文件读取。
See this example
看这个例子
Option Explicit
Sub Sample()
Dim MyData As String, strData() As String
Open "C:\MyFile.CSV" For Binary As #1
MyData = Space$(LOF(1))
Get #1, , MyData
Close #1
strData() = Split(MyData, vbCrLf)
End Sub
FOLLOWUP
跟进
Like I mentioned below in the comments, AFAIK, there is no direct way of filling a 2d Array from a csv. You will have to use the code that I gave above and then split it per line and finally filling up a 2D array which can be cumbersome. Filling up a column is easy but if you specifically want say from Row 5 to Col 7 Data then it becomes cumbersome as you will have to check if there are sufficient columns/rows in the data. Here is a basic example to get Col B in a 2D Array.
就像我在下面的评论中提到的那样,AFAIK,没有直接的方法可以从 csv 填充二维数组。您将不得不使用我上面给出的代码,然后将其每行拆分,最后填充一个可能很麻烦的二维数组。填充一列很容易,但如果您特别想说从第 5 行到第 7 列数据,那么它就会变得很麻烦,因为您必须检查数据中是否有足够的列/行。这是在 2D 数组中获取 Col B 的基本示例。
NOTE: I have not done any error handling. I am sure you can take care of that.
注意:我没有做任何错误处理。我相信你可以解决这个问题。
Let's say our CSV File looks likes this.
假设我们的 CSV 文件看起来像这样。
When you run this code
当您运行此代码时
Option Explicit
Const Delim As String = ","
Sub Sample()
Dim MyData As String, strData() As String, TmpAr() As String
Dim TwoDArray() As String
Dim i As Long, n As Long
Open "C:\Users\Siddharth Rout\Desktop\Sample.CSV" For Binary As #1
MyData = Space$(LOF(1))
Get #1, , MyData
Close #1
strData() = Split(MyData, vbCrLf)
n = 0
For i = LBound(strData) To UBound(strData)
If Len(Trim(strData(i))) <> 0 Then
TmpAr = Split(strData(i), Delim)
n = n + 1
ReDim Preserve TwoDArray(1, 1 To n)
'~~> TmpAr(1) : 1 for Col B, 0 would be A
TwoDArray(1, n) = TmpAr(1)
End If
Next i
For i = 1 To n
Debug.Print TwoDArray(1, i)
Next i
End Sub
You will get the output as shown below
您将获得如下所示的输出
BTW, I am curious that since you are doing this in Excel, why not use inbuilt Workbooks.Open
or QueryTables
method and then read the range into a 2D array? That would be much simpler...
顺便说一句,我很好奇,既然你在 Excel 中这样做,为什么不使用内置Workbooks.Open
或QueryTables
方法,然后将范围读入二维数组?那会简单很多...
回答by NigelH
To get a known format csv data file into a 2D array I finally adopted the following method, which seems to work well and is quite quick. I decided that file read operations are fairly fast nowadays, so I run a first pass on the csv file to get the size required for both dimension of the array. With the array suitably dimensioned it is then a simple task to re-read the file, line by line, and populate the array.
为了将已知格式的 csv 数据文件转换为二维数组,我最终采用了以下方法,该方法似乎运行良好且速度非常快。我决定现在文件读取操作相当快,所以我对 csv 文件运行了第一遍,以获得数组的两个维度所需的大小。使用适当的数组大小,然后一行一行地重新读取文件并填充数组是一项简单的任务。
Function ImportTestData(ByRef srcFile As String, _
ByRef dataArr As Variant) _
As Boolean
Dim FSO As FileSystemObject, Fo As TextStream
Dim line As String, Arr As Variant
Dim lc As Long, cc As Long
Dim i As Long, j As Long
ImportTestData = False
Set FSO = CreateObject("Scripting.FilesystemObject")
Set Fo = FSO.OpenTextFile(srcFile)
' First pass; read the file to get array size
lc = 0 ' Counter for number of lines in the file
cc = 0 ' Counter for number of columns in the file
While Not Fo.AtEndOfStream ' Read the csv file line by line
line = Fo.ReadLine
If lc = 0 Then ' Count commas to get array's 2nd dim index
cc = 1 + Len(line) - Len(Replace(line, ",", ""))
End If
lc = lc + 1
Wend
Fo.Close
' Set array dimensions to accept file contents
ReDim dataArr(0 To lc - 1, 0 To cc - 1)
'Debug.Print "CSV has "; n; " rows with "; lc; " fields/row"
If lc > 1 And cc > 1 Then
ImportTestData = True
End If
' Second pass; Re-open data file and copy to array
Set Fo = FSO.OpenTextFile(srcFile)
lc = 0
While Not Fo.AtEndOfStream
line = Fo.ReadLine
Arr = Split(line, ",")
For i = 0 To UBound(Arr)
dataArr(lc, i) = Arr(i)
Next i
lc = lc + 1
Wend
End Function 'ImportTestData()
I created this as a Function rather than a Sub to get a simple return value, if required.
Reading a file with 8,500 rows of 20 columns takes approximately 180ms.
This method assumes that the structure (number of delimiters) of the CSV file is the same for every row, typical of a data logging application.
如果需要,我将其创建为 Function 而不是 Sub 以获得简单的返回值。读取一个包含 8,500 行 20 列的文件大约需要 180 毫秒。
此方法假定 CSV 文件的结构(分隔符的数量)对于每一行都是相同的,这是典型的数据记录应用程序。
回答by Toten Hosen
The following solution does not use ActiveX:
以下解决方案不使用 ActiveX:
I wrote code to import a csv (actually tab-separated) file into an array. That code is the following.
我编写了将 csv(实际上是制表符分隔的)文件导入数组的代码。该代码如下。
First let's designate the array (initially it is completely void but it will be resized appropriately later):
首先让我们指定数组(最初它是完全无效的,但稍后会适当调整大小):
Dim TxtFile$()
Now for the sub-procedure:
现在为子过程:
' Fills TxtFile$() array
Sub FillTextFileArray(A$)
'***********************************************************************
' Declarations
'***********************************************************************
Dim I, J As Integer
Dim LineString As String
'***********************************************************************
I = -1: J = 0 ' Will hold array dimentions
Open A$ For Input As #1
Do While Not EOF(1) ' Loop until end of file.
Line Input #1, LineString
LineString = LineString + vbTab ' If not done empty lines give error with Split()
I = I + 1
If J < UBound(Split(LineString, vbTab)) Then J = UBound(Split(LineString, vbTab))
Loop
ReDim TxtFile$(1 To I + 4, 1 To J + 4) ' Not indexed from 0 ! (Plus some room at the end.) This is done to match worksheet format.
Seek #1, 1 ' Reset to start
I = -1 ' Will hold array row index
Do While Not EOF(1) ' Loop until end of file.
Line Input #1, LineString
LineString = LineString + vbTab ' If not done empty lines give error with Split()
I = I + 1
For J = 0 To UBound(Split(LineString, vbTab))
TxtFile$(I + 1, J + 1) = Split(LineString, vbTab)(J)
Next J
Loop
Close #1 ' Close file.
' TxtFile$() now holds the contents of the text file
End Sub
Obviously you can then do what you want with the TxtFile$ array. A$ is the location and name of the text file. As already said, this particular code works with tab-delimited files (vbTab), not comma-delimited (separated), but any adaptation should not be too difficult. It has the advantage of avoiding ActiveX complications.
显然,您可以使用 TxtFile$ 数组做您想做的事情。A$ 是文本文件的位置和名称。如前所述,这个特殊的代码适用于制表符分隔的文件 (vbTab),而不是逗号分隔的(分隔的),但任何改编都不应该太困难。它具有避免 ActiveX 并发症的优点。
回答by agcala
Alternatively you can use a code like this
或者,您可以使用这样的代码
Dim line As String, Arr
Dim FSO As Object, Fo As Object
Set FSO = CreateObject("Scripting.FileSystemObject")
Set Fo = FSO.OpenTextFile("csvfile.csv")
While Not Fo.AtEndOfStream
line = Fo.ReadLine ' Read the csv file line by line
Arr = Split(line, ",") ' The csv line is loaded into the Arr as an array
For i = 0 To UBound(Arr) - 1: Debug.Print Arr(i) & " ";: Next
Debug.Print
Wend
01/01/2019 1 1 1 36 55.6 0.8 85.3 95 95 109 102 97 6 2.5 2.5 3.9
01/01/2019 1 2 0 24 0.0 2.5 72.1 89 0 0 97 95 10 6.7 4.9 3.9
01/01/2019 1 3 1 36 26.3 4 80.6 92 92 101 97 97 8 5.5 5.3 3.7
01/01/2019 1 4 0 16 30.0 8 79.2 75 74 87 87 86 10 3.8 4 4.2