从文本文件加载 VBA 中的格式化数据

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/1043874/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-11 10:31:39  来源:igfitidea点击:

loading formatted data in VBA from a text file

vbaformatted-input

提问by Jean-Fran?ois Corbett

I'm looking for the best way of loading formatted data in VBA. I've spent quite some time trying to find the equivalent of C-like or Fortran-like fscanftype functions, but without success.

我正在寻找在 VBA 中加载格式化数据的最佳方式。我花了相当多的时间试图找到类似 C 或 Fortran 的fscanf类型函数的等价物,但没有成功。

Basically I want to read from a text file millions of numbers placed on many (100,000's) lines with 10 numbers each (except the last line, possibly 1-10 numbers). The numbers are separated by spaces, but I don't know in advance the width of each field (and this width changes between data blocks). e.g.

基本上我想从一个文本文件中读取数百万个数字,这些数字放置在许多(100,000 个)行上,每行 10 个数字(最后一行除外,可能是 1-10 个数字)。数字用空格分隔,但我事先不知道每个字段的宽度(并且这个宽度在数据块之间变化)。例如

  397143.1   396743.1   396343.1   395943.1   395543.1   395143.1   394743.1   394343.1   393943.1   393543.1

   -0.11    -0.10    -0.10    -0.10    -0.10    -0.09    -0.09    -0.09    -0.09    -0.09

 0.171  0.165  0.164  0.162  0.158  0.154  0.151  0.145  0.157  0.209 

Previously I've used the Midfunction but in this case I can't, because I don't know in advance the width of each field. Also it's too many lines to load in an Excel sheet. I can think of a brute force way in which I look at each successive character and determine whether it's a space or a number, but it seems terribly clumsy.

以前我使用过这个Mid函数,但在这种情况下我不能,因为我事先不知道每个字段的宽度。此外,在 Excel 工作表中加载的行太多。我可以想到一种蛮力的方式,我查看每个连续的字符并确定它是空格还是数字,但这似乎非常笨拙。

I'm also interested in pointers on how to write formatted data, but this seems easier -- just format each string and concatenate them using &.

我也对如何编写格式化数据的指针感兴趣,但这似乎更容易——只需格式化每个字符串并使用&.

回答by e.James

The following snippet will read whitespace-delimited numbers from a text file:

以下代码段将从文本文件中读取以空格分隔的数字:

Dim someNumber As Double

Open "YourDataFile.txt" For Input As #1

Do While Not (EOF(1))
    Input #1, someNumber
    `// do something with someNumber here...`
Loop

Close #1

update:Here is how you could read one line at a time, with a variable number of items on each line:

更新:这是一次读取一行的方法,每行包含可变数量的项目:

Dim someNumber As Double
Dim startPosition As Long
Dim endPosition As Long
Dim temp As String

Open "YourDataFile" For Input As #1

Do While Not (EOF(1))
    startPosition = Seek(1)  '// capture the current file position'
    Line Input #1, temp      '// read an entire line'
    endPosition = Seek(1)    '// determine the end-of-line file position'
    Seek 1, startPosition    '// jump back to the beginning of the line'

    '// read numbers from the file until the end of the current line'
    Do While Not (EOF(1)) And (Seek(1) < endPosition)
        Input #1, someNumber
        '// do something with someNumber here...'
    Loop

Loop

Close #1

回答by Marc

You could also use regular expressions to replace multiple whitespaces to one space and then use the Split function for each line like the example code shows below.

您还可以使用正则表达式将多个空格替换为一个空格,然后对每一行使用 Split 函数,如下面的示例代码所示。

After 65000 rows have been processed a new sheet will be added to the Excel workbook so the source file can be bigger than the max number of rows in Excel.

处理完 65000 行后,新工作表将添加到 Excel 工作簿中,因此源文件可以大于 Excel 中的最大行数。

Dim rx As RegExp

Sub Start()

    Dim fso As FileSystemObject
    Dim stream As TextStream
    Dim originalLine As String
    Dim formattedLine As String
    Dim rowNr As Long
    Dim sht As Worksheet
    Dim shtCount As Long

    Const maxRows As Long = 65000

    Set fso = New FileSystemObject
    Set stream = fso.OpenTextFile("c:\data.txt", ForReading)

    rowNr = 1
    shtCount = 1

    Set sht = Worksheets.Add
    sht.Name = shtCount

    Do While Not stream.AtEndOfStream
        originalLine = stream.ReadLine
        formattedLine = ReformatLine(originalLine)
        If formattedLine <> "" Then
            WriteValues formattedLine, rowNr, sht
            rowNr = rowNr + 1
            If rowNr > maxRows Then
                rowNr = 1
                shtCount = shtCount + 1
                Set sht = Worksheets.Add
                sht.Name = shtCount
            End If
        End If
    Loop

End Sub


Function ReformatLine(line As String) As String

    Set rx = New RegExp

    With rx
        .MultiLine = False
        .Global = True
        .IgnoreCase = True
        .Pattern = "[\s]+"
        ReformatLine = .Replace(line, " ")
    End With

End Function


Function WriteValues(formattedLine As String, rowNr As Long, sht As Worksheet)

    Dim colNr As Long
    colNr = 1

    stringArray = Split(formattedLine, " ")
    For Each stringItem In stringArray
        sht.Cells(rowNr, colNr) = stringItem
        colNr = colNr + 1
    Next

End Function