
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/12867926/


Excel VBA process csv string into array

Tags: excel, vba, csv, split

Asked by Gavin

I have a CSV string (UTF-8) obtained via an HTTP download.

The number of columns can differ from one download to the next, but within any single string every row has the same number of columns and the data is contiguous (the data will be even, with no ragged rows).

The string could contain any number of rows.


The first row will always be the headings.


String fields will be encased in double quotes and could contain commas, quotes and newlines.


Single and double quotes inside a string are escaped by doubling, so "" and ''.

In other words, this is well-formed CSV. Excel, through its standard file-open mechanism, has no problem formatting this data.

However, I want to avoid saving to a file and then opening the CSV, as I will need to process the output in some cases, or even merge it with existing data on a worksheet.

(Added via edit) The Excel application will be distributed to various destinations, and I want to avoid potential permissions issues if possible. Writing nothing to disk seems like a good way to do that.

I am thinking of something like the following pseudocode:

Dim csvRows As Variant, csvRow As Variant, fields As Variant
csvRows = Split(csvString, vbCrLf)   ' won't work: newlines can occur inside string fields

For Each csvRow In csvRows
    fields = Split(csvRow, ",")      ' won't work: commas can occur inside string fields
Next csvRow

Obviously that can't handle fields containing special tokens.

What is a solid way of parsing this data?


Thanks


EDIT 13/10/2012 Data Samples


The CSV as it would appear in Notepad (note that not all line breaks will be \r\n; some could be \n):

LanguageID,AssetID,String,TypeID,Gender
3,50820,"A string of natural language",3,0
3,50819,"Complex text, with comma, "", '' and new line
all being valid",3,0
3,50818,"Some more language",3,0

The same CSV in Excel 2010, opened from the shell (double-click, no extra options): [screenshot not included]

Accepted answer by chris neilsen

I can think of three possibilities:


  1. Use Regular Expressions to process the text. There are plenty of examples available on SO and via google for separating strings like this.
  2. Use the power of Excel: save the text to a temp file, open into a temp sheet and read the data off the sheet. Delete the file and sheet when done.
  3. Use ADO to query the data. Save the string to a temp file and run a query on that to return the fields you want.
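
None of the options above is spelled out, and two of them write a temp file. As a further possibility that is not part of the original answer, the string can be parsed in memory by walking it one character at a time and tracking whether the current position is inside a quoted field. The following is a minimal sketch under stated assumptions: the function name ParseCsv and all variable names are hypothetical, the input is assumed to be well-formed CSV already decoded into a VBA string, the column count is taken from the first row, and doubled single quotes ('') are passed through unchanged.

```vba
Option Explicit

' Hypothetical helper (not from the original answer).
' Parses well-formed CSV text into a 1-based, 2-D Variant array (rows x columns).
' Returns Empty when the text contains no records.
Public Function ParseCsv(ByVal csvText As String) As Variant
    Dim records As New Collection   ' one Collection of fields per record
    Dim fields As New Collection    ' fields of the record being read
    Dim field As String             ' text of the field being read
    Dim inQuotes As Boolean         ' True while inside a double-quoted field
    Dim pending As Boolean          ' True when a record has been started but not stored
    Dim i As Long, ch As String

    i = 1
    Do While i <= Len(csvText)
        ch = Mid$(csvText, i, 1)
        pending = True
        If inQuotes Then
            If ch = """" Then
                If Mid$(csvText, i + 1, 1) = """" Then
                    field = field & """"    ' doubled quote becomes one literal quote
                    i = i + 1
                Else
                    inQuotes = False        ' closing quote
                End If
            Else
                field = field & ch          ' commas and newlines are plain data here
            End If
        Else
            Select Case ch
                Case """"
                    inQuotes = True
                Case ","
                    fields.Add field
                    field = vbNullString
                Case vbCr, vbLf
                    ' Treat CR LF as a single record separator; a lone LF or CR also ends a record
                    If ch = vbCr And Mid$(csvText, i + 1, 1) = vbLf Then i = i + 1
                    fields.Add field
                    field = vbNullString
                    records.Add fields
                    Set fields = New Collection
                    pending = False
                Case Else
                    field = field & ch
            End Select
        End If
        i = i + 1
    Loop

    ' Store the last record when the text does not end with a line break
    If pending Then
        fields.Add field
        records.Add fields
    End If
    If records.Count = 0 Then Exit Function

    ' Copy into a rectangular array; the column count comes from the first (heading) row
    Dim result() As Variant, rec As Collection
    Dim r As Long, c As Long, colCount As Long
    Set rec = records(1)
    colCount = rec.Count
    ReDim result(1 To records.Count, 1 To colCount)
    For r = 1 To records.Count
        Set rec = records(r)
        For c = 1 To rec.Count
            If c <= colCount Then result(r, c) = rec(c)
        Next c
    Next r
    ParseCsv = result
End Function
```

Because the result is a 2-D array, it could be processed in memory or written to a sheet in one assignment, for example `Range("A1").Resize(UBound(data, 1), UBound(data, 2)).Value = data`, where `data` holds the return value. Every field comes back as a string, so any numeric conversion is left to the caller or to Excel when the values reach a worksheet. Appending one character at a time is simple but can be slow on very large inputs.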

To offer any more specific advice, I would need samples of input data and expected output.

Answer by Daniel

If you don't mind putting the data in your workbook, you could use a blank worksheet, add the data in one column, then call TextToColumns. Then, if you want to get the data back as an array, just load it from the UsedRange of the worksheet.

'Dim myArray As Variant 'Uncomment if storing the data in an array.
'Assumes csvString is already defined.
'Uses a sheet named "Temp" as scratch space.
With Sheets("Temp")
    .Cells.Delete
    .Cells(1, 1).Value = csvString
    'Destination is qualified with a leading dot so it refers to the Temp sheet, not the active sheet.
    .Cells(1, 1).TextToColumns Destination:=.Cells(1, 1), DataType:=xlDelimited, _
        TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=False, _
        Semicolon:=False, Comma:=True, Space:=False, Other:=False
    'myArray = .UsedRange.Value 'Uncomment if storing the data in an array.
End With