VB.NET: split a comma-delimited string into an array using regex

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address, and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/17285981/

Time: 2020-09-17 14:01:32. Source: igfitidea.

Split comma delimited string to array using regex

Tags: regex, vb.net, csv, split

Asked by bzamfir

I have the string below, which needs to be split into an array using VB.NET:

10,"Test, t1",10.1,,,"123"

The result array must have 6 rows as below

10
Test, t1
10.1
(empty)
(empty)
123

So:
1. Quotes around strings must be removed.
2. A comma can be inside a string, and will remain there (row 2 in the result array).
3. There can be empty fields (a comma after a comma in the source string, with nothing in between).

Thanks

Answered by Joel Coehoorn

Don't use String.Split(): it's slow, and doesn't account for a number of possible edge cases.
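
To make the edge case concrete, here is a small sketch of what a plain Split does with the sample string from the question. The module and variable names are illustrative and not part of the original answer.

```vbnet
Imports System

Module SplitDemo
    Sub Main()
        ' Sample line from the question.
        Dim line As String = "10,""Test, t1"",10.1,,,""123"""

        ' Split knows nothing about quoting, so the comma inside
        ' "Test, t1" is treated as a field separator as well.
        Dim parts As String() = line.Split(","c)

        Console.WriteLine(parts.Length) ' 7, not the 6 fields wanted
        For Each part As String In parts
            ' The quotes are still attached to the quoted pieces.
            Console.WriteLine("[" & part & "]")
        Next
    End Sub
End Module
```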

Don't use RegEx. RegEx can be shoe-horned into doing this accurately, but an expression that correctly accounts for all the cases tends to be very complicated and hard to maintain, and at that point it isn't much faster than the .Split() option.
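
For illustration only, this is roughly what a regex-based split can look like. It is a sketch, not a pattern from the original answer: it splits on commas that are followed by an even number of quotes, and it assumes fields never contain escaped (doubled) quotes. The module name and variable names are made up for the example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexSplitDemo
    Sub Main()
        ' Sample line from the question.
        Dim line As String = "10,""Test, t1"",10.1,,,""123"""

        ' Match a comma only if an even number of quotes follows it,
        ' i.e. the comma is not inside a quoted field.
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*[^""]*$)"
        Dim fields As String() = Regex.Split(line, pattern)

        For i As Integer = 0 To fields.Length - 1
            ' Strip the surrounding quotes from quoted fields.
            fields(i) = fields(i).Trim(""""c)
        Next

        Console.WriteLine(fields.Length) ' 6
        For Each field As String In fields
            Console.WriteLine("[" & field & "]")
        Next
    End Sub
End Module
```

Even this limited version is hard to read, and the lookahead rescans the rest of the line for every comma, which is consistent with the maintenance and speed concerns raised above.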

Do use a dedicated CSV parser. Options include the Microsoft.VisualBasic.FileIO.TextFieldParser type, FastCSV, linq-to-csv, and a parser I wrote for another answer.
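
As a minimal sketch of the first option, the sample string can be fed to TextFieldParser through a StringReader. The module and variable names are illustrative, and a project outside Visual Basic may need a reference to the Microsoft.VisualBasic assembly.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module TextFieldParserDemo
    Sub Main()
        ' Sample line from the question.
        Dim line As String = "10,""Test, t1"",10.1,,,""123"""

        Using parser As New TextFieldParser(New StringReader(line))
            ' Treat the input as comma-delimited, with optional quoted fields.
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True

            ' ReadFields parses one line and returns its fields as an array.
            Dim fields As String() = parser.ReadFields()

            Console.WriteLine(fields.Length)
            For Each field As String In fields
                Console.WriteLine("[" & field & "]")
            Next
        End Using
    End Sub
End Module
```

For a file, the same settings apply; the parser can be constructed from a path instead, and ReadFields is then called in a loop until EndOfData is True.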

Answered by Fabian Bigler

You can write a function yourself. This should do the trick:

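' inputString holds the line to parse, e.g. 10,"Test, t1",10.1,,,"123"
Dim values As New List(Of String)
Dim currentValueIsString As Boolean = False
Dim valueSeparator As Char = ","c
Dim currentValue As String = String.Empty

For Each c As Char In inputString
    If c = """"c Then
        ' A quote toggles quoted mode; the quote itself is not stored.
        currentValueIsString = Not currentValueIsString
    ElseIf c = valueSeparator AndAlso Not currentValueIsString Then
        ' A separator outside quotes ends the current value.
        If String.IsNullOrEmpty(currentValue) Then currentValue = "(empty)"
        values.Add(currentValue)
        currentValue = String.Empty
    Else
        ' Any other character, including a comma inside quotes, is part of the value.
        currentValue &= c
    End If
Next

' The last value has no trailing separator, so add it after the loop.
If String.IsNullOrEmpty(currentValue) Then currentValue = "(empty)"
values.Add(currentValue)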

Answered by tinstaafl

Here's another simple way that loops by the delimiter instead of by character:

Public Function Parser(ByVal ParseString As String) As List(Of String)
    Const Quote As Char = """"c
    Const Comma As Char = ","c
    Dim Result As New List(Of String)
    Dim Position As Integer = 0
    Do
        Dim EndIndex As Integer
        If Position < ParseString.Length AndAlso ParseString(Position) = Quote Then
            ' Quoted field: take everything up to the closing quote.
            EndIndex = ParseString.IndexOf(Quote, Position + 1)
            If EndIndex < 0 Then EndIndex = ParseString.Length
            Result.Add(ParseString.Substring(Position + 1, EndIndex - Position - 1))
            ' Move past the closing quote (if there was one).
            Position = Math.Min(EndIndex + 1, ParseString.Length)
        Else
            ' Unquoted (possibly empty) field: take everything up to the next comma.
            EndIndex = ParseString.IndexOf(Comma, Position)
            If EndIndex < 0 Then EndIndex = ParseString.Length
            Result.Add(ParseString.Substring(Position, EndIndex - Position))
            Position = EndIndex
        End If
        ' Stop at the end of the string; otherwise step over the delimiter.
        If Position >= ParseString.Length Then Exit Do
        Position += 1
    Loop
    Return Result
End Function

This returns a list. If you must have an array, just call the ToArray method on the result when you call the function.
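
A short sketch of that conversion follows. To keep the example self-contained, a hand-built list stands in for the one the function returns; the module and variable names are illustrative.

```vbnet
Imports System
Imports System.Collections.Generic

Module ToArrayDemo
    Sub Main()
        ' Stand-in for the List(Of String) returned by the parsing function.
        Dim fieldList As New List(Of String) From {"10", "Test, t1", "10.1", "", "", "123"}

        ' ToArray copies the list contents into a new String array.
        Dim fieldArray As String() = fieldList.ToArray()

        Console.WriteLine(fieldArray.Length) ' 6
    End Sub
End Module
```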

Answered by Mataniko

Why not just use the split method?

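Dim s As String = "10,""Test, t1"",10.1,,,""123"""
' Remove every quote character, then split on commas.
s = s.Replace("""", "")
' Note: this also splits at the comma inside "Test, t1", giving 7 elements instead of 6.
Dim arr As String() = s.Split(","c)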

My VB is rusty, so consider this pseudo-code.
