vb.net TextFieldParser class
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/16588454/
TextFieldParser Class
Asked by optimusprime
I am using the TextFieldParser class to read a comma-separated value (.csv) file. Fields in this file are enclosed in double quotes, like "Field1","Field2".
So, to read the file, I've set the HasFieldsEnclosedInQuotes property of the TextFieldParser object to True. But I get a MalformedLineException when any of the fields contains a double quote (") at the beginning.
Example: ""Field2"with additional". Here I should see "Field2" with additional as output.
However, if the " is anywhere except the first position, it works fine.
For example, a line with "Field2 "with" additional" works perfectly fine and gives me Field2 "with" additional as output.
Does anyone have the same issue? Is there any way I can resolve it?
This is my code:
    Private Sub ReadTextFile(ByVal txtFilePath As String)
        Dim myReader As New Microsoft.VisualBasic.FileIO.TextFieldParser(txtFilePath)
        myReader.Delimiters = New String() {","}
        myReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
        myReader.HasFieldsEnclosedInQuotes = True
        myReader.TrimWhiteSpace = True
        Dim currentRow As String()
        Dim headerRow As Integer = 0
        While Not myReader.EndOfData
            Try
                currentRow = myReader.ReadFields()
                'Read Header
                If (headerRow = 0) Then
                    'Do work for Header Row
                    headerRow += 1
                Else
                    'Do work for Data Row
                End If
            Catch ex As Exception
                'ErrorLine holds the raw text of the line that could not be parsed
                Dim errorline As String = myReader.ErrorLine
            End Try
        End While
        myReader.Close()
    End Sub
This is my data in the CSV file:
    "Column1","Column2","Column3"
    "Value1","Value2",""A" Block in Building 123"
Answered by Heinzi
Your example ""A" Block" is malformed CSV; thus, TextFieldParser has every right to reject it. The CSV standard (RFC 4180) says:
7. If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
If you encode your data correctly, i.e., ...
"Column1","Column2","Column3"
"Value1","Value2","""A"" Block in Building 123"
... TextFieldParser works fine and correctly returns "A" Block in Building 123.
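A minimal sketch of that claim, so it can be checked without a file on disk. The module name, the ParseCsv helper, and feeding the parser from a StringReader instead of a file path are assumptions made for illustration; the parser settings mirror the ones in the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module ParseValidCsvDemo
    ' Hypothetical helper: parses CSV text whose embedded quotes are doubled.
    Function ParseCsv(csvText As String) As String()()
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(New StringReader(csvText))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True
            While Not parser.EndOfData
                rows.Add(parser.ReadFields())
            End While
        End Using
        Return rows.ToArray()
    End Function

    Sub Main()
        ' Single quotes stand in for double quotes to keep the literals readable;
        ' Replace turns them into real double quotes before parsing.
        Dim csvText As String =
            "'Column1','Column2','Column3'" & Environment.NewLine &
            "'Value1','Value2','''A'' Block in Building 123'"
        csvText = csvText.Replace("'"c, """"c)

        For Each row As String() In ParseCsv(csvText)
            Console.WriteLine(String.Join(" | ", row))
        Next
    End Sub
End Module
```

With the correctly escaped data, the third field of the data row should come back as "A" Block in Building 123, with the doubled quotes collapsed to single ones.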
So, the first step would be to tell the guy producing the CSV file to create a valid CSV file instead of something-that-looks-like-CSV-but-isn't.
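On the producing side, valid output only requires doubling every embedded quote before wrapping the field in quotes. A rough sketch of that, assuming the producer is also written in VB.NET; QuoteCsvField and the module name are hypothetical names, not part of any library.

```vbnet
Imports System
Imports System.Linq

Module WriteValidCsvDemo
    ' Hypothetical helper: doubles embedded quotes, then encloses the field in quotes.
    Function QuoteCsvField(value As String) As String
        Return """" & value.Replace("""", """""") & """"
    End Function

    Sub Main()
        Dim fields As String() = {"Value1", "Value2", """A"" Block in Building 123"}
        ' Join the quoted fields with commas to form one CSV line.
        Dim line As String = String.Join(",", fields.Select(Function(f) QuoteCsvField(f)))
        Console.WriteLine(line)
    End Sub
End Module
```

The line written this way has the same shape as the correctly encoded data row shown above.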
If you cannot do that, you might want to make two passes through the file:
- Fix the file by converting it into a "valid" CSV file (for example by replacing quotes not followed or preceded by a comma by two quotes).
- Then, TextFieldParser can parse the "valid" CSV file without trouble.
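The first pass could be sketched with a regular expression along these lines. This is one possible reading of the heuristic, not a general CSV repair tool: it doubles any quote that has a non-comma character on both sides. The names RepairLine, RepairFile, and the module name are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Linq
Imports System.Text.RegularExpressions

Module RepairCsvDemo
    ' A quote with a non-comma character on both sides. Quotes at the start or
    ' end of the line have no character on one side, so they never match.
    Private ReadOnly StrayQuote As New Regex("(?<=[^,])""(?=[^,])")

    ' Hypothetical helper: doubles every stray quote in one line.
    Function RepairLine(line As String) As String
        Return StrayQuote.Replace(line, """""")
    End Function

    ' Pass 1: write a repaired copy of the file. Pass 2 (not shown) would open
    ' the repaired copy with TextFieldParser as in the question.
    Sub RepairFile(sourcePath As String, fixedPath As String)
        File.WriteAllLines(fixedPath,
                           File.ReadLines(sourcePath).Select(Function(line) RepairLine(line)))
    End Sub

    Sub Main()
        ' Single quotes stand in for double quotes to keep the literal readable.
        Dim broken As String =
            "'Value1','Value2',''A' Block in Building 123'".Replace("'"c, """"c)
        Console.WriteLine(RepairLine(broken))
    End Sub
End Module
```

Two caveats about this heuristic: it cannot tell an embedded quote that sits next to a comma from a real field boundary, and it would double the quotes of a file that is already correctly escaped, so it should only be run on input known to be malformed in this particular way.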
Answered by Alex Filipovici
[Original answer]
Try this:
    using System;
    using System.IO;
    using System.Linq;

    class Test
    {
        static void Main()
        {
            var file = "Test.txt";
            // Naive split on every comma; a comma inside a quoted field will also split it
            var r = File.ReadAllLines(file)
                .Select((i, index) => new { Line = index, Fields = i.Split(new char[] { ',' }) });
            // header
            var header = r.First();
            // do work for header: drop the first and last character (the enclosing quotes)
            for (int j = 0; j < header.Fields.Length; j++)
            {
                Console.Write("{0} ", header.Fields[j].Substring(1, header.Fields[j].Length - 2));
            }
            Console.WriteLine();
            var rows = r.Skip(1).ToList();
            // do work for rows: Trim removes all leading and trailing quotes from each field
            for (int i = 0; i < rows.Count; i++)
            {
                for (int j = 0; j < rows[i].Fields.Length; j++)
                {
                    Console.Write("{0} ", rows[i].Fields[j].Trim(new[] { '"' }));
                }
                Console.WriteLine();
            }
        }
    }
Note: I'm posting in C# since the question is still being tagged with it.
As the C# tag is gone, please refer to http://converter.telerik.com/ for help in converting the code to VB.
[Updated answer]
Trying a different approach (this time, in VB.Net):
    Imports System
    Imports System.IO
    Imports System.Linq

    Class Test
        Public Shared Sub Main()
            Dim filePath = "Test.txt"
            ' Strip the outer quotes of each line, then split on the "," sequence between fields
            Dim r = File.ReadAllLines(filePath).Select(Function(i, index) New With { _
                .Line = index, _
                .Fields = i.Substring(1, i.Length - 2).Split(New String() {""","""}, StringSplitOptions.None) _
            })
            ' header
            Dim header = r.First()
            ' do work for header
            For j As Integer = 0 To header.Fields.Length - 1
                Console.Write("{0} ", header.Fields(j))
            Next
            Console.WriteLine()
            Dim rows = r.Skip(1).ToList()
            ' do work for rows
            For i As Integer = 0 To rows.Count - 1
                For j As Integer = 0 To rows(i).Fields.Length - 1
                    Console.Write("{0} ", rows(i).Fields(j))
                Next
                Console.WriteLine()
            Next
        End Sub
    End Class

