C#: Read fixed width record from text file
Original question: http://stackoverflow.com/questions/162727/
Source: StackOverflow. This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Chris Karcher
I've got a text file full of records where each field in each record is a fixed width. My first approach would be to parse each record simply using string.Substring(). Is there a better way?
For example, the format could be described as:
<Field1(8)><Field2(16)><Field3(12)>
And an example file with two records could look like:
SomeData0000000000123456SomeMoreData
Data2   0000000000555555MoreData
I just want to make sure I'm not overlooking a more elegant way than Substring().
Update: I ultimately went with a regex, as Killersponge suggested:
// Field widths follow the layout above: <Field1(8)><Field2(16)><Field3(12)>.
const string REGEX_LOT = "^(?<Field1>.{8})" +
                         "(?<Field2>.{16})" +
                         "(?<Field3>.{12})";
private readonly Regex reLot = new Regex(REGEX_LOT, RegexOptions.Compiled);
I then use the following to access the fields:
Match match = reLot.Match(record);
string field1 = match.Groups["Field1"].Value;
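The snippet above can be rounded out into a small runnable sketch. The class name RegexRecordDemo and the TryParse helper are made up for illustration, and the widths 8, 16 and 12 are taken from the layout description in the question. Checking match.Success matters because a line that is too short simply does not match.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of the original post.
public static class RegexRecordDemo
{
    // Widths assumed from <Field1(8)><Field2(16)><Field3(12)>.
    private static readonly Regex Record = new Regex(
        "^(?<Field1>.{8})(?<Field2>.{16})(?<Field3>.{12})",
        RegexOptions.Compiled);

    // Returns false (and empty fields) when the line is shorter than 36 characters.
    public static bool TryParse(string line, out string f1, out string f2, out string f3)
    {
        Match m = Record.Match(line);
        f1 = m.Success ? m.Groups["Field1"].Value : string.Empty;
        f2 = m.Success ? m.Groups["Field2"].Value : string.Empty;
        f3 = m.Success ? m.Groups["Field3"].Value : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        if (TryParse("SomeData0000000000123456SomeMoreData",
                     out string a, out string b, out string c))
        {
            Console.WriteLine(a + "|" + b + "|" + c);
        }
    }
}
```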
Accepted answer by Jon Skeet
Substring sounds good to me. The only downside I can immediately think of is that it means copying the data each time, but I wouldn't worry about that until you prove it's a bottleneck. Substring is simple :)
You could use a regex to match a whole record at a time and capture the fields, but I think that would be overkill.
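For comparison, here is a minimal sketch of the plain Substring approach. The class name FixedWidthSubstringDemo and the ParseRecord helper are illustrative only, the widths come from the question's layout, and the record is assumed to be at least 36 characters long.

```csharp
using System;

// Hypothetical helper; not part of the original answer.
public static class FixedWidthSubstringDemo
{
    // Widths assumed from <Field1(8)><Field2(16)><Field3(12)>.
    private static readonly int[] Widths = { 8, 16, 12 };

    // Walks the record left to right, cutting one field per width.
    // Throws ArgumentOutOfRangeException if the record is too short.
    public static string[] ParseRecord(string record)
    {
        var fields = new string[Widths.Length];
        int offset = 0;
        for (int i = 0; i < Widths.Length; i++)
        {
            fields[i] = record.Substring(offset, Widths[i]);
            offset += Widths[i];
        }
        return fields;
    }

    public static void Main()
    {
        string[] fields = ParseRecord("SomeData0000000000123456SomeMoreData");
        Console.WriteLine(string.Join("|", fields));
        // prints "SomeData|0000000000123456|SomeMoreData"
    }
}
```

Keeping the widths in one array means a layout change touches a single line rather than every Substring call.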
Answer by Will Hartung
Nope, Substring is fine. That's what it's for.
Answer by Sekhat
You may have to watch out: if the ends of the lines aren't padded out with spaces to fill the last field, your Substring call won't work without a bit of fiddling to work out how much more of the line there is to read. This of course only applies to the last field :)
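One way to do that fiddling, sketched under the assumption that a short last field should be treated as if it were space-padded: pad the line to the full record length before slicing. The names PaddedRecordDemo and ParseRecord are hypothetical, and trimming the text fields is a choice made here, not something the answer prescribes.

```csharp
using System;

// Hypothetical helper; not part of the original answer.
public static class PaddedRecordDemo
{
    // Total width assumed from <Field1(8)><Field2(16)><Field3(12)>.
    private const int RecordLength = 8 + 16 + 12;

    public static string[] ParseRecord(string record)
    {
        // PadRight leaves longer strings untouched and fills shorter ones with spaces,
        // so the Substring calls below never run past the end of the line.
        string padded = record.PadRight(RecordLength);
        return new[]
        {
            padded.Substring(0, 8).TrimEnd(),
            padded.Substring(8, 16),
            padded.Substring(24, 12).TrimEnd(),
        };
    }

    public static void Main()
    {
        // The last field is only 8 characters long, with no trailing padding.
        string[] fields = ParseRecord("Data2   0000000000555555MoreData");
        Console.WriteLine(string.Join("|", fields));
        // prints "Data2|0000000000555555|MoreData"
    }
}
```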
Answer by Leandro Oliveira
Use FileHelpers.
Example:
[FixedLengthRecord()]
public class MyData
{
    [FieldFixedLength(8)]
    public string someData;

    [FieldFixedLength(16)]
    public int SomeNumber;

    [FieldFixedLength(12)]
    [FieldTrim(TrimMode.Right)]
    public string someMoreData;
}
Then, it's as simple as this:
var engine = new FileHelperEngine<MyData>();
// To Read Use:
var res = engine.ReadFile("FileIn.txt");
// To Write Use:
engine.WriteFile("FileOut.txt", res);
Answer by Jon Limjap
Unfortunately out of the box the CLR only provides Substring for this.
Someone over at CodeProject made a custom parser that uses attributes to define fields; you might want to look at that.
Answer by Soraz
You could set up an ODBC data source for the fixed format file, and then access it as any other database table. This has the added advantage that specific knowledge of the file format is not compiled into your code for that fateful day that someone decides to stick an extra field in the middle.
Answer by Colonel Panic
Why reinvent the wheel? Use .NET's TextFieldParser class per this how-to for Visual Basic.
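Although the how-to targets Visual Basic, the class lives in the Microsoft.VisualBasic.FileIO namespace and can be called from C# as well. The following is a rough sketch, not the how-to's own code: the class name TextFieldParserDemo and the ParseAll helper are made up for illustration, the widths come from the question's layout, and the sample data is fed from a string rather than a file. Depending on the project, a reference to the Microsoft.VisualBasic assembly may need to be added.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper; not part of the original answer.
public static class TextFieldParserDemo
{
    // Reads every record from the given text using fixed field widths.
    public static List<string[]> ParseAll(string text)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(text)))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(8, 16, 12); // widths assumed from the question
            parser.TrimWhiteSpace = true;     // strip padding from each field
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    public static void Main()
    {
        // Both sample lines are padded to the full 36 characters.
        string data = "SomeData0000000000123456SomeMoreData\n" +
                      "Data2   0000000000555555MoreData    \n";
        foreach (string[] fields in ParseAll(data))
        {
            Console.WriteLine(string.Join("|", fields));
        }
    }
}
```

To read from disk instead, the parser can be constructed from a file path rather than a StringReader.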