C#: Parsing a .csv file into a 2D array

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/18806757/

Date: 2020-08-10 13:15:29  Source: igfitidea

Parsing .csv file into 2d array

Tags: c#, csv

Question

I'm trying to parse a CSV file into a 2D array in C#. I'm having a very strange issue; here is my code:

string filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";
StreamReader sr = new StreamReader(filePath);
data = null; 
int Row = 0;
while (!sr.EndOfStream)
{
    string[] Line = sr.ReadLine().Split(',');
    if (Row == 0)
    {
        data = new string[Line.Length, Line.Length];
    }
    for (int column = 0; column < Line.Length; column++)
    {
        data[Row, column] = Line[column];
    }
    Row++;
    Console.WriteLine(Row);
}

My .csv file has 87 rows, but something strange happens during execution: it reads the first 15 rows into the data array exactly as expected, but when it reaches the data[Row, column] = Line[column]; line for the 16th time, it seems to break out of the entire loop (without meeting the sr.EndOfStream condition) and reads no more data into the data array.

Can anyone explain what might be happening?


Accepted answer by Khan

Nothing in your code gets the number of lines out of your file in time to use it.


Line.Length represents the number of columns in your csv, but it looks like you're also trying to use it to specify the number of lines in your file.

This should get you your expected result:


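// Requires: using System; using System.Collections.Generic; using System.IO;
string filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";
var lines = new List<string[]>();
int Row = 0;
// Dispose the reader when done so the file handle is released.
using (StreamReader sr = new StreamReader(filePath))
{
    while (!sr.EndOfStream)
    {
        string[] Line = sr.ReadLine().Split(',');
        lines.Add(Line); // the list grows as needed, so the row count need not be known up front
        Row++;
        Console.WriteLine(Row);
    }
}

// Note: this yields a jagged array (string[][]), not a rectangular string[,].
var data = lines.ToArray();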
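
The snippet above ends with a jagged array (string[][]) rather than the rectangular string[,] that the question's code allocates. If a true 2D array is needed, the collected rows can be copied into one once the row count is known. The following is only a sketch: the CsvGrid class and ToRectangular helper are hypothetical names, not part of the answer, and padding short rows with empty strings is an assumption about the desired behaviour.

```csharp
using System;
using System.Collections.Generic;

public static class CsvGrid
{
    // Hypothetical helper: copies a list of rows into a rectangular string[,].
    // Rows shorter than the widest row are padded with empty strings (an assumption).
    public static string[,] ToRectangular(IReadOnlyList<string[]> rows)
    {
        // The width is the length of the longest row.
        int width = 0;
        foreach (string[] row in rows)
        {
            width = Math.Max(width, row.Length);
        }

        // Rows and columns are sized independently, unlike new string[Line.Length, Line.Length].
        var grid = new string[rows.Count, width];
        for (int r = 0; r < rows.Count; r++)
        {
            for (int c = 0; c < width; c++)
            {
                grid[r, c] = c < rows[r].Length ? rows[r][c] : string.Empty;
            }
        }
        return grid;
    }

    public static void Main()
    {
        // Stand-in for the rows collected by the reading loop.
        var lines = new List<string[]>
        {
            "a,b,c".Split(','),
            "d,e".Split(','),
        };
        string[,] data = ToRectangular(lines);
        Console.WriteLine(data.GetLength(0) + " x " + data.GetLength(1)); // prints "2 x 3"
    }
}
```

The row count comes from the list and the column count from the widest row, so neither dimension depends on the other.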

Answer by matthewrdev

Without knowing the contents of your csv file, I would assume that the error is generated by this line:


if (Row == 0)
{
    data = new string[Line.Length, Line.Length];
}

By initialising the total number of rows to the number of columns in the first line of the csv, you are assuming that the number of rows always equals the number of columns.

As soon as the number of rows exceeds the number of columns in the first line of the csv, you will overrun the data array by attempting to access a row that isn't there.
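
The overrun can be reproduced without any file. The sketch below assumes a csv with 15 columns, which is inferred from the symptom in the question (failure on the 16th row) and not stated there; OverrunDemo and FirstFailingRow are hypothetical names used only for illustration.

```csharp
using System;

public static class OverrunDemo
{
    // Allocates a columns x columns array, as the question's code does, and tries to
    // store rowCount rows. Returns the zero-based index of the first row that cannot
    // be stored, or -1 if every row fits.
    public static int FirstFailingRow(int columns, int rowCount)
    {
        var data = new string[columns, columns]; // same shape as new string[Line.Length, Line.Length]
        for (int row = 0; row < rowCount; row++)
        {
            try
            {
                data[row, 0] = "x";
            }
            catch (IndexOutOfRangeException)
            {
                // The row index has run past the first dimension of the array.
                return row;
            }
        }
        return -1;
    }

    public static void Main()
    {
        // 15 columns is an assumption; 87 rows is the figure given in the question.
        Console.WriteLine(FirstFailingRow(15, 87)); // prints 15, i.e. the 16th row
    }
}
```

If the exception is swallowed or unobserved somewhere up the call stack, the loop would appear to stop silently, which would match the behaviour described in the question.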

You can simplify your code by changing data to be a list, which allows items to be added dynamically:

string filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";
StreamReader sr = new StreamReader(filePath);
List<string[]> data = new List<string[]>();
int Row = 0;
while (!sr.EndOfStream)
{
    string[] Line = sr.ReadLine().Split(',');
    data.Add(Line);
    Row++;
    Console.WriteLine(Row);
}

Answer by Pavel Bastov

A shorter version of the code above:


var filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";
var data = File.ReadLines(filePath).Select(x => x.Split(',')).ToArray();

Note the use of ReadLines instead of ReadAllLines, which is more efficient on larger files, as per the MSDN documentation:

When you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient.

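
The difference is easiest to see when only part of the file is needed. The sketch below reads just the header row; LazyReadDemo and ReadHeader are hypothetical names, and the temporary file with made-up contents stands in for the real csv.

```csharp
using System;
using System.IO;
using System.Linq;

public static class LazyReadDemo
{
    // Returns the fields of the first line only. File.ReadLines yields lines lazily,
    // so First() stops enumerating after one line instead of loading every line.
    public static string[] ReadHeader(string path)
    {
        return File.ReadLines(path).First().Split(',');
    }

    public static void Main()
    {
        // Temporary file with made-up contents, used only for this illustration.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllLines(path, new[] { "id,name,price", "1,ore,10", "2,gas,20" });
            Console.WriteLine(string.Join(" | ", ReadHeader(path))); // prints "id | name | price"
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

In the one-liner above, the final ToArray() still materializes every row, so the saving there is mainly that no intermediate array of raw lines is built first.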

Answer by JeffS

This is the same as posted by Pavel, but it ignores empty lines that may cause your program to crash.


var filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";

string[][] data = File.ReadLines(filePath).Where(line => line != "").Select(x => x.Split(',')).ToArray();
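
To see why blank lines matter: splitting an empty string yields a single empty field, so any later code that indexes a second column on that row fails. The sketch below is a variation, not the answer's code. It uses string.IsNullOrWhiteSpace, which also skips lines containing only spaces; BlankLineDemo and ParseLines are hypothetical names.

```csharp
using System;
using System.Linq;

public static class BlankLineDemo
{
    // Splits each non-blank line on commas. IsNullOrWhiteSpace also drops
    // whitespace-only lines, which a comparison against "" would keep.
    public static string[][] ParseLines(string[] rawLines)
    {
        return rawLines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(','))
            .ToArray();
    }

    public static void Main()
    {
        string[] raw = { "a,b", "", "   ", "c,d" };

        // An empty line splits into one empty field, so row[1] would be out of range.
        Console.WriteLine("".Split(',').Length);   // prints 1
        Console.WriteLine(ParseLines(raw).Length); // prints 2
    }
}
```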

Answer by David

With Open File Dialog


// OpenFileDialog and DialogResult come from System.Windows.Forms.
OpenFileDialog opn = new OpenFileDialog();

if (opn.ShowDialog() == DialogResult.OK)
{
    List<string[]> data = new List<string[]>();
    int Row = 0;

    // Dispose the reader when done so the file handle is released.
    using (StreamReader sr = new StreamReader(opn.FileName))
    {
        while (!sr.EndOfStream)
        {
            string[] Line = sr.ReadLine().Split(',');
            data.Add(Line);
            Row++;
            Console.WriteLine(Row);
        }
    }
}