C# CSV string handling

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/4432/

Date: 2020-07-31 15:57:36  Source: igfitidea

CSV string handling

Asked by Christian Hagelid

Typical way of creating a CSV string (pseudocode):

  1. Create a CSV container object (like a StringBuilder in C#).
  2. Loop through the strings you want to add, appending a comma after each one.
  3. After the loop, remove that last superfluous comma.

Code sample:

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    foreach (Contact c in contactList)
    {
        sb.Append(c.Name + ",");
    }

    sb.Remove(sb.Length - 1, 1);
    //sb.Replace(",", "", sb.Length - 1, 1)

    return sb.ToString();
}

I like the idea of adding the comma by checking if the container is empty, but doesn't that mean more processing as it needs to check the length of the string on each occurrence?

I feel that there should be an easier/cleaner/more efficient way of removing that last comma. Any ideas?

Accepted answer by David Wengier

You could use LINQ to Objects:

string[] strings = contactList.Select(c => c.Name).ToArray();
string csv = string.Join(",", strings);

Obviously that could all be done in one line, but it's a bit clearer on two.

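As a sketch of the one-line form: on .NET Framework 4 and later, string.Join has an overload that accepts IEnumerable<string>, so the intermediate array is not needed. The Contact class below is a hypothetical stand-in for the question's type, which the source does not show.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Contact type used in the question.
public class Contact
{
    public string Name { get; set; }
}

public static class CsvLinqExample
{
    // Joins the names with commas; an empty sequence yields an empty string.
    public static string ReturnAsCSV(IEnumerable<Contact> contactList)
    {
        return string.Join(",", contactList.Select(c => c.Name));
    }
}
```

With this form there is no trailing comma to remove, because string.Join only places the separator between elements.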

Answer by Chris Farmer

You could instead add the comma as the first thing inside your foreach.

if (sb.Length > 0) sb.Append(",");

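For context, here is a minimal sketch of how that line might sit inside a complete method. It takes a plain sequence of names rather than the question's ContactList, which is an assumption made to keep the example self-contained; the class and method names are illustrative only.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CsvPrefixCommaExample
{
    // Appends a comma before every name except the first,
    // using the builder's length to detect the first iteration.
    public static string JoinNames(IEnumerable<string> names)
    {
        StringBuilder sb = new StringBuilder();
        foreach (string name in names)
        {
            if (sb.Length > 0) sb.Append(",");
            sb.Append(name);
        }
        return sb.ToString();
    }
}
```

One caveat of testing the length: if the leading names are empty strings, the builder is still empty when the next name arrives, so the separators for those empty fields are dropped.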

Answer by Yaakov Ellis

How about tracking whether you are on the first item, and only adding a comma before the item if it is not the first one?

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    bool isFirst = true;

    foreach (Contact c in contactList) {
        if (!isFirst) { 
          // Only add comma before item if it is not the first item
          sb.Append(","); 
        } else {
          isFirst = false;
        }

        sb.Append(c.Name);
    }

    return sb.ToString();
}

Answer by Xenph Yan

"I like the idea of adding the comma by checking if the container is empty, but doesn't that mean more processing as it needs to check the length of the string on each occurrence?"

You're prematurely optimizing; the performance hit would be negligible.

Answer by Ishmaeel

You could also make an array of the c.Name data and use the String.Join method to create your line.

public string ReturnAsCSV(ContactList contactList)
{
    List<String> tmpList = new List<string>();

    foreach (Contact c in contactList)
    {
        tmpList.Add(c.Name);
    }

    return String.Join(",", tmpList.ToArray());
}

This might not be as performant as the StringBuilder approach, but it definitely looks cleaner.

Also, you might want to consider using CultureInfo.CurrentCulture.TextInfo.ListSeparator instead of a hard-coded comma. If your output is going to be imported into other applications, a hard-coded comma might cause problems. ListSeparator may differ across cultures, and MS Excel, at the very least, honors this setting. So:

return String.Join(
    System.Globalization.CultureInfo.CurrentCulture.TextInfo.ListSeparator,
    tmpList.ToArray());
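
A small sketch of the same idea with the culture passed in explicitly, which makes the choice of separator visible to the caller and easier to test. The class and method names are illustrative, not from the original answer.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class CsvCultureExample
{
    // Joins the values with the list separator of the given culture.
    public static string JoinForCulture(IEnumerable<string> values, CultureInfo culture)
    {
        return string.Join(culture.TextInfo.ListSeparator, values);
    }
}
```

Passing CultureInfo.CurrentCulture reproduces the behavior of the snippet above. The trade-off is that a file written under one culture's separator may not parse cleanly on a machine configured for another, so this fits best when the producer and consumer share regional settings.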

Answer by GateKiller

How about some trimming?

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();

    foreach (Contact c in contactList)
    {
        sb.Append(c.Name + ",");
    }

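    // TrimEnd leaves any leading commas alone. Note that it still strips every
    // trailing comma, so empty names at the end of the list are lost.
    return sb.ToString().TrimEnd(',');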
}

Answer by Kev

Just a thought, but remember to handle commas and quotation marks (") in the field values; otherwise your CSV file may break the consumer's reader.

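To illustrate the kind of breakage meant here, the sketch below joins values naively and then counts the fields that a simple comma-splitting reader would see. The class and method names are hypothetical and exist only for this illustration.

```csharp
public static class CsvNaiveJoinProblem
{
    // Joins values with commas and no quoting, as in the question's code.
    public static string NaiveJoin(params string[] values)
    {
        return string.Join(",", values);
    }

    // Counts the fields a simple comma-splitting reader would see in a line.
    public static int CountFieldsNaively(string csvLine)
    {
        return csvLine.Split(',').Length;
    }
}
```

For the two names `Smith, John` and `Ann`, NaiveJoin produces `Smith, John,Ann`, and a reader that splits on commas sees three fields instead of two.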

Answer by Matt Hamilton

Don't forget our old friend "for". It's not as nice-looking as foreach but it has the advantage of being able to start at the second element.

public string ReturnAsCSV(ContactList contactList)
{
    if (contactList == null || contactList.Count == 0)
        return string.Empty;

    StringBuilder sb = new StringBuilder(contactList[0].Name);

    for (int i = 1; i < contactList.Count; i++)
    {
        sb.Append(",");
        sb.Append(contactList[i].Name);
    }

    return sb.ToString();
}

You could also wrap the second Append in an "if" that tests whether the Name property contains a double-quote or a comma, and if so, escape them appropriately.

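A minimal sketch of that suggestion, assuming the names arrive as a plain list of non-null strings rather than the question's ContactList. The helper EscapeIfNeeded is a hypothetical name, and the check is applied to the first element as well as to the ones appended in the loop. It covers only the two cases named above (commas and double quotes), not line breaks or surrounding whitespace.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CsvForLoopEscapeExample
{
    // Quotes a value only when it contains a comma or a double quote,
    // doubling any embedded double quotes.
    private static string EscapeIfNeeded(string value)
    {
        if (value.Contains(",") || value.Contains("\""))
        {
            return "\"" + value.Replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    public static string JoinNames(IList<string> names)
    {
        if (names == null || names.Count == 0)
            return string.Empty;

        // Start with the first element, then prefix each later one with a comma.
        StringBuilder sb = new StringBuilder(EscapeIfNeeded(names[0]));
        for (int i = 1; i < names.Count; i++)
        {
            sb.Append(",");
            sb.Append(EscapeIfNeeded(names[i]));
        }
        return sb.ToString();
    }
}
```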

Answer by dbkk

Your code is not really compliant with the full CSV format. If you are just generating CSV from data that has no commas, leading/trailing spaces, tabs, newlines or quotes, it should be fine. However, in most real-world data-exchange scenarios, you do need the full implementation.

To generate proper CSV, you can use this:

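// DelimiterChar is not defined in the original snippet; a comma is assumed here.
const char DelimiterChar = ',';

public static String EncodeCsvLine(params String[] fields)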
{
    StringBuilder line = new StringBuilder();

    for (int i = 0; i < fields.Length; i++)
    {
        if (i > 0)
        {
            line.Append(DelimiterChar);
        }

        String csvField = EncodeCsvField(fields[i]);
        line.Append(csvField);
    }

    return line.ToString();
}

static String EncodeCsvField(String field)
{
    StringBuilder sb = new StringBuilder();
    sb.Append(field);

    // Some fields with special characters must be embedded in double quotes
    bool embedInQuotes = false;

    // Embed in quotes to preserve leading/trailing whitespace
    if (sb.Length > 0 && 
        (sb[0] == ' ' || 
         sb[0] == '\t' ||
         sb[sb.Length-1] == ' ' || 
         sb[sb.Length-1] == '\t' ))
    {
        embedInQuotes = true;
    }

    for (int i = 0; i < sb.Length; i++)
    {
        // Embed in quotes to preserve: commas, line-breaks etc.
        if (sb[i] == DelimiterChar || 
            sb[i]=='\r' || 
            sb[i]=='\n' || 
            sb[i] == '"') 
        { 
            embedInQuotes = true;
            break;
        }
    }

    // If the field itself has quotes, they must each be represented 
    // by a pair of consecutive quotes.
    sb.Replace("\"", "\"\"");

    String rv = sb.ToString();

    if (embedInQuotes)
    {
        rv = "\"" + rv + "\"";
    }

    return rv;
}

It might not be the world's most efficient code, but it has been tested. The real world sucks compared to quick sample code :)

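To make the quoting rules easier to check against sample input, here is a condensed sketch that applies the same rules as the code above. It is a restatement for illustration, not the answer's tested code, and the class name, method names, and the comma delimiter are assumptions.

```csharp
using System.Linq;

public static class CsvEncodeSketch
{
    private const char Delimiter = ','; // assumed; the answer leaves DelimiterChar undefined

    // Quotes a field when it has leading/trailing spaces or tabs, or contains
    // the delimiter, a line break, or a double quote; embedded quotes are doubled.
    public static string EncodeField(string field)
    {
        field = field ?? string.Empty;
        bool edgeWhitespace = field.Length > 0 &&
            (field[0] == ' ' || field[0] == '\t' ||
             field[field.Length - 1] == ' ' || field[field.Length - 1] == '\t');
        bool needsQuotes = edgeWhitespace ||
            field.IndexOfAny(new[] { Delimiter, '\r', '\n', '"' }) >= 0;

        string escaped = field.Replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Encodes each field and joins the results with the delimiter.
    public static string EncodeLine(params string[] fields)
    {
        return string.Join(Delimiter.ToString(), fields.Select(f => EncodeField(f)));
    }
}
```

For example, the fields `plain`, `a,b`, ` padded` (with a leading space) and `say "hi"` encode to the line `plain,"a,b"," padded","say ""hi"""`.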

Answer by Autodidact

I've used this method before. The Length property of StringBuilder is NOT read-only, so decrementing it by one truncates the last character. But you have to make sure the length is not zero to start with (which would happen if your list is empty), because setting the length to less than zero is an error.

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();

    foreach (Contact c in contactList)       
    { 
        sb.Append(c.Name + ",");       
    }

    if (sb.Length > 0)  
        sb.Length -= 1;

    return sb.ToString();  
}
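
The two claims above can be checked in isolation with a small sketch. The class and method names are illustrative; the exception type is the one StringBuilder.Length is documented to throw for a negative value.

```csharp
using System;
using System.Text;

public static class StringBuilderLengthExample
{
    // Drops the last character by shrinking Length, guarding the empty case.
    public static string DropLastChar(string text)
    {
        StringBuilder sb = new StringBuilder(text);
        if (sb.Length > 0)
            sb.Length -= 1;
        return sb.ToString();
    }

    // Returns true if setting a negative Length throws, as the answer warns.
    public static bool NegativeLengthThrows()
    {
        StringBuilder sb = new StringBuilder();
        try
        {
            sb.Length = -1;
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }
}
```

Here DropLastChar("Ann,Bob,") returns "Ann,Bob", and an empty input is returned unchanged because of the length guard.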