C#: What is a more unique delimiter than comma for separating strings?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/815782/

Time: 2020-08-05 02:18:10  Source: igfitidea

What is a more unique delimiter than comma for separating strings?

c# delimited-text

Asked by KingNestor

I have several textboxes where users can enter information. The input can include commas, so I can't use standard comma-delimited strings.

What is a good delimiter, one that users don't typically use in their writing, to mark where the strings should be separated? I'm going to combine these fields into a single string and pass it to my encryption method. After I decrypt it, I need to be able to reliably separate the fields.

I'm using C# if it matters.

Accepted answer by Chad Grant

| would be next on my list; it is often used as an alternative to CSV. Google "pipe delimited" and you will find many examples.

string[] items = new string[] {"Uno","Dos","Tres"};

string toEncrypt = String.Join("|", items);

items = toEncrypt.Split(new char[] {'|'}, StringSplitOptions.RemoveEmptyEntries);

foreach(string s in items)
  Console.WriteLine(s);

And since everyone likes to be a critic about the encoding and not provide the code, here is one way to encode the text so your | delim won't collide.

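// Requires: using System; using System.Text;
string[] items = new string[] {"Uno","Dos","Tres"};

// Base64 output never contains '|', so encoded fields cannot collide with the delimiter.
for (int i = 0; i < items.Length; i++)
    items[i] = Convert.ToBase64String(Encoding.UTF8.GetBytes(items[i]));

string toEncrypt = String.Join("|", items);

// Note: RemoveEmptyEntries drops empty fields; use StringSplitOptions.None to keep them.
items = toEncrypt.Split(new char[] {'|'}, StringSplitOptions.RemoveEmptyEntries);

foreach (string s in items)
    Console.WriteLine(Encoding.UTF8.GetString(Convert.FromBase64String(s)));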
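
If some textboxes may be left empty, the split probably needs to keep empty entries so that field positions survive the round trip. Here is a minimal sketch of the same Base64 idea wrapped into two helpers. The names `FieldPacker`, `Pack` and `Unpack` are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper; the class and method names are illustrative only.
public static class FieldPacker
{
    private const char Delimiter = '|';

    // Base64-encodes each field, then joins the encoded fields with '|'.
    public static string Pack(params string[] fields)
    {
        return string.Join(Delimiter.ToString(),
            fields.Select(f => Convert.ToBase64String(Encoding.UTF8.GetBytes(f))));
    }

    // Splits on '|' (empty entries are kept) and decodes each field.
    public static string[] Unpack(string packed)
    {
        return packed.Split(Delimiter)
            .Select(p => Encoding.UTF8.GetString(Convert.FromBase64String(p)))
            .ToArray();
    }
}
```

For example, `FieldPacker.Pack("Uno", "Dos")` gives `VW5v|RG9z`. One caveat of this sketch: zero fields and a single empty field both pack to an empty string, so the caller should know how many fields to expect.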

Answer by Promit

The backtick. Nobody uses the backtick.

Answer by Rob

The pipe character (|), perhaps? If your user base is remotely IT-shy, then this approach (asking them to delimit their text) might not be the best one to take; you could try something else, e.g. provide some means of dynamically adding a text box on the fly which accepts another string, etc.

If you provide a little more information about what you're doing, and for whom, it might be possible for someone to suggest an alternative approach.

Answer by Tim Robinson

Newline? (i.e. use a multi-line text box)

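With a multi-line text box, each line is one item, so the split happens on line breaks. A small sketch, assuming the whole text arrives as a single string; the helper name `SplitLines` is made up for illustration.

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class LineFields
{
    // Splits on Windows (\r\n) and Unix (\n) line endings; empty lines are kept.
    public static string[] SplitLines(string text)
    {
        return text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
    }
}
```

Commas, pipes and any other punctuation pass through untouched, because only line breaks separate items.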

Answer by mP.

The best solution is to stick with commas and introduce support for character escaping. Whatever character you select will eventually need to be entered by someone, so you may as well provide support for this.

Think backslashes + double quotes inside double-quoted strings.

Don't pick a character like backtick because some users might not know how to type it in...

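One way to read the "backslashes + double quotes" suggestion is sketched below: every field is wrapped in double quotes, and a backslash escapes any quote or backslash inside it, so commas need no special handling. This is only one possible interpretation; the `QuotedFields` class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper; the names are illustrative only.
public static class QuotedFields
{
    // Wraps each field in double quotes; a backslash escapes '"' and '\' inside.
    public static string Join(IEnumerable<string> fields)
    {
        var parts = new List<string>();
        foreach (string field in fields)
        {
            string escaped = field.Replace("\\", "\\\\").Replace("\"", "\\\"");
            parts.Add("\"" + escaped + "\"");
        }
        return string.Join(",", parts);
    }

    public static List<string> Split(string joined)
    {
        var result = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        bool escaped = false;

        foreach (char c in joined)
        {
            if (!inQuotes)
            {
                // Outside quotes only an opening quote matters; commas are skipped.
                if (c == '"') inQuotes = true;
            }
            else if (escaped)
            {
                // The character after a backslash is taken literally.
                current.Append(c);
                escaped = false;
            }
            else if (c == '\\')
            {
                escaped = true;
            }
            else if (c == '"')
            {
                // An unescaped closing quote ends the field.
                result.Add(current.ToString());
                current.Clear();
                inQuotes = false;
            }
            else
            {
                current.Append(c);
            }
        }
        return result;
    }
}
```

Because every field is quoted, an empty field is stored as `""` and an empty list as an empty string, so the two stay distinguishable.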

Answer by Blerta

I would suggest using ";"

Answer by Frank Rosario

When possible, I prefer to use as my delimiter a combination of characters that a normal person would be unlikely to enter. For example, I've used ")^&^(" and set it up as a const "cDelimiter" in my code, then concatenated all of my fields with that. By using a small unique string, I greatly reduce the likelihood of the user accidentally entering my delimiter. A user entering a | or a ~ is admittedly unlikely, but that doesn't mean it won't happen.
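
A rough sketch of the multi-character delimiter approach follows. The delimiter value comes from the answer; the `DelimitedFields` class, its method names, and the choice to throw when a field already contains the delimiter are assumptions added for illustration.

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class DelimitedFields
{
    // The delimiter quoted in the answer; any unlikely sequence would do.
    private const string Delimiter = ")^&^(";

    public static string Join(params string[] fields)
    {
        foreach (string field in fields)
        {
            // Unlikely is not impossible: fail loudly rather than corrupt the data.
            if (field.Contains(Delimiter))
                throw new ArgumentException("Field contains the delimiter sequence.");
        }
        return string.Join(Delimiter, fields);
    }

    public static string[] Split(string joined)
    {
        // The string[] overload treats the whole sequence as one separator.
        return joined.Split(new[] { Delimiter }, StringSplitOptions.None);
    }
}
```

The check in `Join` trades convenience for safety: rejecting the rare colliding input up front is simpler than escaping, but the user then has to be told why the input was refused.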

Answer by Colin Burnett

Any of the less common characters, such as pipe |, backtick `, tilde ~, bang !, or semicolon ;, would probably work. However, if you go this route you are really venturing away from usability. Asking users to escape commas with a backslash or something similar is begging for them to miss one.

If CSV is not possible, then you should consider changing your UI. (Heck, you should stay away from CSV for user input anyway!) You say textbox, so I assume you're on the web or in some kind of WinForms or WPF application (definitely not a console). All of those give you better UI options than a single textbox that forces users to conform to a difficult UI design.

More information would definitely help better guide answers.

However, here is an example of escaping a comma with a backslash. Note that you cannot escape the backslash before a comma with this, so @"uno, dos, tr\\,es" will end up as {"uno", " dos", @"tr\,es"}.

// Requires: using System.Collections.Generic;
string data = @"uno, dos, tr\,es";
string[] items = data.Split(','); // {"uno", " dos", @"tr\", "es"}
List<string> realitems = new List<string>();
// Walk backwards so an item ending in a backslash can be glued to the item after it.
for (int i = items.Length - 1; i >= 0; i--)
{
    string item = items[i];
    if (item.Length == 0) { realitems.Insert(0, ""); continue; }

    if (realitems.Count == 0) { realitems.Insert(0, item); }
    else
    {
        // A trailing backslash means the comma was escaped: drop the backslash and rejoin.
        if (item[item.Length - 1] == '\\') { realitems[0] = item.Substring(0, item.Length - 1) + "," + realitems[0]; }
        else { realitems.Insert(0, item); }
    }
}

// Should end up with {"uno", " dos", "tr,es"}

Answer by Chris Doggett

I figure eventually, every character is going to be used by someone. Users always find a way to break our HL7 parser.

Instead of a single character, maybe try a string that would be random enough that nobody'd ever use it. Something like "#!@!#".

Answer by LukeH

Will the user be entering delimited strings into the textboxes, or will they be entering individual strings which will then be built into delimited strings by your code?

In the first case, it might be better to rethink your UI instead. For example, the user could enter one string at a time into a textbox and click an "Add to list" button after each one.

In the second case it doesn't really matter what delimiter you use. Choose any character you like, just ensure that you escape any other occurrences of that character.

EDIT

Since several comments on other answers are asking for code, here's a method to create a comma-delimited string, using backslash as the escape character:

public static string CreateDelimitedString(IEnumerable<string> items)
{
    StringBuilder sb = new StringBuilder();

    foreach (string item in items)
    {
        sb.Append(item.Replace("\\", "\\\\").Replace(",", "\\,"));
        sb.Append(",");
    }

    return (sb.Length > 0) ? sb.ToString(0, sb.Length - 1) : string.Empty;
}

And here's the method to convert that comma-delimited string back to a collection of individual strings:

public static IEnumerable<string> GetItemsFromDelimitedString(string s)
{
    bool escaped = false;
    StringBuilder sb = new StringBuilder();

    foreach (char c in s)
    {
        if ((c == '\\') && !escaped)
        {
            escaped = true;
        }
        else if ((c == ',') && !escaped)
        {
            yield return sb.ToString();
            sb.Remove(0, sb.Length);
        }
        else
        {
            sb.Append(c);
            escaped = false;
        }
    }

    yield return sb.ToString();
}

And here's some example usage:

string[] test =
    {
        "no commas or backslashes",
        "just one, comma",
        @"a comma, and a\ backslash",
        @"lots, of\ commas,\ and\, backslashes",
        @"even\ more,, commas\ and,, backslashes"
    };

string delimited = CreateDelimitedString(test);
Console.WriteLine(delimited);

foreach (string item in GetItemsFromDelimitedString(delimited))
{
    Console.WriteLine(item);
}