C#: How can I split this string into an array?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/483702/
Question
My string is as follows:
smtp:[email protected];SMTP:[email protected];X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;
I need to get back:
smtp:[email protected]
SMTP:[email protected]
X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;
The problem is that the semicolons both separate the addresses and appear inside the X400 address. Can anyone suggest how best to split this?
PS: I should have mentioned that the order can differ, so it could be:
X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;;smtp:[email protected];SMTP:[email protected]
There can be more than three addresses (4, 5, ... 10, etc.), including an X500 address; however, they all start with one of smtp:, SMTP:, X400 or X500.
Answer by Jon Skeet
EDIT: With the updated information, this answer certainly won't do the trick, but it's still potentially useful, so I'll leave it here.
Will you always have three parts, and you just want to split on the first two semicolons?
If so, just use the overload of Split which lets you specify the number of substrings to return:
string[] bits = text.Split(new char[]{';'}, 3);
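To make the behaviour of that overload concrete, here is a minimal sketch. The class name, method name and e-mail addresses are hypothetical (the addresses in the question are obfuscated), and it only gives the desired result when the X400 entry comes last, which the question's update rules out in general.

```csharp
using System;

public static class SplitCountDemo
{
    // Splits on ';' but returns at most maxParts substrings;
    // the last substring keeps whatever semicolons remain.
    public static string[] SplitFirst(string text, int maxParts)
    {
        return text.Split(new char[] { ';' }, maxParts);
    }

    public static void Main()
    {
        // Hypothetical sample: two SMTP entries followed by the X400 entry.
        string text = "smtp:john@example.com;SMTP:jane@example.com;"
                    + "X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;";

        foreach (string bit in SplitFirst(text, 3))
        {
            Console.WriteLine(bit);
        }
    }
}
```

With the X400 entry first, the same call would cut the X400 address apart, which is why the answer is flagged as not fitting the updated question.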
Answer by The.Anti.9
Check http://msdn.microsoft.com/en-us/library/c1bs0eda.aspx for the Split overload that lets you specify the number of substrings you want. In your case you would do:
string[] bits = text.Split(new char[] { ';' }, 3);
Answer by Amy B
Split on the semicolon (;), then loop over the result, recombining each element that contains no colon (:) with the previous element.
string input = "X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G="
    + "Black;;smtp:[email protected];SMTP:[email protected]";
string[] rawSplit = input.Split(';');
List<string> result = new List<string>();
// now the fun begins
string buffer = string.Empty;
foreach (string s in rawSplit)
{
    if (buffer == string.Empty)
    {
        // first piece: start a new address
        buffer = s;
    }
    else if (s.Contains(':'))
    {
        // a colon marks the start of the next address
        result.Add(buffer);
        buffer = s;
    }
    else
    {
        // no colon: this piece belongs to the current address
        buffer += ";" + s;
    }
}
result.Add(buffer);
foreach (string s in result)
    Console.WriteLine(s);
Answer by Rob
This piqued my curiosity. This code actually does the job, but again, it wants tidying :)
My final attempt; stop changing what you need ;=)
static void Main(string[] args)
{
    string fneh = "X400:C=US400;A= ;P=Test;O=Exchange;S=Hyman;G=Black;x400:C=US400l;A= l;P=Testl;O=Exchangel;S=Hymanl;G=Blackl;smtp:[email protected];X500:C=US500;A= ;P=Test;O=Exchange;S=Hyman;G=Black;SMTP:[email protected];";
    string[] parts = fneh.Split(new char[] { ';' });
    List<string> addresses = new List<string>();
    StringBuilder address = new StringBuilder();
    foreach (string part in parts)
    {
        if (part.Contains(":"))
        {
            // a colon marks the start of a new address
            if (address.Length > 0)
            {
                addresses.Add(semiColonCorrection(address.ToString()));
            }
            address = new StringBuilder();
            address.Append(part);
        }
        else
        {
            // no colon: this piece belongs to the current address
            address.AppendFormat(";{0}", part);
        }
    }
    addresses.Add(semiColonCorrection(address.ToString()));
    foreach (string emailAddress in addresses)
    {
        Console.WriteLine(emailAddress);
    }
    Console.ReadKey();
}

// Ensures X400/X500 addresses end with a semicolon.
private static string semiColonCorrection(string address)
{
    if ((address.StartsWith("x", StringComparison.InvariantCultureIgnoreCase)) && (!address.EndsWith(";")))
    {
        return string.Format("{0};", address);
    }
    else
    {
        return address;
    }
}
Answer by Rad
Try these regexes. You can extract what you're looking for using named groups.
X400:(?<X400>.*?)(?:smtp|SMTP|$)
smtp:(?<smtp>.*?)(?:;+|$)
SMTP:(?<SMTP>.*?)(?:;+|$)
Make sure you specify the case-insensitive option when constructing them. They seem to work with the samples you gave.
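As a rough illustration of reading those named groups from C#, the sketch below applies the X400 pattern and the smtp pattern with RegexOptions.IgnoreCase. The class name, method names and e-mail addresses are assumptions made for the example, and X500 addresses are not covered because the patterns above do not mention them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Returns the text after "X400:" up to the next "smtp" (any case)
    // or the end of the input; empty string when there is no X400 entry.
    public static string ExtractX400(string input)
    {
        Match m = Regex.Match(input, @"X400:(?<X400>.*?)(?:smtp|SMTP|$)",
                              RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["X400"].Value : string.Empty;
    }

    // Returns each value following "smtp:" (any case), up to the next
    // run of semicolons or the end of the input.
    public static List<string> ExtractSmtp(string input)
    {
        var values = new List<string>();
        foreach (Match m in Regex.Matches(input, @"smtp:(?<smtp>.*?)(?:;+|$)",
                                          RegexOptions.IgnoreCase))
        {
            values.Add(m.Groups["smtp"].Value);
        }
        return values;
    }

    public static void Main()
    {
        // Hypothetical sample in the shape of the question's first string.
        string input = "smtp:john@example.com;SMTP:jane@example.com;"
                     + "X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;";

        Console.WriteLine("X400:" + ExtractX400(input));
        foreach (string value in ExtractSmtp(input))
        {
            Console.WriteLine(value);
        }
    }
}
```

Note that the groups capture only the part after the prefix, and that with the case-insensitive option the smtp and SMTP patterns match the same entries, so one of them is enough.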
Answer by Greg
Not the fastest if you are doing this a lot, but I believe it will work for all cases.
string input1 = "smtp:[email protected];SMTP:[email protected];X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;";
string input2 = "X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;;smtp:[email protected];SMTP:[email protected]";
Regex splitEmailRegex = new Regex(@"(?<key>\w+?):(?<value>.*?)(\w+:|$)");
List<string> sets = new List<string>();
while (input2.Length > 0)
{
    // take the first key:value pair, then drop it from the front of the input
    Match m1 = splitEmailRegex.Matches(input2)[0];
    string s1 = m1.Groups["key"].Value + ":" + m1.Groups["value"].Value;
    sets.Add(s1);
    input2 = input2.Substring(s1.Length);
}
foreach (var set in sets)
{
    Console.WriteLine(set);
}
Console.ReadLine();
Of course, many will quote the regex adage: now you have two problems. There may well be a better regex answer than this.
Answer by Samuel
You could always split on the colon and use a little logic to grab the key and value.
string[] bits = text.Split(':');
List<string> values = new List<string>();
for (int i = 1; i < bits.Length; i++)
{
    // the value is everything up to and including the last semicolon of this piece
    string value = bits[i].Contains(';') ? bits[i].Substring(0, bits[i].LastIndexOf(';') + 1) : bits[i];
    // the key is whatever follows the last semicolon of the previous piece
    string key = bits[i - 1].Contains(';') ? bits[i - 1].Substring(bits[i - 1].LastIndexOf(';') + 1) : bits[i - 1];
    values.Add(String.Concat(key, ":", value));
}
I tested it with both of your samples and it works fine.
Answer by Orion Adrian
May I suggest building a regular expression:
(smtp|SMTP|X400|X500):((?!smtp:|SMTP:|X400:|X500:).)*;?
or, protocol-less:
.*?:((?![^:;]*:).)*;?
In other words, find anything that starts with one of your protocols. Match the colon. Then keep matching characters as long as you are not at the start of another protocol. Finish with an optional semicolon.
You can then go through the list of matches, splitting each on ':', and you'll have your protocols. If you want to support more protocols, just add them to the list.
However, you will likely want to make the whole expression case-insensitive and list each protocol only once, in either uppercase or lowercase.
The protocol-less version doesn't care what the protocols are called. It finds them all the same, by matching everything up to, but excluding, a run of characters (containing no colon or semicolon) that is followed by a colon.
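A minimal sketch of running the protocol-based pattern from C# follows. It assumes the case-insensitive option suggested above, so each protocol is listed once; the class name, method name and e-mail addresses are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ProtocolMatchDemo
{
    // Each protocol is listed once because the regex runs case-insensitively.
    private const string Pattern =
        @"(smtp|X400|X500):((?!smtp:|X400:|X500:).)*;?";

    // Returns every substring that starts at a protocol prefix and runs
    // until just before the next protocol prefix (or the end of input).
    public static List<string> FindAddresses(string input)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern, RegexOptions.IgnoreCase))
        {
            found.Add(m.Value);
        }
        return found;
    }

    public static void Main()
    {
        // Hypothetical sample in the shape of the question's first string.
        string input = "smtp:john@example.com;SMTP:jane@example.com;"
                     + "X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;";

        foreach (string address in FindAddresses(input))
        {
            Console.WriteLine(address);
        }
    }
}
```

Each match keeps the semicolon that separates it from the next entry, so the smtp entries would still need their trailing semicolon trimmed to match the output requested in the question.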
Answer by Dennis C
Split using the following regex pattern:
string[] items = System.Text.RegularExpressions.Regex.Split(text, @";(?=\w+:)");
EDIT: a better one, which accepts more special characters in the protocol name:
string[] items = System.Text.RegularExpressions.Regex.Split(text, @";(?=[^;:]+:)");
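The lookahead split can be tried on both orderings from the question with a short sketch like the one below. The class name, method name and e-mail addresses are hypothetical; the pattern is the second one from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadSplitDemo
{
    // Splits only on a ';' that is immediately followed by "name:",
    // where name contains no ';' or ':' characters. The lookahead does
    // not consume the name, so it stays at the start of the next item.
    public static string[] SplitAddresses(string text)
    {
        return Regex.Split(text, @";(?=[^;:]+:)");
    }

    public static void Main()
    {
        // Hypothetical samples in the two orderings from the question.
        string first = "smtp:john@example.com;SMTP:jane@example.com;"
                     + "X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;";
        string second = "X400:C=US;A= ;P=Test;O=Exchange;S=Hyman;G=Black;;"
                      + "smtp:john@example.com;SMTP:jane@example.com";

        foreach (string item in SplitAddresses(first))
        {
            Console.WriteLine(item);
        }
        foreach (string item in SplitAddresses(second))
        {
            Console.WriteLine(item);
        }
    }
}
```

Because the semicolons inside the X400 address are followed by pairs such as "A= " or "P=Test" rather than by a name and a colon, they are left alone, and the X400 entry keeps its own trailing semicolon.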
Answer by mangokun
Here is another possible solution.
string[] bits = text.Replace(";smtp", "|smtp").Replace(";SMTP", "|SMTP").Replace(";X400", "|X400").Split(new char[] { '|' });
bits[0], bits[1], and bits[2] will then contain the three parts, in the order they appear in your original string.