C#: Converting a MatchCollection to a string array
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/11416191/
Converting a MatchCollection to string array
Asked by Vil
Is there a better way than this to convert a MatchCollection to a string array?
MatchCollection mc = Regex.Matches(strText, @"\b[A-Za-z-']+\b");
string[] strArray = new string[mc.Count];
for (int i = 0; i < mc.Count; i++)
{
    strArray[i] = mc[i].Groups[0].Value;
}
P.S.: mc.CopyTo(strArray, 0) throws an exception:
At least one element in the source array could not be cast down to the destination array type.
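The message points at the cause: the collection holds Match objects, and those cannot be stored in a string[]. A minimal sketch of one way around it, assuming a Match[] is used as the copy target and each element is then converted to its text (the class and method names here are hypothetical, chosen only for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class CopyToSketch
{
    // Hypothetical helper: copies the matches into a Match[] first,
    // then converts each element to its matched text.
    public static string[] ToStringArray(string text)
    {
        MatchCollection mc = Regex.Matches(text, @"\b[A-Za-z-']+\b");
        var matches = new Match[mc.Count];
        mc.CopyTo(matches, 0); // destination element type is Match, so the copy succeeds
        return Array.ConvertAll(matches, m => m.Value);
    }

    public static void Main()
    {
        // prints: it's|a|well-known|fact
        Console.WriteLine(string.Join("|", ToStringArray("it's a well-known fact")));
    }
}
```

This still needs a second pass over the array, so it is not obviously better than the loop in the question; it only shows why the direct copy into string[] fails.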
Accepted answer by Dave Bish
Try:
var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();
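The question reads each match through Groups[0].Value, while this answer uses m.Value. Group 0 is the whole match, so the two give the same text. A small sketch that checks this on a sample string (the class and method names are hypothetical):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ValueVsGroupZero
{
    // Returns true when every match reports the same text through
    // m.Value and m.Groups[0].Value.
    public static bool SameText(string text)
    {
        return Regex.Matches(text, @"\b[A-Za-z-']+\b")
            .Cast<Match>()
            .All(m => m.Value == m.Groups[0].Value);
    }

    public static void Main()
    {
        // prints: True
        Console.WriteLine(SameText("it's a well-known fact"));
    }
}
```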
Answer by Alex
Dave Bish's answer is good and works properly.
It's worth noting, though, that replacing Cast<Match>() with OfType<Match>() will speed things up.
The code would become:
var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
    .OfType<Match>()
    .Select(m => m.Groups[0].Value)
    .ToArray();
The result is exactly the same (and it addresses the OP's issue in exactly the same way), but for huge strings it's faster.
Test code:
// put it in a console application; it needs these namespaces:
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

static void Test()
{
    Stopwatch sw = new Stopwatch();
    StringBuilder sb = new StringBuilder();
    string strText = "this will become a very long string after my code has done appending it to the stringbuilder ";
    // build a very long input by repeating the sentence 100,000 times
    Enumerable.Range(1, 100000).ToList().ForEach(i => sb.Append(strText));
    strText = sb.ToString();

    // time the OfType<Match>() version
    sw.Start();
    var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
        .OfType<Match>()
        .Select(m => m.Groups[0].Value)
        .ToArray();
    sw.Stop();
    Console.WriteLine("OfType: " + sw.ElapsedMilliseconds.ToString());

    // time the Cast<Match>() version
    sw.Reset();
    sw.Start();
    var arr2 = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
        .Cast<Match>()
        .Select(m => m.Groups[0].Value)
        .ToArray();
    sw.Stop();
    Console.WriteLine("Cast: " + sw.ElapsedMilliseconds.ToString());
}
Output follows:
OfType: 6540
Cast: 8743
For very long strings, Cast() is therefore slower.
Answer by gpmurthy
Consider the following code:
var emailAddress = "[email protected]; [email protected]; [email protected]";
// assign the result directly; a separate empty list is not needed
List<string> emails = Regex.Matches(emailAddress, @"([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})")
    .Cast<Match>()
    .Select(m => m.Groups[0].Value)
    .ToList();
Good Luck!
Answer by David DeMar
I ran the exact same benchmark that Alex posted and found that sometimes Cast was faster and sometimes OfType was faster, but the difference between the two was negligible. However, while ugly, the for loop was consistently faster than both.
Stopwatch sw = new Stopwatch();
StringBuilder sb = new StringBuilder();
string strText = "this will become a very long string after my code has done appending it to the stringbuilder ";
Enumerable.Range(1, 100000).ToList().ForEach(i => sb.Append(strText));
strText = sb.ToString();
//First two benchmarks
sw.Start();
MatchCollection mc = Regex.Matches(strText, @"\b[A-Za-z-']+\b");
var matches = new string[mc.Count];
for (int i = 0; i < matches.Length; i++)
{
    matches[i] = mc[i].ToString();
}
sw.Stop();
Results:
OfType: 3462
Cast: 3499
For: 2650
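The two sets of numbers in this thread disagree, which is not surprising: single-run Stopwatch timings are sensitive to JIT compilation, garbage collection, and the order in which the variants run. Below is a rough sketch of a slightly more careful comparison, assuming one untimed warm-up call and the best of several timed runs. BestOf and the other names are hypothetical helpers, not part of any answer above, and the input is shorter than in the original tests to keep the run quick.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

public static class ConversionBenchmark
{
    const string Pattern = @"\b[A-Za-z-']+\b";

    public static string[] WithCast(string text)
    {
        return Regex.Matches(text, Pattern).Cast<Match>().Select(m => m.Value).ToArray();
    }

    public static string[] WithOfType(string text)
    {
        return Regex.Matches(text, Pattern).OfType<Match>().Select(m => m.Value).ToArray();
    }

    public static string[] WithFor(string text)
    {
        MatchCollection mc = Regex.Matches(text, Pattern);
        var result = new string[mc.Count];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = mc[i].Value;
        }
        return result;
    }

    // Hypothetical helper: one untimed warm-up call, then the fastest
    // of `runs` timed calls, in milliseconds.
    public static long BestOf(int runs, Func<string, string[]> convert, string text)
    {
        convert(text); // warm-up, not measured
        long best = long.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            convert(text);
            sw.Stop();
            best = Math.Min(best, sw.ElapsedMilliseconds);
        }
        return best;
    }

    public static void Main()
    {
        string sentence = "this will become a very long string after my code has done appending it to the stringbuilder ";
        string text = string.Concat(Enumerable.Repeat(sentence, 10000));

        Console.WriteLine("OfType: " + BestOf(5, WithOfType, text));
        Console.WriteLine("Cast: " + BestOf(5, WithCast, text));
        Console.WriteLine("For: " + BestOf(5, WithFor, text));
    }
}
```

Even with these precautions the numbers depend on the runtime and machine, so treat them as a rough guide rather than a ranking.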
Answer by Nicholas Petersen
One could also use this extension method to deal with the annoyance of MatchCollection not being generic. Not that it's a big deal, but this is almost certainly more performant than OfType or Cast, because it just enumerates, which both of those also have to do.
(Side note: I wonder whether the .NET team could make MatchCollection implement the generic versions of ICollection and IEnumerable in the future. Then we wouldn't need this extra step to have LINQ transforms immediately available.)
public static IEnumerable<Match> ToEnumerable(this MatchCollection mc)
{
    if (mc != null) {
        foreach (Match m in mc)
            yield return m;
    }
}
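To show how the extension method would be called, here is a minimal sketch. An extension method has to live in a static class, so the wrapper class MatchCollectionExtensions and the Words helper are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchCollectionExtensions
{
    // Same method as in the answer, placed in a static class so it compiles.
    public static IEnumerable<Match> ToEnumerable(this MatchCollection mc)
    {
        if (mc != null)
        {
            foreach (Match m in mc)
                yield return m;
        }
    }
}

public static class ExtensionUsage
{
    // Hypothetical helper showing the call site: no Cast or OfType needed.
    public static string[] Words(string text)
    {
        return Regex.Matches(text, @"\b[A-Za-z-']+\b")
            .ToEnumerable()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        // prints: it's|a|well-known|fact
        Console.WriteLine(string.Join("|", Words("it's a well-known fact")));
    }
}
```

As far as I know, on .NET Core 2.0 and later MatchCollection implements IEnumerable<Match> itself, so there Select can be called on it directly and this helper is mainly useful on .NET Framework.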

