C#: Extract all strings between two strings
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/13780654/
Extract all strings between two strings
Asked by Anass
I'm trying to develop a method that will match all strings between two strings:
I've tried this but it returns only the first match:
string ExtractString(string s, string start, string end)
{
    // You should check for errors in real-world code, omitted for brevity
    int startIndex = s.IndexOf(start) + start.Length;
    int endIndex = s.IndexOf(end, startIndex);
    return s.Substring(startIndex, endIndex - startIndex);
}
Let's suppose we have this string:
String Text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
I would like a C# function that does the following:
public List<string> ExtractFromString(String Text,String Start, String End)
{
    List<string> Matched = new List<string>();
    .
    .
    .
    return Matched; 
}
// Example of use
ExtractFromString("A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2", "A1", "A2");
// Will return:
// FIRSTSTRING
// SECONDSTRING
// THIRDSTRING
Thank you for your help!
Accepted answer by Flavia Obreja
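    // Requires: using System; using System.Collections.Generic;
    private static List<string> ExtractFromBody(string body, string start, string end)
    {
        List<string> matched = new List<string>();
        int searchFrom = 0;
        while (true)
        {
            // Find the next start marker; stop when there are no more.
            int indexStart = body.IndexOf(start, searchFrom, StringComparison.Ordinal);
            if (indexStart == -1)
                break;
            // Look for the end marker only after the start marker.
            int contentStart = indexStart + start.Length;
            int indexEnd = body.IndexOf(end, contentStart, StringComparison.Ordinal);
            if (indexEnd == -1)
                break; // Unmatched start marker: nothing more to extract.
            matched.Add(body.Substring(contentStart, indexEnd - contentStart));
            // Continue searching after the end marker instead of copying the rest of the string.
            searchFrom = indexEnd + end.Length;
        }
        return matched;
    }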
Answer by Macros
You can split the string into an array on the start identifier, as in the following code:
String str = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
String[] arr = str.Split(new[] { "A1" }, StringSplitOptions.None);
Then iterate through the array and remove the last 2 characters of each string (to remove the A2; this assumes every piece ends with A2). You'll also need to discard the first array element, as it will be empty assuming the string starts with A1.
Code is untested; I'm currently on mobile.
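The split-and-trim idea above can be sketched as follows. This is only a sketch: the class and method names (SplitExtraction, ExtractBySplit) are hypothetical and not part of the original answer. Instead of removing the last two characters, it cuts each piece at the first end marker, because a piece such as "SECONDSTRINGA2akslakhflkshdflhksdf" does not end with A2.

```csharp
using System;
using System.Collections.Generic;

public static class SplitExtraction
{
    // Hypothetical helper: split on the start marker, then cut each piece at the end marker.
    public static List<string> ExtractBySplit(string text, string start, string end)
    {
        var results = new List<string>();
        string[] pieces = text.Split(new[] { start }, StringSplitOptions.None);

        // pieces[0] is whatever precedes the first start marker, so skip it.
        for (int i = 1; i < pieces.Length; i++)
        {
            int endIndex = pieces[i].IndexOf(end, StringComparison.Ordinal);
            if (endIndex >= 0)
            {
                // Keep only the text before the end marker; anything after it is ignored.
                results.Add(pieces[i].Substring(0, endIndex));
            }
        }
        return results;
    }
}
```

Calling SplitExtraction.ExtractBySplit(str, "A1", "A2") on the sample string should give FIRSTSTRING, SECONDSTRING and THIRDSTRING, and pieces that never reach an end marker are skipped.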
Answer by Zaid Masud
text.Split(new[] {"A1", "A2"}, StringSplitOptions.RemoveEmptyEntries);
Answer by PeteGO
Here is a solution using Regex. Don't forget to include the following using directive:
using System.Text.RegularExpressions;
It will correctly return only text between the start and end strings given.
Will not be returned:
akslakhflkshdflhksdf
Will be returned:
FIRSTSTRING
SECONDSTRING
THIRDSTRING
It uses the regular expression pattern [start string](.+?)[end string], where the lazy group (.+?) captures the text between the two markers.
The start and end strings are escaped in case they contain regular expression special characters.
    private static List<string> ExtractFromString(string source, string start, string end)
    {
        var results = new List<string>();
        string pattern = string.Format(
            "{0}({1}){2}", 
            Regex.Escape(start), 
            ".+?", 
            Regex.Escape(end));
        foreach (Match m in Regex.Matches(source, pattern))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
You could make that into an extension method of String like this:
public static class StringExtensionMethods
{
    public static List<string> EverythingBetween(this string source, string start, string end)
    {
        var results = new List<string>();
        string pattern = string.Format(
            "{0}({1}){2}",
            Regex.Escape(start),
            ".+?",
            Regex.Escape(end));
        foreach (Match m in Regex.Matches(source, pattern))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
}
Usage:
string source = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
string start = "A1";
string end = "A2";
List<string> results = source.EverythingBetween(start, end);
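To illustrate why the markers are escaped, here is a small sketch with markers that contain regular expression special characters. The class and method names (RegexExtraction, Between) and the sample input are assumptions made for illustration. The sketch also passes RegexOptions.Singleline so that "." matches newlines, which the code above does not do.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexExtraction
{
    // Hypothetical helper: same idea as above, with Singleline enabled.
    public static List<string> Between(string source, string start, string end)
    {
        var results = new List<string>();
        // Escape the markers so characters such as '$' and '{' are matched literally.
        string pattern = Regex.Escape(start) + "(.+?)" + Regex.Escape(end);
        // Singleline lets '.' match newline characters as well.
        foreach (Match m in Regex.Matches(source, pattern, RegexOptions.Singleline))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
}
```

For example, RegexExtraction.Between("Hello ${name}, you have ${count} messages", "${", "}") should return name and count. Because the group is .+?, it needs at least one character between the markers; .*? could be used instead if empty matches should be returned as well.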
Answer by nawfal
This is a generic solution and, I believe, more readable code. It is not tested, so beware.
public static IEnumerable<IList<T>> SplitBy<T>(this IEnumerable<T> source, 
                                               Func<T, bool> startPredicate,
                                               Func<T, bool> endPredicate, 
                                               bool includeDelimiter)
{
    var l = new List<T>();
    foreach (var s in source)
    {
        if (startPredicate(s))
        {
            if (l.Any())
            {
                l = new List<T>();
            }
            l.Add(s);
        }
        else if (l.Any())
        {
            l.Add(s);
        }
        // Ignore an end element that was not preceded by a start element.
        if (endPredicate(s) && l.Any())
        {
            if (includeDelimiter)
                yield return l;
            else
                yield return l.GetRange(1, l.Count - 2);
            l = new List<T>();
        }
    }
}
In your case you can call:
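var text = "A1FIRSTSTRINGA2A1SECONDSTRINGA2akslakhflkshdflhksdfA1THIRDSTRINGA2";
// A string enumerates as chars, so split it into string tokens first (needs System.Text.RegularExpressions).
var tokens = Regex.Split(text, "(A1|A2)");
var splits = tokens.SplitBy(x => x == "A1", x => x == "A2", false);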
This is not the most efficient approach when you do not want the delimiters included in the result (as in your case), but it is efficient in the opposite case. To speed up your case, you can call GetEnumerator directly and use MoveNext.
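A rough sketch of the enumerator-based variant mentioned above might look like this. The names (BetweenExtensions, Between) are hypothetical, and the behaviour for nested or unclosed groups is an assumption rather than something the answer specifies. Delimiters are never added to the group, so no GetRange copy is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BetweenExtensions
{
    // Hypothetical variant: yields the elements strictly between a start and an end element.
    public static IEnumerable<List<T>> Between<T>(this IEnumerable<T> source,
                                                  Func<T, bool> isStart,
                                                  Func<T, bool> isEnd)
    {
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                if (!isStart(e.Current))
                    continue; // Skip everything outside a group.

                var group = new List<T>();
                bool closed = false;
                while (e.MoveNext())
                {
                    if (isEnd(e.Current)) { closed = true; break; }
                    // Assumption: a second start element restarts the group.
                    if (isStart(e.Current)) { group.Clear(); continue; }
                    group.Add(e.Current);
                }
                if (closed)
                    yield return group; // An unclosed trailing group is dropped.
            }
        }
    }
}
```

Since a string enumerates as characters, the text is assumed to be tokenized first, for example with Regex.Split(text, "(A1|A2)"), which keeps the delimiters as separate tokens. Then tokens.Between(t => t == "A1", t => t == "A2") should yield one list per group, each holding the tokens between A1 and A2.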

