C#: How to find the string after a specific string/character using regex

Question

Asked by Flo

I am hopeless with regex (c#) so I would appreciate some help:

Basically, I need to parse a text and find the following information inside it:

Sample text:

KeywordB: TextToFind the rest is not relevant but KeywordB: Text ToFindB and then some more text.

I need to find the word(s) after a certain keyword which may end with a “:”.

[UPDATE]

Thanks Andrew and Alan: Sorry for reopening the question, but there is quite an important thing missing in that regex. As I wrote in my last comment, is it possible to have a variable (how many words to look for, depending on the keyword) as part of the regex?

Or: I could have a different regex for each keyword (there will only be a handful). But I still don't know how to have the "words to look for" constant inside the regex.

Answer 1

Accepted answer by Andrew Backer

Let me know if I should delete the old post, but perhaps someone wants to read it.

The way to do a "words to look for" inside the regex is like this:

regex = @"(Key1|Key2|Key3|LastName|FirstName|Etc):"

What you are doing probably isn't worth the effort in a regex, though it can probably be done the way you want (I'm still not 100% clear on the requirements). It involves looking ahead to the next match and stopping at that point.
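
As a rough illustration of that look-ahead idea, here is a minimal sketch. It is not the code from this answer: the names `LookaheadSketch` and `Extract` are made up for illustration, and it assumes that every value simply runs until the next "key:" or the end of the input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaheadSketch
{
    // Hypothetical helper: returns (key, value) pairs where each value runs
    // from just after "key:" up to the next "key:" or the end of the input.
    public static List<KeyValuePair<string, string>> Extract(IEnumerable<string> keywords, string source)
    {
        // Escape the keywords so they match literally, then join them as alternatives.
        string keys = string.Join("|", keywords.Select(k => Regex.Escape(k)).ToArray());

        // (?<value>.*?) is lazy; the lookahead (?=...) stops it at the next keyword or at \z (end of input).
        string pattern = @"(?<key>" + keys + @"):(?<value>.*?)(?=(?:" + keys + @"):|\z)";

        var result = new List<KeyValuePair<string, string>>();
        var options = RegexOptions.IgnoreCase | RegexOptions.Singleline;
        foreach (Match m in Regex.Matches(source, pattern, options))
        {
            result.Add(new KeyValuePair<string, string>(m.Groups["key"].Value, m.Groups["value"].Value));
        }
        return result;
    }

    public static void Main()
    {
        string source = "Key1:Value1Key2: ValueAnd A: To Test Key3:   Something";
        foreach (var pair in Extract(new[] { "Key1", "Key2", "Key3" }, source))
        {
            Console.WriteLine("Key={0}, Value={1}", pair.Key, pair.Value);
        }
    }
}
```

The pattern has to repeat the keyword list inside the lookahead, which is part of why doing this purely in a regex gets awkward as the list grows.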

Here is a re-write as a regex + regular functional code that should do the trick. It doesn't care about spaces, so if you ask for "Key2" like below, it will separate it from the value.

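using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

class Program {
    static void Main() {
        string[] keys = {"Key1", "Key2", "Key3"};
        string source = "Key1:Value1Key2: ValueAnd A: To Test Key3:   Something";
        FindKeys(keys, source);
    }

    private static void FindKeys(IEnumerable<string> keywords, string source) {
        var found = new Dictionary<string, string>(10);
        // Escape each keyword so any regex metacharacters in it are matched literally.
        var keys = string.Join("|", keywords.Select(k => Regex.Escape(k)).ToArray());
        var matches = Regex.Matches(source, @"(?<key>" + keys + "):",
                              RegexOptions.IgnoreCase);

        foreach (Match m in matches) {
            var key = m.Groups["key"].ToString();
            // The value runs from the end of this "key:" match to the start of the next one.
            var start = m.Index + m.Length;
            var nx = m.NextMatch();
            var end = (nx.Success ? nx.Index : source.Length);
            // Indexer instead of Add, so a repeated key overwrites rather than throws.
            found[key] = source.Substring(start, end - start);
        }

        foreach (var n in found) {
            Console.WriteLine("Key={0}, Value={1}", n.Key, n.Value);
        }
    }
}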

And the output from this is:

Key=Key1, Value=Value1
Key=Key2, Value= ValueAnd A: To Test 
Key=Key3, Value=   Something
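
The update in the question also asks about taking a different number of words depending on the keyword. One possible way to do that, sketched here under assumptions, is to build the quantifier from a number stored per keyword. The names `WordCountSketch` and `WordsAfter`, the keyword-to-count mapping, and the sample text are all hypothetical, and "word" is assumed to mean a run of `\w` characters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordCountSketch
{
    // Hypothetical helper: captures exactly wordCount words after "keyword:".
    // Returns an empty string when the keyword (or enough words) is not found.
    public static string WordsAfter(string keyword, int wordCount, string source)
    {
        if (wordCount < 1) throw new ArgumentOutOfRangeException("wordCount");

        // One word, then (wordCount - 1) further words, each preceded by whitespace.
        string pattern = Regex.Escape(keyword)
            + @":\s*(?<value>\w+(?:\s+\w+){" + (wordCount - 1) + "})";

        Match m = Regex.Match(source, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["value"].Value : string.Empty;
    }

    public static void Main()
    {
        // Assumed mapping from keyword to the number of words that follow it.
        var wordsPerKey = new Dictionary<string, int> { { "KeywordA", 1 }, { "KeywordB", 2 } };
        string source = "KeywordA: TextToFind the rest is not relevant but KeywordB: Text ToFindB and then some more text.";

        foreach (var pair in wordsPerKey)
        {
            Console.WriteLine("Key={0}, Value={1}", pair.Key, WordsAfter(pair.Key, pair.Value, source));
        }
    }
}
```

This keeps one small pattern per keyword, which is probably manageable when there are only a handful of keywords.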

Answer 2

Answer by Tiago

/KeywordB\: (\w+)/

This matches the word that comes right after your keyword. As you didn't mention any terminator, I assumed that you wanted only the word next to the keyword.
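
For reference, a minimal sketch of how a pattern like this might be used from C#. .NET regex strings take no surrounding `/.../` delimiters, and the group needs `\w+` (one or more word characters) to capture a whole word, since `\w` on its own matches a single character. The names `NextWordSketch` and `WordAfter` are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NextWordSketch
{
    // Hypothetical helper: returns the first word after "keyword: ",
    // or an empty string if the keyword is not followed by a word.
    public static string WordAfter(string keyword, string text)
    {
        // The keyword is escaped so it is matched literally; group 1 is the word after ": ".
        Match m = Regex.Match(text, Regex.Escape(keyword) + @": (\w+)");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(WordAfter("KeywordB", "KeywordB: TextToFind and then some more text"));
    }
}
```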

Answer 3

Answer by Andrew Backer

The basic regex is this:

var pattern = @"KeywordB:\s*(\w*)";
    \s* = zero or more whitespace characters
    \w* = zero or more word characters (letters, digits, underscore)
    ()  = a capture group, so you can extract the part that matched

var pattern = @"KeywordB:\s*(\w*)";
var test = @"KeywordB: TextToFind";
var match = Regex.Match(test, pattern);
if (match.Success) {
    Console.Write("Value found = {0}", match.Groups[1]);
}

If you have more than one of these on a line, you can use this:

var test = @"KeywordB: TextToFind KeyWordF: MoreText";
var matches = Regex.Matches(test, @"(?:\s*(?<key>\w*):\s?(?<value>\w*))");
foreach (Match f in matches ) {
    Console.WriteLine("Keyword '{0}' = '{1}'", f.Groups["key"], f.Groups["value"]);
}

Also, check out the regex designer here: http://www.radsoftware.com.au/. It is free, and I use it constantly. It works great to prototype expressions. You need to rearrange the UI for basic work, but after that it's easy.

(fyi) The "@" before a string literal means that \ is no longer treated as an escape character, so you can type @"c:\fun.txt" instead of "c:\\fun.txt".
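
A small sketch of that equivalence, using a made-up class name `VerbatimStringSketch`; the regex pattern from this answer is included to show why verbatim strings are handy for patterns.

```csharp
using System;

public static class VerbatimStringSketch
{
    // In a regular string literal each backslash has to be doubled.
    public const string RegularPath = "c:\\fun.txt";
    // In a verbatim string literal the backslash is taken as-is.
    public const string VerbatimPath = @"c:\fun.txt";

    // The same applies to regex patterns, which tend to contain many backslashes.
    public const string RegularPattern = "KeywordB:\\s*(\\w*)";
    public const string VerbatimPattern = @"KeywordB:\s*(\w*)";

    public static void Main()
    {
        Console.WriteLine(RegularPath == VerbatimPath);
        Console.WriteLine(RegularPattern == VerbatimPattern);
    }
}
```

Both comparisons are between two spellings of the same string, so only the source-code form differs.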
