Original URL: http://stackoverflow.com/questions/638297/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Regex to get all possible matches for a pattern in C#
Asked by Archie
I'm learning regex and need to get all possible matches for a pattern out of a string.
If my input is:
case a
when cond1
then stmt1;
when cond2
then stmt2;
end case;
I need to get matches whose groups are as follows:
Group1:
"cond1"
"stmt1;"
and Group2:
"cond2"
"stmt2;"
Is it possible to get such groups using any regex?
Accepted answer by rslite
It's possible to use regex for this, provided that you don't nest your statements. For example, if your stmt1 is another case statement, then all bets are off (you can't use regex for something like that; you need a regular parser).
Edit: If you really want to try it, you can do it with something like this (not tested, but you get the idea):
Regex t = new Regex(@"when\s+(.*?)\s+then\s+(.*?;)", RegexOptions.Singleline);
MatchCollection allMatches = t.Matches(input_string);
But as I said, this will work only for non-nested statements.
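To make the shape of the result concrete, here is a minimal sketch that runs that pattern over the sample input from the question. The class name `WhenThenDemo` and the helper `Extract` are made up for illustration, and the indentation in the sample string is an assumption about how the input is laid out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WhenThenDemo
{
    // Pattern from the answer: group 1 = condition, group 2 = statement up to the first ';'.
    private static readonly Regex WhenThen =
        new Regex(@"when\s+(.*?)\s+then\s+(.*?;)", RegexOptions.Singleline);

    // Hypothetical helper: returns one (condition, statement) pair per match.
    public static List<(string Condition, string Statement)> Extract(string input)
    {
        var pairs = new List<(string Condition, string Statement)>();
        foreach (Match m in WhenThen.Matches(input))
        {
            pairs.Add((m.Groups[1].Value, m.Groups[2].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        string input =
            "case a\n  when cond1\n  then stmt1;\n  when cond2\n  then stmt2;\nend case;";

        // Each when/then branch comes back as its own Match, not as one match with many groups.
        foreach (var (condition, statement) in Extract(input))
        {
            Console.WriteLine($"{condition} -> {statement}");
        }
    }
}
```

Each branch is a separate `Match`, so the "Group1 / Group2" layout from the question corresponds to match 1 and match 2 here, each with its own groups 1 and 2.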
Edit 2: Changed the regex a little to include the semicolon in the last group. This will not work exactly as you wanted: instead it gives you multiple matches, each representing one when condition, with the condition in the first group and the statement in the second.
I don't think you can build a regex that does exactly what you want, but this should be close enough (I hope).
Edit 3: New regex - should handle multiple statements
Regex t = new Regex(@"when\s+(.*?)\s+then\s+(.*?)(?=(when|end))", RegexOptions.Singleline);
It contains a positive lookahead so that the second group matches from 'then' to the next 'when' or 'end'. In my test it worked with this:
case a
when cond1
then stmt1;
stm1;
stm2;stm3
when cond2
then stmt2;
aaa;
bbb;
end case;
It's case sensitive for now, so if you need case insensitivity you need to add the corresponding regex flag.
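As a rough sketch of that last point, the version below adds `RegexOptions.IgnoreCase` to the lookahead pattern and runs it on an upper-case variant of the test input. The class name `MultiStatementDemo`, the helper `Extract`, and the `Trim()` call are my own additions, not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MultiStatementDemo
{
    // Same pattern as "Edit 3", with IgnoreCase added so WHEN/THEN/END also match.
    private static readonly Regex WhenThen = new Regex(
        @"when\s+(.*?)\s+then\s+(.*?)(?=(when|end))",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Hypothetical helper: one (condition, statements) pair per branch.
    public static List<(string Condition, string Statements)> Extract(string input)
    {
        var pairs = new List<(string Condition, string Statements)>();
        foreach (Match m in WhenThen.Matches(input))
        {
            // Group 2 runs right up to the lookahead, so it ends in whitespace; Trim() drops it.
            pairs.Add((m.Groups[1].Value, m.Groups[2].Value.Trim()));
        }
        return pairs;
    }

    public static void Main()
    {
        string input =
            "CASE a\n" +
            "  WHEN cond1\n" +
            "  THEN stmt1;\n" +
            "    stm1;\n" +
            "  WHEN cond2\n" +
            "  THEN stmt2;\n" +
            "END CASE;";

        foreach (var (condition, statements) in Extract(input))
        {
            Console.WriteLine($"[{condition}] [{statements}]");
        }
    }
}
```

One thing to watch: the lookahead is not anchored to word boundaries, so a statement that contains `end` or `when` inside an identifier (for example `append`) would cut the second group short. Writing the lookahead as `(?=\b(?:when|end)\b)` is one way to tighten that.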
Answer by Spoike
If this were written in Java, I would write two patterns for the parser: one to match the case statements and one to match the when-then branches. Here is how the latter could be written:
CharSequence buffer = inputString.subSequence(0, inputString.length());
// inputString is the string you get after matching the case statements...
// Backslashes are doubled for the Java string literal; \s+ also spans the
// line break between the "when" line and the "then" line.
Pattern pattern = Pattern.compile(
    "when\\s+(\\S+)\\s+"
    + "then\\s+(\\S+)");
Matcher matcher = pattern.matcher(buffer);
while (matcher.find()) {
    DoWhenThen(matcher.group(1), matcher.group(2));
}
Note: I haven't tested this code as I'm not 100% sure on the pattern... but I'd be tinkering around this.
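Since the question is about C#, here is a sketch of the same two-pattern idea in .NET. The outer pattern for the case block is my own guess (the answer only gives the when-then one), it assumes case statements are not nested, and the names `TwoPassDemo` and `Extract` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TwoPassDemo
{
    // Pass 1 (assumed pattern): isolate each non-nested "case X ... end case;" block.
    private static readonly Regex CaseBlock =
        new Regex(@"case\s+(\S+)(.*?)end\s+case;", RegexOptions.Singleline);

    // Pass 2: pull the when/then pairs out of one block body.
    private static readonly Regex WhenThen =
        new Regex(@"when\s+(\S+)\s+then\s+(\S+)");

    // Hypothetical helper: one "subject: condition => statement" line per branch.
    public static List<string> Extract(string input)
    {
        var lines = new List<string>();
        foreach (Match block in CaseBlock.Matches(input))
        {
            string subject = block.Groups[1].Value;
            string body = block.Groups[2].Value;
            foreach (Match branch in WhenThen.Matches(body))
            {
                lines.Add($"{subject}: {branch.Groups[1].Value} => {branch.Groups[2].Value}");
            }
        }
        return lines;
    }

    public static void Main()
    {
        string input =
            "case a\n  when cond1\n  then stmt1;\n  when cond2\n  then stmt2;\nend case;";

        foreach (string line in Extract(input))
        {
            Console.WriteLine(line);
        }
    }
}
```

Splitting the work this way keeps each pattern simple, at the cost of only capturing single-token conditions and statements because of `\S+`.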
Answer by Spoike
I don't think this is possible, primarily because any group that matches when...then... is going to match all of them, creating multiple captures within the same group.
I'd suggest using this regex:
(?:when(.*)\nthen(.*)\n)+?
which results in:
Match 1:
* Group 1: cond1
* Group 2: stmt1;
Match 2:
* Group 1: cond2
* Group 2: stmt2;
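For what it's worth, the "multiple captures within the same group" case mentioned at the top of this answer is something .NET does expose: a group that sits inside a repeated construct keeps every capture in `Group.Captures`. The sketch below uses a greedy variant of the pattern (my own, with `\s+` and `\S+` instead of `.*` and `\n`) so that a single match covers all branches; the names `CapturesDemo` and `Extract` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CapturesDemo
{
    // One match spans all branches; each repeated group records every capture it made.
    private static readonly Regex AllBranches =
        new Regex(@"(?:when\s+(\S+)\s+then\s+(\S+)\s*)+");

    // Hypothetical helper: pairs up the i-th capture of group 1 with the i-th of group 2.
    public static List<(string Condition, string Statement)> Extract(string input)
    {
        var pairs = new List<(string Condition, string Statement)>();
        Match m = AllBranches.Match(input);
        if (!m.Success)
        {
            return pairs;
        }

        CaptureCollection conditions = m.Groups[1].Captures;
        CaptureCollection statements = m.Groups[2].Captures;
        for (int i = 0; i < conditions.Count; i++)
        {
            pairs.Add((conditions[i].Value, statements[i].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        string input =
            "case a\n  when cond1\n  then stmt1;\n  when cond2\n  then stmt2;\nend case;";

        foreach (var (condition, statement) in Extract(input))
        {
            Console.WriteLine($"{condition} / {statement}");
        }
    }
}
```

Note that `m.Groups[1].Value` on its own only returns the last capture (`cond2` here), which is why the loop goes through `Captures` instead.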