Original question: http://stackoverflow.com/questions/4285013/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Is there a way to capture each group if multiple occurrences are matched?
Asked by Roman
I don't know how to explain the problem in plain English, so I'll use a regexp example. I have something similar to this (the example is heavily simplified):
((\d+) - (\d+)\n)+
This pattern matches these lines at once:
123 - 23
32 - 321
3 - 0
99 - 55
The pattern contains 3 groups: the first one matches a line, the 2nd one matches the first number in the line, and the 3rd one matches the second number in the line.
Is there a way to get all of those numbers? The Matcher has only 3 groups: the first one returns 99 - 55, the 2nd one returns 99, and the 3rd one returns 55.
SSCCE:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test {
    // Backslashes must be doubled inside a Java string literal ("\\d", not "\d").
    private static final Pattern pattern = Pattern.compile("((\\d+) - (\\d+)\n)+");

    public static void parseInput(String input) {
        Matcher matcher = pattern.matcher(input);
        if (matcher.matches()) {
            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println("------------");
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
            System.out.println();
        }
    }

    public static void main(String[] args) {
        parseInput("123 - 23\n32 - 321\n3 - 0\n99 - 55\n");
    }
}
Answered by marczoid
One more remark about Mike Caron's answer: the program will not work if you simply replace "if" with "while" and use "find" instead of "matches". You should also change the regular expression: the outer group with the "+" should be removed, because you want to search for multiple occurrences of this pattern, not for one occurrence of a (..)+ group.
For clarity, this is the final program that works:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test {
    // One line per match; the outer (...)+ group has been removed.
    private static final Pattern pattern = Pattern.compile("(\\d+) - (\\d+)\n");

    public static void parseInput(String input) {
        Matcher matcher = pattern.matcher(input);
        // find() advances to the next match on every call.
        while (matcher.find()) {
            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println("------------");
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
            System.out.println();
        }
    }

    public static void main(String[] args) {
        parseInput("123 - 23\n32 - 321\n3 - 0\n99 - 55\n");
    }
}
For each line it gives you three groups: group 0 is the entire line, and groups 1 and 2 each contain a number. This is a good tutorial that helped me understand it better: http://tutorials.jenkov.com/java-regex/matcher.html
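If you want the numbers themselves rather than printed groups, the same find() loop can collect them. This is only a sketch: the class name PairCollector and the method collectPairs are made up for illustration, and it assumes every number fits in an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class PairCollector {
    // Same per-line pattern as above: two numbers separated by " - ".
    private static final Pattern LINE = Pattern.compile("(\\d+) - (\\d+)\n");

    // Returns one int[]{first, second} per matched line, in input order.
    static List<int[]> collectPairs(String input) {
        List<int[]> pairs = new ArrayList<>();
        Matcher matcher = LINE.matcher(input);
        while (matcher.find()) {
            // Groups 1 and 2 hold the numbers of the current match only.
            pairs.add(new int[] {
                Integer.parseInt(matcher.group(1)),
                Integer.parseInt(matcher.group(2))
            });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (int[] pair : collectPairs("123 - 23\n32 - 321\n3 - 0\n99 - 55\n")) {
            System.out.println(pair[0] + " and " + pair[1]);
        }
    }
}
```

Each call to find() overwrites the previous groups, so anything you need later has to be copied out inside the loop, as done here.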
Answered by Mike Caron
If I'm not mistaken (a distinct possibility), then every time you call matcher.matches(), it updates with the next match. So, basically, change the if (matcher.matches()) into a while (matcher.find()), and you're ready to go.
EDIT: Actually, it's not matches, it's find that does this:
http://download.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#find%28%29
Here's an example of using it:
http://download.oracle.com/javase/tutorial/essential/regex/test_harness.html
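To make the difference between the two methods concrete, here is a small sketch. The class MatchesVsFind and its two helper methods are invented for this illustration; only Pattern, Matcher, matches() and find() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class MatchesVsFind {
    // Counts how many times the pattern is found when scanning with find().
    static int countWithFind(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // True only if the whole input, from start to end, matches the pattern.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "123 - 23\n32 - 321\n";
        String oneLine = "(\\d+) - (\\d+)\n";
        // One-line pattern against two lines: matches() fails.
        System.out.println(matchesWhole(oneLine, input));             // false
        // Wrapped in (...)+ the pattern covers the whole input.
        System.out.println(matchesWhole("(" + oneLine + ")+", input)); // true
        // find() with the one-line pattern visits each line in turn.
        System.out.println(countWithFind(oneLine, input));            // 2
    }
}
```

matches() is a single all-or-nothing test on the entire input, so calling it repeatedly never advances; find() is the one that moves on to the next occurrence.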
Answered by Mrki
You're trying to match each line separately?
Remove the + to match only one line and change:
if (matcher.matches()) {
to:
while (matcher.find()) {
and it will loop once for each match and automatically skip any unmatched text between the matches.
Note that matcher.group(0) returns the whole match. The actual capturing groups start at 1.
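A quick sketch of both points, with a non-matching line thrown into the input. The class SkipUnmatched and the method wholeMatches are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class SkipUnmatched {
    private static final Pattern LINE = Pattern.compile("(\\d+) - (\\d+)\n");

    // Returns group(0) of every match, with the trailing newline trimmed.
    static List<String> wholeMatches(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = LINE.matcher(input);
        while (matcher.find()) {
            // group(0) is the whole match; group(1) and group(2) are the numbers.
            result.add(matcher.group(0).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // The middle line does not fit the pattern and is skipped by find().
        System.out.println(wholeMatches("123 - 23\nnot a pair\n3 - 0\n")); // [123 - 23, 3 - 0]
        // groupCount() does not count group 0, so this pattern reports 2.
        System.out.println(LINE.matcher("").groupCount());                 // 2
    }
}
```

Because groupCount() excludes group 0, a loop that should also print the whole match has to run from 0 up to and including groupCount(), as the programs above do.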