Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4285013/

Date: 2020-10-30 05:41:41  Source: igfitidea

Is there a way to capture each group if multiple occurrences are matched?

java, regex

Asked by Roman

I don't know how to explain the problem in plain English, so I'll use a regexp example. I have something similar to this (the example is heavily simplified):

((\d+) - (\d+)\n)+

This pattern matches these lines at once:

123 - 23
32 - 321
3 - 0
99 - 55

The pattern contains 3 groups: the first one matches a line, the 2nd one matches the first number in the line, and the 3rd one matches the second number in the line.

Is it possible to get all those numbers? Matcher has only 3 groups. The first one returns 99 - 55, the 2nd one returns 99, and the 3rd one returns 55.

SSCCE:

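import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test {
    // The outer (...)+ group repeats; each group keeps only its last capture.
    private static final Pattern pattern = Pattern.compile("((\\d+) - (\\d+)\n)+");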

    public static void parseInput(String input) {

        Matcher matcher = pattern.matcher(input);

        if (matcher.matches()) {

            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println("------------");
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
            System.out.println();
        }

    }

    public static void main(String[] args) {
        parseInput("123 - 23\n32 - 321\n3 - 0\n99 - 55\n");
    }
}

Answered by marczoid

One more remark about Mike Caron's answer: the program will not work if you simply replace "if" with "while" and use "find" instead of "matches". You should also change the regular expression: the outer group with the "+" should be removed, because you want to search for multiple occurrences of this pattern, not for one occurrence of a (..)+ group.

For clarity, this is the final program that works:

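import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test {
    // One line per match: no outer (...)+ group, so find() can visit each line.
    private static final Pattern pattern = Pattern.compile("(\\d+) - (\\d+)\n");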

    public static void parseInput(String input) {

        Matcher matcher = pattern.matcher(input);

        while (matcher.find()) {

            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println("------------");
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
            System.out.println();
        }
    }

    public static void main(String[] args) {
        parseInput("123 - 23\n32 - 321\n3 - 0\n99 - 55\n");
    }
}

It will give you three groups for each line: group 0 is the entire line, and groups 1 and 2 each contain a number. This is a good tutorial that helped me to understand it better: http://tutorials.jenkov.com/java-regex/matcher.html
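
If you want to keep the numbers rather than print them, the same find() loop can fill a list. The following is only a minimal sketch under that assumption: the class name PairCollector and the method collect are made up for illustration, and Integer.parseInt assumes every number fits in an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairCollector {
    // Same single-line pattern as in the program above (no outer (...)+ group).
    private static final Pattern LINE = Pattern.compile("(\\d+) - (\\d+)\n");

    // Hypothetical helper: returns one {first, second} array per matched line.
    static List<int[]> collect(String input) {
        List<int[]> pairs = new ArrayList<>();
        Matcher matcher = LINE.matcher(input);
        // Each call to find() advances to the next line that matches.
        while (matcher.find()) {
            pairs.add(new int[] {
                Integer.parseInt(matcher.group(1)),
                Integer.parseInt(matcher.group(2))
            });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (int[] pair : collect("123 - 23\n32 - 321\n3 - 0\n99 - 55\n")) {
            System.out.println(pair[0] + " and " + pair[1]);
        }
    }
}
```

With the input from the question, collect should return four pairs, one per line.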

Answered by Mike Caron

If I'm not mistaken (a distinct possibility), then every time you call matcher.matches(), it updates with the next match. So, basically, change the if (matcher.matches()) into a while (matcher.find()), and you're ready to go.

EDIT: Actually, it's not matches, it's find that does this:

http://download.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#find%28%29

Here's an example of using it:

http://download.oracle.com/javase/tutorial/essential/regex/test_harness.html
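
To make the difference between the two calls concrete, here is a small sketch, not taken from the linked pages. It assumes a single-line pattern without the outer (...)+ group; the class and method names are hypothetical. matches() only succeeds if the entire input matches the pattern, while find() looks for the next matching subsequence each time it is called.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // Assumed single-line pattern (the question's pattern without the outer group).
    private static final Pattern LINE = Pattern.compile("(\\d+) - (\\d+)\n");

    // True only when the whole input is exactly one "a - b\n" line.
    static boolean matchesWhole(String input) {
        return LINE.matcher(input).matches();
    }

    // Counts how many times find() succeeds while scanning the input.
    static int countFinds(String input) {
        Matcher matcher = LINE.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "123 - 23\n32 - 321\n3 - 0\n99 - 55\n";
        System.out.println("matches(): " + matchesWhole(input));
        System.out.println("find() count: " + countFinds(input));
    }
}
```

For the four-line input, matches() is false because one line's pattern cannot cover all four lines, whereas find() succeeds once per line.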

Answered by Mrki

You're trying to match each line separately?

Remove the + to match only one line and change:

   if (matcher.matches()) {

to:

   while (matcher.find()) {

and it will loop once for each match and automatically skip any unmatched text between the matches.

Note that matcher.group(0) returns the whole match. Actual groups start with 1.
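
As a rough illustration of both points (group numbering, and skipping text that does not match), the sketch below runs a find() loop over input containing a stray line. The class name, the method describe, and the "noise" line are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupZeroDemo {
    // Single-line pattern with the + removed, as suggested above.
    private static final Pattern LINE = Pattern.compile("(\\d+) - (\\d+)\n");

    // Formats each match as "whole|first|second"; group 0's trailing newline is trimmed.
    static List<String> describe(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = LINE.matcher(input);
        while (m.find()) {
            out.add(m.group(0).trim() + "|" + m.group(1) + "|" + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // The "noise" line does not match and is skipped by find().
        for (String s : describe("123 - 23\nnoise\n32 - 321\n")) {
            System.out.println(s);
        }
    }
}
```

Here group(0) is the whole matched line, while group(1) and group(2) are the two numbers; the line in between produces no match at all.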
