Note: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/32618365/

Time: 2020-11-02 20:30:11  Source: igfitidea

Find global pattern matches

java regex

Asked by user2917629

I have a pattern like this:

String pattern = "(media:\\s\\d+)";

I want to match a substring variation of

"media:" + space/no space + X

...where X is a sequence of digits. The pattern can appear anywhere in the text and can be followed by anything.

Here's the example:

"The Moment of Impact text: Camera captures the deadly explosions and chaos near the marathon's finish line.media: 18962980Video shows runner ... falling as a result of the blast media: 18967421A bystander films the chaos of people positioned in between the two explosions."

For this input, my pattern returns only the first occurrence instead of all of them. Here is the code I'm using:

String pattern = "(media:\\s\\d+)";
Pattern media = Pattern.compile(pattern, Pattern.MULTILINE);
java.util.regex.Matcher m = media.matcher(text);
if (m.find()) {
    logger.info("-- group:" + m.group());
}

Answered by Makoto

This is a case of replacing the if with a while. As long as the matcher isn't reset, each call to Matcher#find resumes where the previous match ended and returns the next match, until it exhausts the string.

You will also need to adjust the regex, since the space may or may not be present. Use the expression \\s?, which matches zero or one whitespace character.

As a general tip, Pattern.MULTILINE only makes sense with anchors (^ and $), and since you don't have any, you can safely remove it. It's not doing any damage as is, but it does make your code less readable.
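
To illustrate the point about Pattern.MULTILINE, here is a minimal sketch. The class name MultilineDemo, the helper count, and the two-line sample text are made up for this illustration and are not part of the original answer. It counts matches with and without the flag, first for the unanchored pattern and then for a variant with a leading ^ anchor.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineDemo {
    // Hypothetical helper: counts how many times the pattern is found in the text.
    public static int count(Pattern p, String text) {
        Matcher m = p.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Made-up sample input with two lines.
        String text = "media: 1\nmedia: 2";

        // No anchors in the pattern: the flag makes no difference.
        System.out.println(count(Pattern.compile("media:\\s?\\d+"), text));                    // 2
        System.out.println(count(Pattern.compile("media:\\s?\\d+", Pattern.MULTILINE), text)); // 2

        // With a ^ anchor: without the flag, ^ matches only at the start of the input.
        System.out.println(count(Pattern.compile("^media:\\s?\\d+"), text));                    // 1
        // With the flag, ^ also matches right after a line terminator.
        System.out.println(count(Pattern.compile("^media:\\s?\\d+", Pattern.MULTILINE), text)); // 2
    }
}
```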

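// \\s? makes the whitespace after the colon optional
String pattern = "media:\\s?\\d+";
Pattern media = Pattern.compile(pattern);
java.util.regex.Matcher m = media.matcher(text);
// each find() call advances to the next match
while (m.find()) {
    logger.info("-- group:" + m.group());
}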
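
For anyone who wants to try the loop outside the original application, the following is a self-contained sketch of the same idea. The class name MediaMatches and the method findAll are hypothetical, the logger is replaced with System.out, and the sample text is a shortened excerpt of the text from the question. It collects every match into a list instead of logging it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MediaMatches {
    // "media:" followed by an optional whitespace character and one or more digits.
    private static final Pattern MEDIA = Pattern.compile("media:\\s?\\d+");

    // Hypothetical helper: returns every match in the input, in order of appearance.
    public static List<String> findAll(String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = MEDIA.matcher(text);
        // find() resumes after the previous match, so the loop visits each occurrence once.
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Shortened excerpt of the sample text from the question.
        String text = "finish line.media: 18962980Video shows runner ... "
                + "the blast media: 18967421A bystander films the chaos";
        System.out.println(findAll(text)); // [media: 18962980, media: 18967421]
    }
}
```

Compiling the pattern once into a static field avoids recompiling it on every call, which is a common choice when the same expression is reused.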

Answered by james jelo4kul

It didn't repeat or loop because you used an if statement, which calls find() only once. For it to work, change your if statement to while.

while(m.find()) {
    logger.info("-- group:"+m.group());     
}

Use this modification to your regex pattern:

String pattern = "(media:\\s?\\d+)";

The reason for the change, i.e. \\s?, is to let you match the pattern even if there is no space there. Hope this helps!
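
As a quick check of that change, here is a small sketch comparing the two patterns. The class name OptionalSpaceDemo and the constants STRICT and OPTIONAL are made up for this illustration. Note that \\s matches any whitespace character (space, tab, newline), not only a literal space.

```java
import java.util.regex.Pattern;

public class OptionalSpaceDemo {
    // Original pattern: requires exactly one whitespace character after the colon.
    public static final Pattern STRICT = Pattern.compile("media:\\s\\d+");
    // Modified pattern: the whitespace character is optional.
    public static final Pattern OPTIONAL = Pattern.compile("media:\\s?\\d+");

    public static void main(String[] args) {
        System.out.println(STRICT.matcher("media:18962980").find());    // false
        System.out.println(OPTIONAL.matcher("media:18962980").find());  // true
        System.out.println(OPTIONAL.matcher("media: 18962980").find()); // true
    }
}
```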
