Java: merging two regular expressions

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/12858116/

Time: 2020-10-31 10:30:56  Source: igfitidea

merge two regular expressions

java regex

Asked by KristianMedK

I have two regular expressions, one pulling out usernames from a csv string, and the other pulling out emails.

The string format is like this:

String s = "name lastname (username) <[email protected]>; name lastname (username) <[email protected]>; name lastname (username) <[email protected]>";

The code for my regular expressions is like this:

Pattern pattern = Pattern.compile("(?<=\\()[^\\)]+");
Matcher matcher = pattern.matcher(s);
Pattern pattern2 = Pattern.compile("((?<=<)[^>]+)");
Matcher matcher2 = pattern2.matcher(s);

while (matcher.find() && matcher2.find()) {
    System.out.println(matcher.group() + " " + matcher2.group());
}

I've found several questions about merging regexes, but from the answers I haven't been able to figure out how to merge mine.

My printouts show:

"username [email protected]"

Would I be able to print out the same from a single matcher, using one regex?

Note: this is a school assignment, which means I do not "need" to merge them or do anything more, but I'd like to know whether it is possible, and how difficult it would be.

Answered by Rohit Jain

You can just put a pipe (|) between your regexes to match either of them:

    String s = "name lastname (username) <[email protected]>; name lastname "
            + "(username) <[email protected]>; name lastname "
            + "(username) <[email protected]>;";

    // Matches (?<=\()[^\)]+  or  ((?<=<)[^>]+)
    Pattern pattern = Pattern.compile("(?<=\\()[^\\)]+|((?<=<)[^>]+)");
    Matcher matcher = pattern.matcher(s);

    // Each find() yields either a username or an email, in order of appearance
    while (matcher.find()) {
        System.out.println(matcher.group());
    }

OUTPUT: -

username
[email protected]
username
[email protected]
username
[email protected]
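With the alternation above, each call to find() returns either a username or an email, and the plain group() value does not say which. One way to tell them apart, sketched here under the assumption that only the email branch is wrapped in capturing group 1 (as in the pattern above), is to check whether group 1 took part in the match. The class name AlternationDemo, the method label, and the example.com addresses are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternationDemo {
    // Same alternation as above; only the email branch sits in group 1.
    private static final Pattern PATTERN =
            Pattern.compile("(?<=\\()[^\\)]+|((?<=<)[^>]+)");

    // Hypothetical helper: labels every match as "username" or "email".
    static List<String> label(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = PATTERN.matcher(s);
        while (m.find()) {
            // group(1) is null when the username branch produced the match
            String kind = (m.group(1) != null) ? "email" : "username";
            out.add(kind + ": " + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample data with placeholder addresses (assumption, not from the question)
        String s = "name lastname (alice) <alice@example.com>; "
                + "name lastname (bob) <bob@example.com>;";
        for (String line : label(s)) {
            System.out.println(line);
        }
    }
}
```

This still does not pair a username with its email; it only labels each match, so an entry without an email would simply produce a lone "username" line.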

UPDATE: -

If you want to print username and email only when they both exist, you need to split your string on ; and then apply the regex below to each part.

Here's the code: -

    String s = "name lastname (username) ; "
            + "name lastname (username) <[email protected]>; "
            + "name lastname (username) <[email protected]>;";

    // Compile once, outside the loop
    Pattern pattern = Pattern.compile("\\(([^\\)]+)(?:\\))\\s(?:\\<)((?<=<)[^>]+)");

    String[] strArr = s.split(";");

    for (String str : strArr) {
        Matcher matcher = pattern.matcher(str);

        // Print only entries that contain both a username and an email
        while (matcher.find()) {
            System.out.println(matcher.group(1) + " " + matcher.group(2));
        }
    }

OUTPUT: -

username [email protected]
username [email protected] // Only the last two have both username and email

Answered by brimborium

The following code will extract your pairs. The regex is quite short, but I am almost sure there is a more elegant way (there always is with regex!). ;)

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {

    public static void main(String[] args) {
        String s = "name1 lastname1 (user1); name2 lastname2 (username2) <[email protected]>; name3 lastname3 (username3) <[email protected]>;";

        Pattern pattern = Pattern.compile("\\(([^\\)]+)\\)\\s<([^>]+)>");
        Matcher matcher = pattern.matcher(s);

        while (matcher.find()) {
            System.out.println(matcher.group(1) + " " + matcher.group(2));
        }
    }
}

Output:

username2 [email protected]
username3 [email protected]

Explanation for the regex "\\(([^\\)]+)\\)\\s<([^>]+)>":

  • \\(([^\\)]+)\\): a group of non-) characters enclosed by ( and )
  • \\s: a whitespace character (here, the space) in between
  • <([^>]+)>: a group of non-> characters enclosed by < and >
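The same three-part regex can also be written with named capturing groups, which Java supports since version 7, so the groups are read by name rather than by index. The sketch below is only one possible variant: the group names user and email, the class name NamedGroupsDemo, the method firstPair, and the example.com address are illustrative assumptions, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupsDemo {
    // Same structure as the regex explained above, with both groups named.
    private static final Pattern PAIR =
            Pattern.compile("\\((?<user>[^\\)]+)\\)\\s<(?<email>[^>]+)>");

    // Hypothetical helper: returns the first "username email" pair, or null if none.
    static String firstPair(String s) {
        Matcher m = PAIR.matcher(s);
        return m.find() ? m.group("user") + " " + m.group("email") : null;
    }

    public static void main(String[] args) {
        // The first entry has no email, so it is skipped by the regex
        String s = "name1 lastname1 (user1); "
                + "name2 lastname2 (username2) <user2@example.com>;";
        System.out.println(firstPair(s));
    }
}
```

Named groups do not change what is matched; they only make the extraction code less dependent on the order of the parentheses.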