Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/415580/

Retrieved: 2020-08-11 14:21:18. Source: igfitidea.

Regex Named Groups in Java

Tags: java, regex

Asked by Dan

It is my understanding that the java.util.regex package does not have support for named groups (http://www.regular-expressions.info/named.html), so can anyone point me towards a third-party library that does?

I've looked at jregex, but its last release was in 2002 and it didn't work for me under Java 5 (admittedly I only tried briefly).

Accepted answer by VonC

(Update: August 2011)

As geofflane mentions in his answer, Java 7 now supports named groups.
tchrist points out in the comments that the support is limited.
He details the limitations in his great answer "Java Regex Helper".

Java 7 regex named group support was presented back in September 2010 in Oracle's blog.

In the official release of Java 7, the constructs to support the named capturing group are:

  • (?<name>capturing text) to define a named group "name"
  • \k<name> to backreference a named group "name"
  • ${name} to reference a captured group in Matcher's replacement string
  • Matcher.group(String name) to return the captured input subsequence for the given named group.
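The four constructs listed above can be sketched together in one small program. This is a minimal illustration, assuming Java 7 or later; the class name, method names, and sample text are made up for the example and do not come from the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupBasics {
    // (?<word>...) defines the named group; \k<word> backreferences it,
    // so this pattern matches a word immediately followed by itself.
    static final Pattern REPEATED =
            Pattern.compile("\\b(?<word>\\w+) \\k<word>\\b");

    /** Returns the first doubled word in the text, or null if there is none. */
    static String firstRepeatedWord(String text) {
        Matcher m = REPEATED.matcher(text);
        // Matcher.group(String) looks the captured text up by group name.
        return m.find() ? m.group("word") : null;
    }

    /** Collapses each doubled word into one occurrence via ${word}. */
    static String collapseRepeats(String text) {
        return REPEATED.matcher(text).replaceAll("${word}");
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedWord("this is is a test")); // is
        System.out.println(collapseRepeats("this is is a test"));   // this is a test
    }
}
```

Note that in Java source the backslashes must be doubled, so `\k<word>` is written as `"\\k<word>"` inside a string literal.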


Other alternatives for pre-Java 7 were:

(Original answer: Jan 2009, with the next two links now broken)

You cannot refer to a named group, unless you code your own version of Regex...

That is precisely what Gorbush2 did in this thread.

Regex2

(limited implementation, as pointed out again by tchrist, as it looks only for ASCII identifiers. tchrist details the limitation as:

only being able to have one named group per same name (which you don't always have control over!) and not being able to use them for in-regex recursion.

Note: You can find true regex recursion examples in Perl and PCRE regexes, as mentioned in Regexp Power, the PCRE specs, and the Matching Strings with Balanced Parentheses slide)
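The naming limits mentioned above are easy to observe in the standard java.util.regex engine on Java 7 and later. Whether tchrist's remarks were aimed at Regex2, at the JDK, or at both is not clear from this page, so the sketch below only shows what the JDK itself does; the class and helper names are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class GroupNameLimits {
    /** Returns true if the JDK regex engine accepts the pattern. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Distinct names in the two alternatives: accepted.
        System.out.println(compiles("(?<id>\\d+)|(?<hex>[a-f]+)"));
        // The same name used twice: rejected at compile time.
        System.out.println(compiles("(?<id>\\d+)|(?<id>[a-f]+)"));
        // Names are limited to ASCII letters and digits, so an
        // underscore in the name is rejected as well.
        System.out.println(compiles("(?<user_id>\\d+)"));
    }
}
```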

Example:

String:

"TEST 123"

RegExp:

"(?<login>\w+) (?<id>\d+)"

Access

matcher.group(1) ==> TEST
matcher.group("login") ==> TEST
matcher.name(1) ==> login

Replace

matcher.replaceAll("aaaaa_$1_sssss_$2____") ==> aaaaa_TEST_sssss_123____
matcher.replaceAll("aaaaa_${login}_sssss_${id}____") ==> aaaaa_TEST_sssss_123____
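For comparison, the same "TEST 123" example can be reproduced with the standard java.util.regex classes on Java 7 or later, without any patched Pattern class. This is only a sketch: the class and method names are invented here, and it assumes the index-based replacement in the example was written with $1 and $2. The Java 7 Matcher has no counterpart to the matcher.name(1) call shown above, so that line is left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LoginExample {
    static final Pattern LOGIN_ID =
            Pattern.compile("(?<login>\\w+) (?<id>\\d+)");

    /** Reads the first group by index; null if the input does not match. */
    static String loginByIndex(String input) {
        Matcher m = LOGIN_ID.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    /** Reads the same group by name; null if the input does not match. */
    static String loginByName(String input) {
        Matcher m = LOGIN_ID.matcher(input);
        return m.matches() ? m.group("login") : null;
    }

    /** Replacement using numbered references ($1, $2). */
    static String replaceByIndex(String input) {
        return LOGIN_ID.matcher(input).replaceAll("aaaaa_$1_sssss_$2____");
    }

    /** Replacement using named references (${login}, ${id}). */
    static String replaceByName(String input) {
        return LOGIN_ID.matcher(input)
                .replaceAll("aaaaa_${login}_sssss_${id}____");
    }

    public static void main(String[] args) {
        System.out.println(loginByIndex("TEST 123"));   // TEST
        System.out.println(loginByName("TEST 123"));    // TEST
        System.out.println(replaceByIndex("TEST 123")); // aaaaa_TEST_sssss_123____
        System.out.println(replaceByName("TEST 123"));  // aaaaa_TEST_sssss_123____
    }
}
```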


(extract from the implementation)

public final class Pattern
    implements java.io.Serializable
{
[...]
    /**
     * Parses a group and returns the head node of a set of nodes that process
     * the group. Sometimes a double return system is used where the tail is
     * returned in root.
     */
    private Node group0() {
        boolean capturingGroup = false;
        Node head = null;
        Node tail = null;
        int save = flags;
        root = null;
        int ch = next();
        if (ch == '?') {
            ch = skip();
            switch (ch) {

            case '<':   // (?<xxx)  look behind or group name
                ch = read();
                int start = cursor;
[...]
                // test forGroupName
                int startChar = ch;
                while(ASCII.isWord(ch) && ch != '>') ch=read();
                if(ch == '>'){
                    // valid group name
                    int len = cursor-start;
                    int[] newtemp = new int[2*(len) + 2];
                    //System.arraycopy(temp, start, newtemp, 0, len);
                    StringBuilder name = new StringBuilder();
                    for(int i = start; i< cursor; i++){
                        name.append((char)temp[i-1]);
                    }
                    // create Named group
                    head = createGroup(false);
                    ((GroupTail)root).name = name.toString();

                    capturingGroup = true;
                    tail = root;
                    head.next = expr(tail);
                    break;
                }

Answered by John Hardy

Yes, but it's messy hacking the Sun classes. There is a simpler way:

http://code.google.com/p/named-regexp/

named-regexp is a thin wrapper for the standard JDK regular expressions implementation, with the single purpose of handling named capturing groups in the .NET style: (?<name>...).

It can be used with Java 5 and 6 (generics are used).

Java 7 will handle named capturing groups, so this project is not meant to last.

Answered by Brian Clozel

What kind of problem do you get with jregex? It worked well for me under Java 5 and Java 6.

Jregex does the job well (even if the last version is from 2002), unless you want to wait for Java SE 7.

Answered by geofflane

For people coming to this late: Java 7 adds named groups. See the Matcher.group(String groupName) documentation.

Answered by Ryan Smith

For those running pre-Java 7, named groups are supported by joni (a Java port of the Oniguruma regexp library). Documentation is sparse, but it has worked well for us.
Binaries are available via Maven (http://repository.codehaus.org/org/jruby/joni/joni/).

Answered by Henrik Hofmeister

A bit of an old question, but I found myself needing this as well, and the suggestions above were inadequate, so I developed a thin wrapper myself: https://github.com/hofmeister/MatchIt
