Regex Named Groups in Java
Warning: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/415580/
Asked by Dan
It is my understanding that the java.util.regex package does not have support for named groups (http://www.regular-expressions.info/named.html), so can anyone point me towards a third-party library that does?
I've looked at jregex, but its last release was in 2002 and it didn't work for me under Java 5 (admittedly I only tried briefly).
Accepted answer by VonC
(Update: August 2011)
As geofflane mentions in his answer, Java 7 now supports named groups.
tchrist points out in the comments that the support is limited.
He details the limitations in his great answer "Java Regex Helper".
Java 7 regex named group support was presented back in September 2010 in Oracle's blog.
In the official release of Java 7, the constructs to support the named capturing group are:
- (?<name>capturing text) to define a named group "name"
- \k<name> to backreference a named group "name"
- ${name} to reference a captured group in Matcher's replacement string
- Matcher.group(String name) to return the captured input subsequence by the given "named group"
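A minimal sketch of the four constructs listed above, assuming Java 7 or later. The class name NamedGroupDemo, the pattern, and the helper methods capture and swap are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupDemo {
    // (?<word>...) and (?<num>...) define named groups;
    // \k<word> backreferences whatever the "word" group captured.
    static final Pattern REPEATED =
            Pattern.compile("(?<word>\\w+) \\k<word> (?<num>\\d+)");

    /** Returns the text captured by the named group, or null if the whole input does not match. */
    static String capture(String input, String groupName) {
        Matcher m = REPEATED.matcher(input);
        return m.matches() ? m.group(groupName) : null;
    }

    /** Rewrites each match using ${name} references in the replacement string. */
    static String swap(String input) {
        return REPEATED.matcher(input).replaceAll("${num}:${word}");
    }

    public static void main(String[] args) {
        System.out.println(capture("hello hello 42", "word"));
        System.out.println(capture("hello hello 42", "num"));
        System.out.println(swap("hello hello 42"));
    }
}
```

Note that the backreference forces the second word to repeat the first, so an input such as "hello world 42" does not match at all and capture returns null.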
Other alternatives for pre-Java 7 were:
- Google named-regex (see John Hardy's answer). Gábor Lipták mentions (November 2012) that this project might not be active (with several outstanding bugs), and that its GitHub fork could be considered instead.
- jregex (see Brian Clozel's answer)
(Original answer: Jan 2009, with the next two links now broken)
You cannot refer to a named group unless you code your own version of Regex...
That is precisely what Gorbush2 did in this thread.
(It is a limited implementation, as tchrist again points out, since it looks only for ASCII identifiers. tchrist details the limitations as:
only being able to have one named group per same name (which you don't always have control over!) and not being able to use them for in-regex recursion.
Note: you can find true regex recursion examples in Perl and PCRE regexes, as mentioned in Regexp Power, the PCRE specs, and the Matching Strings with Balanced Parentheses slide.)
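As a side note on the "one named group per same name" limitation: the standard java.util.regex in Java 7 and later also refuses a pattern that repeats a group name, and reports it when the pattern is compiled. A small sketch of how one might check this; DuplicateNameCheck and compiles are hypothetical names used only here.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class DuplicateNameCheck {
    /** Returns true if the regex compiles, false if the JDK rejects its syntax. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Thrown for, among other things, a group name that is defined twice.
            return false;
        }
    }

    public static void main(String[] args) {
        // Two different names: accepted.
        System.out.println(compiles("(?<major>\\d+)\\.(?<minor>\\d+)"));
        // The same name used twice: rejected at compile time.
        System.out.println(compiles("(?<id>\\d+)-(?<id>\\d+)"));
    }
}
```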
Example:
String:
"TEST 123"
RegExp:
"(?<login>\w+) (?<id>\d+)"
Access:
matcher.group(1) ==> TEST
matcher.group("login") ==> TEST
matcher.name(1) ==> login
Replace:
matcher.replaceAll("aaaaa_$1_sssss_$2____") ==> aaaaa_TEST_sssss_123____
matcher.replaceAll("aaaaa_${login}_sssss_${id}____") ==> aaaaa_TEST_sssss_123____
(extract from the implementation)
public final class Pattern
    implements java.io.Serializable
{
[...]
    /**
     * Parses a group and returns the head node of a set of nodes that process
     * the group. Sometimes a double return system is used where the tail is
     * returned in root.
     */
    private Node group0() {
        boolean capturingGroup = false;
        Node head = null;
        Node tail = null;
        int save = flags;
        root = null;
        int ch = next();
        if (ch == '?') {
            ch = skip();
            switch (ch) {
            case '<': // (?<xxx) look behind or group name
                ch = read();
                int start = cursor;
[...]
                // test forGroupName
                int startChar = ch;
                while(ASCII.isWord(ch) && ch != '>') ch=read();
                if(ch == '>'){
                    // valid group name
                    int len = cursor-start;
                    int[] newtemp = new int[2*(len) + 2];
                    //System.arraycopy(temp, start, newtemp, 0, len);
                    StringBuilder name = new StringBuilder();
                    for(int i = start; i< cursor; i++){
                        name.append((char)temp[i-1]);
                    }
                    // create Named group
                    head = createGroup(false);
                    ((GroupTail)root).name = name.toString();
                    capturingGroup = true;
                    tail = root;
                    head.next = expr(tail);
                    break;
                }
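For comparison, the "TEST 123" example can be reproduced with the stock Java 7 API, with no patched Pattern class needed. This is only a sketch: LoginIdExample, describe, and rewrite are illustrative names. As far as I know, the standard Matcher has no name(int) method, so the matcher.name(1) call from the patched implementation has no direct counterpart here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LoginIdExample {
    static final Pattern LOGIN_ID = Pattern.compile("(?<login>\\w+) (?<id>\\d+)");

    /** Reads the same captures by index and by name; returns null when nothing matches. */
    static String describe(String input) {
        Matcher m = LOGIN_ID.matcher(input);
        if (!m.find()) {
            return null;
        }
        // group(1) and group("login") refer to the same capture.
        return m.group(1) + "/" + m.group("login") + "/" + m.group("id");
    }

    /** Uses ${name} references in the replacement string. */
    static String rewrite(String input) {
        return LOGIN_ID.matcher(input).replaceAll("aaaaa_${login}_sssss_${id}____");
    }

    public static void main(String[] args) {
        System.out.println(describe("TEST 123"));
        System.out.println(rewrite("TEST 123"));
    }
}
```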
Answer by John Hardy
Yes, but it's messy hacking the Sun classes. There is a simpler way:
http://code.google.com/p/named-regexp/
named-regexp is a thin wrapper for the standard JDK regular expressions implementation, with the single purpose of handling named capturing groups in the .NET style: (?<name>...).
It can be used with Java 5 and 6 (generics are used).
Java 7 will handle named capturing groups, so this project is not meant to last.
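To give an idea of how a thin wrapper of this kind could work on a JDK without named groups, here is a rough sketch. It is not named-regexp's actual code or API: NamedPatternSketch and all of its methods are hypothetical. The idea is to strip the names out of the pattern, remember which group number each name had, and delegate to the plain JDK classes. It deliberately ignores character classes such as [(], \Q...\E quoting, and ${name} in replacement strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedPatternSketch {
    private final Pattern pattern;
    private final Map<String, Integer> indexByName = new HashMap<String, Integer>();

    public NamedPatternSketch(String regex) {
        StringBuilder plain = new StringBuilder();
        int groupCount = 0;
        int i = 0;
        while (i < regex.length()) {
            char c = regex.charAt(i);
            if (c == '\\' && i + 1 < regex.length()) {
                // Copy escaped characters verbatim so "\(" is not counted as a group.
                plain.append(c).append(regex.charAt(i + 1));
                i += 2;
            } else if (c == '(' && regex.startsWith("(?<", i)
                    && !regex.startsWith("(?<=", i) && !regex.startsWith("(?<!", i)) {
                // Named group: record name -> index, emit a plain "(" instead.
                int end = regex.indexOf('>', i);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated group name in: " + regex);
                }
                groupCount++;
                indexByName.put(regex.substring(i + 3, end), groupCount);
                plain.append('(');
                i = end + 1;
            } else {
                // "(" not followed by "?" is an ordinary capturing group.
                if (c == '(' && !regex.startsWith("(?", i)) {
                    groupCount++;
                }
                plain.append(c);
                i++;
            }
        }
        pattern = Pattern.compile(plain.toString());
    }

    public Matcher matcher(CharSequence input) {
        return pattern.matcher(input);
    }

    /** Looks up a group by name on a matcher created from this wrapper. */
    public String group(Matcher m, String name) {
        Integer index = indexByName.get(name);
        if (index == null) {
            throw new IllegalArgumentException("No group named " + name);
        }
        return m.group(index.intValue());
    }

    /** Convenience helper: first match of regex in input, returning the named capture or null. */
    static String extract(String regex, String input, String name) {
        NamedPatternSketch p = new NamedPatternSketch(regex);
        Matcher m = p.matcher(input);
        return m.find() ? p.group(m, name) : null;
    }
}
```

Calling NamedPatternSketch.extract("(?<login>\\w+) (?<id>\\d+)", "TEST 123", "id") compiles the pattern as (\w+) (\d+) internally and reads group 2.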
Answer by Brian Clozel
What kind of problem do you get with jregex? It worked well for me under Java 5 and Java 6.
Jregex does the job well (even if the last version is from 2002), unless you want to wait for Java SE 7.
Answer by geofflane
For people coming to this late: Java 7 adds named groups. See the Matcher.group(String groupName) documentation.
Answer by Ryan Smith
For those running pre-Java 7, named groups are supported by joni (a Java port of the Oniguruma regexp library). Documentation is sparse, but it has worked well for us.
Binaries are available via Maven (http://repository.codehaus.org/org/jruby/joni/joni/).
Answer by Henrik Hofmeister
A bit of an old question, but I found myself needing this too, and the suggestions above were inadequate, so I developed a thin wrapper myself: https://github.com/hofmeister/MatchIt