java: How to match a string within (nested) parentheses in Java?

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/17759004/

Time: 2020-11-01 14:53:38  Source: igfitidea

How to match string within parentheses (nested) in Java?

java regex nested

Asked by Li Dong

I would like to match a string within parentheses like:

(i, j, k(1))
^^^^^^^^^^^^

The string can itself contain closed (balanced) parentheses. How can I match it with a regular expression in Java without writing a parser? This is only a small part of my project. Thanks!

Edit:

I want to search a block of text for something like u(i, j, k), u(i, j, k(1)) or just u(<anything within this paired parens>), and replace them with __u%array(i, j, k) and __u%array(i, j, k(1)) for my Fortran translating application.

Answered by acdcjunior

As I said, contrary to popular belief (don't believe everything people say), matching nested brackets is possible with regex.

The downside is that you can only match up to a fixed level of nesting. And for every additional level you wish to support, your regex gets bigger and bigger.

But don't take my word for it. Let me show you. The regex:

\([^()]*\)

Matches one level. For up to two levels, you'd need:

\(([^()]*|\([^()]*\))*\)

And so on. To keep adding levels, all you have to do is change the middle (second) [^()]* part to ([^()]*|\([^()]*\))* (the original answer links to a three-level example). As I said, it will get bigger and bigger.
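
The substitution rule above can be applied mechanically. Here is a minimal sketch that builds the pattern for a given depth; the class and method names (NestedParenRegex, upToLevels, firstMatch) are made up for illustration, and it uses non-capturing groups (?:...) where the answer uses plain groups, which does not change what is matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class NestedParenRegex {

    // Builds a regex that matches parentheses nested up to `levels` deep.
    static String upToLevels(int levels) {
        if (levels < 1) {
            throw new IllegalArgumentException("levels must be >= 1");
        }
        String regex = "\\([^()]*\\)"; // one level: \([^()]*\)
        for (int i = 1; i < levels; i++) {
            // Wrap the previous level: \((?:[^()]*|<previous>)*\)
            regex = "\\((?:[^()]*|" + regex + ")*\\)";
        }
        return regex;
    }

    // Returns the first substring of `text` matched by `regex`, or null.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "u(i, j, k(1))";
        System.out.println(firstMatch(upToLevels(1), text)); // prints (1)
        System.out.println(firstMatch(upToLevels(2), text)); // prints (i, j, k(1))
    }
}
```

With one level, only the innermost (1) can match; with two levels the whole argument list matches. Each extra level roughly doubles the length of the pattern text.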

Your problem:

For your case, two levels may be enough. So the Java code for it would be:

String fortranCode = "code code u(i, j, k) code code code code u(i, j, k(1)) code code code u(i, j, k(m(2))) should match this last 'u', but it doesnt.";
// Unescaped regex: (\w+)(\(([^()]*|\([^()]*\))*\))
// Group 1 is the identifier, group 2 is the parenthesized argument list (up to two levels).
String regex = "(\\w+)(\\(([^()]*|\\([^()]*\\))*\\))";
System.out.println(fortranCode.replaceAll(regex, "__$1%array$2"));

Input:

code code u(i, j, k) code code code code u(i, j, k(1)) code code code u(i, j, k(m(2))) should match this last 'u', but it doesnt.

Output:

code code __u%array(i, j, k) code code code code __u%array(i, j, k(1)) code code code u(i, j, __k%array(m(2))) should match this last 'u', but it doesnt.

Bottom line:

In the general case, parsers will do a better job - that's why people get so pissy about it. But for simple applications, regexes can pretty much be enough.

Note: Some flavors of regex support the recursion operator (?R) (Java doesn't; PCRE, as used by PHP, and Perl do), which lets you match an arbitrary number of nesting levels. With them, you could do: \(([^()]|(?R))*\).
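
Since Java has no (?R), one way to handle arbitrary depth is to leave the regex approach for the parentheses themselves: use a regex only to find the identifier and opening parenthesis, then count depth by hand to find the matching close. The following is a rough sketch of that alternative, not the answer's method; BalancedRewriter, matchingParen and rewrite are names invented here, and it assumes parentheses never appear inside string literals or comments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class illustrating a depth-counting scan.
class BalancedRewriter {
    // An identifier immediately followed by an opening parenthesis.
    private static final Pattern CALL_START = Pattern.compile("\\w+\\(");

    // Returns the index of the ')' that closes the '(' at openIdx, or -1 if unbalanced.
    static int matchingParen(String s, int openIdx) {
        int depth = 0;
        for (int i = openIdx; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                return i;
            }
        }
        return -1;
    }

    // Rewrites every outermost name(...) into __name%array(...), at any nesting depth.
    static String rewrite(String code) {
        StringBuilder out = new StringBuilder();
        Matcher m = CALL_START.matcher(code);
        int pos = 0;
        while (pos < code.length() && m.find(pos)) {
            int open = m.end() - 1;
            int close = matchingParen(code, open);
            if (close < 0) {
                break; // unbalanced: copy the rest of the input unchanged
            }
            out.append(code, pos, m.start());          // text before the call
            out.append("__").append(code, m.start(), open).append("%array");
            out.append(code, open, close + 1);          // argument list, copied verbatim
            pos = close + 1;
        }
        out.append(code, pos, code.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite("code u(i, j, k(m(2))) code"));
        // prints: code __u%array(i, j, k(m(2))) code
    }
}
```

Inner calls such as k(m(2)) are copied as-is, which mirrors the replacements asked for in the question.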

Answered by fge

Split the job in two. Have the regex be:

([a-z]+)\((.*)\)

The first group will contain the identifier, the second the parameters. Then proceed as follows:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static final Pattern PATTERN = Pattern.compile("([a-z]+)\\((.*)\\)");

// ...

final Matcher m = PATTERN.matcher(input);

if (!m.matches()) {
    // No match! Deal with it (e.g. return or throw) before reading the groups.
}

// If match, then:

final String identifier = m.group(1);
final String params = m.group(2);

// Test if there is a paren
final boolean hasNestedParen = params.indexOf('(') != -1;

Replace [a-z]+ with whatever an identifier can be in Fortran.
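
For reference, here is a compact, runnable sketch of this approach. The class name CallSplitter and the method toArrayRef are invented for the example, and it assumes the input is exactly one call, because matches() requires the whole string to fit the pattern.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from this answer.
class CallSplitter {
    private static final Pattern PATTERN = Pattern.compile("([a-z]+)\\((.*)\\)");

    // Rewrites a single call such as "u(i, j, k(1))"; empty if the input is not one call.
    static Optional<String> toArrayRef(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            return Optional.empty();
        }
        String identifier = m.group(1); // e.g. "u"
        String params = m.group(2);     // e.g. "i, j, k(1)"
        return Optional.of("__" + identifier + "%array(" + params + ")");
    }

    public static void main(String[] args) {
        System.out.println(toArrayRef("u(i, j, k(1))").orElse("no match"));
        // prints: __u%array(i, j, k(1))
    }
}
```

Because .* is greedy and does not check balance, an input like u(i) + v(j) also matches as a whole, with group 2 holding "i) + v(j". The approach is therefore only safe once the text has already been cut down to a single call.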

Answered by lpiepiora

Please check this answer, as it covers basically what you are trying to do (in short, it's not really possible with regexps):

Regular Expression to match outer brackets
