Java: How do I match text within parentheses using regex?

Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1337289/

Time: 2020-08-12 08:23:47  Source: igfitidea

How do I match text within parentheses using regex?

Tags: java, regex

Asked by northpole

I have the following pattern:

(COMPANY) -277.9887 (ASP,) -277.9887 (INC.) 

I want the final output to be:

COMPANY ASP, INC.

Currently I have the following code, and it keeps returning the original pattern (I assume because the group falls entirely between the first '(' and the last ')'):

Pattern p = Pattern.compile("((.*))",Pattern.DOTALL);
Matcher matcher = p.matcher(eName);
while(matcher.find())
{
    System.out.println("found match:"+matcher.group(1));
}

I am struggling to get the results I need and appreciate any help. I am not worried about concatenating the results after I get each group, just need to get each group.

Accepted answer by chaos

Pattern p = Pattern.compile("\\((.*?)\\)", Pattern.DOTALL);
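
As a fuller illustration of the pattern above, here is a minimal sketch that loops over every match in the sample string from the question and collects capture group 1. The class and method names (ParenGroups, extract) are made up for this example and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenGroups {
    // Escaped parentheses match literal "(" and ")"; the reluctant .*? stops
    // at the first ")" that follows each "(".
    private static final Pattern PAREN = Pattern.compile("\\((.*?)\\)", Pattern.DOTALL);

    // Returns the text found inside each pair of parentheses, in order.
    static List<String> extract(String input) {
        List<String> groups = new ArrayList<>();
        Matcher matcher = PAREN.matcher(input);
        while (matcher.find()) {
            groups.add(matcher.group(1));
        }
        return groups;
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";
        System.out.println(String.join(" ", extract(eName))); // prints: COMPANY ASP, INC.
    }
}
```

Note that in Java source the backslash itself has to be escaped inside a string literal, which is why the regex \( is written as "\\(".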

Answer by Oliver

Not a direct answer to your question, but I recommend you use RegxTester to get to the answer, and to answers for any future questions, quickly. It allows you to test in real time.

Answer by ptomli

Your .* quantifier is 'greedy', so yes, it's grabbing everything between the first and last available parentheses. As chaos says, tersely :), .*? is a non-greedy quantifier, so it will grab as little as possible while still maintaining the match.
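
To make the greedy versus non-greedy difference concrete, the sketch below runs both quantifiers against the string from the question and returns the first capture. The helper name firstGroup and the class name are assumptions made for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Returns capture group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";

        // Greedy: .* runs to the end of the input and backtracks to the LAST ")".
        System.out.println(firstGroup("\\((.*)\\)", eName));
        // prints: COMPANY) -277.9887 (ASP,) -277.9887 (INC.

        // Reluctant: .*? expands one character at a time and stops at the FIRST ")".
        System.out.println(firstGroup("\\((.*?)\\)", eName));
        // prints: COMPANY
    }
}
```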

You also need to escape the parentheses within the regex; otherwise each pair becomes another group. That's assuming there are literal parentheses in your string. I suspect that what you referred to in the initial question as your pattern is in fact your string.
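
The effect of leaving the parentheses unescaped can be checked with a short sketch like the following, which compares the number of capturing groups and the first capture for the unescaped and escaped patterns. The class and helper names are hypothetical, chosen only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupEscaping {
    // Number of capturing groups the regex defines.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Capture group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";

        // Unescaped: both pairs of parentheses are groups, so no literal "(" is
        // required and group 1 is simply the whole input.
        System.out.println(groupCount("((.*))"));                 // prints: 2
        System.out.println(firstGroup("((.*))", eName).equals(eName)); // prints: true

        // Escaped: the outer pair matches literal characters, leaving one group.
        System.out.println(groupCount("\\((.*?)\\)"));            // prints: 1
        System.out.println(firstGroup("\\((.*?)\\)", eName));     // prints: COMPANY
    }
}
```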

Query: are "COMPANY", "ASP," and "INC." required?

If you must have values for them, then you want to use + instead of *. The + means one-or-more and the * means zero-or-more, so a * would also match the literal string "()".

e.g.: "\\((.+?)\\)"
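
A small sketch of the * versus + difference on empty parentheses follows. The class and helper names are assumptions for illustration, and the last input is an extra case that is not discussed in the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StarVsPlus {
    // Capture group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // * accepts zero characters, so "()" matches with an empty capture.
        System.out.println("[" + firstGroup("\\((.*?)\\)", "()") + "]"); // prints: []

        // + needs at least one character between the parentheses, so "()" alone does not match.
        System.out.println(firstGroup("\\((.+?)\\)", "()"));            // prints: null

        // Caveat: "." also matches ")", so + can run past an empty pair to a later ")".
        System.out.println(firstGroup("\\((.+?)\\)", "() (INC.)"));     // prints: ) (INC.
    }
}
```

The last case suggests that + alone is only a safe guard when empty parentheses are never followed by another closing parenthesis in the same input.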

Answer by Brent Writes Code

If your strings are always going to look like that, you could get away with just using a couple calls to replaceAll instead. This seems to work for me:

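String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)";
// First replace everything from each ")" up to the next "(" with a space,
// then strip the remaining outer parentheses.
String eNameEdited = eName.replaceAll("\\).*?\\(", " ").replaceAll("\\(|\\)", "");
System.out.println(eNameEdited);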

Probably not the most efficient thing in the world, but fairly simple.

Answer by Chetan Laddha

Tested with Java 8:

/**
 * The pattern below returns the string inside parentheses.
 *
 * Description of the regular expression: \(+\s*([^\s)]+)\s*\)+
 *
 * \(+      : matches the literal character "(" one or more times
 * \s*      : matches zero or more whitespace characters
 * (        : start of the capturing group
 * [^\s)]+  : matches one or more characters that are neither whitespace nor ")"
 * )        : end of the capturing group
 * \s*      : matches zero or more whitespace characters
 * \)+      : matches the literal character ")" one or more times
 */
private static Pattern REGULAR_EXPRESSION = Pattern.compile("\\(+\\s*([^\\s)]+)\\s*\\)+");
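
The sketch below shows how this pattern might be used. The wrapper class and the extract method are assumptions added for illustration, and the inputs other than the question's string are made up. Because the character class excludes whitespace, the pattern only captures a single whitespace-free token, so text such as "(hello world)" does not match at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenTokenPattern {
    private static final Pattern REGULAR_EXPRESSION =
            Pattern.compile("\\(+\\s*([^\\s)]+)\\s*\\)+");

    // Collects the single token captured inside each parenthesized section.
    static List<String> extract(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = REGULAR_EXPRESSION.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group(1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(extract("(COMPANY) -277.9887 (ASP,) -277.9887 (INC.)"));
        // prints: [COMPANY, ASP,, INC.]

        // Repeated parentheses and padding inside them are tolerated.
        System.out.println(extract("(( padded ))")); // prints: [padded]

        // A space inside the text breaks the match, since [^\s)]+ cannot cross it.
        System.out.println(extract("(hello world)")); // prints: []
    }
}
```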