Using Java to find a substring of a bigger string using a regular expression
Note: this page reproduces a popular StackOverFlow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverFlow
Original source: http://stackoverflow.com/questions/600733/
Asked by digiarnie
If I have a string like this:
FOO[BAR]
I need a generic way to get the "BAR" substring out of it, so that whatever text sits between the square brackets is returned.
e.g.
FOO[DOG] = DOG
FOO[CAT] = CAT
Accepted answer by Bryan Kyle
You should be able to use non-greedy quantifiers, specifically *?. You're going to probably want the following:
Pattern MY_PATTERN = Pattern.compile("\\[(.*?)\\]");
This will give you a pattern that matches your string and puts the text within the square brackets in the first group. Have a look at the Pattern API documentation for more information.
To extract the string, you could use something like the following:
Matcher m = MY_PATTERN.matcher("FOO[BAR]");
while (m.find()) {
    String s = m.group(1);
    // s now contains "BAR"
}
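For reference, the two snippets above can be combined into one self-contained program. This is only a minimal sketch: the class name BracketMatch and the helper firstBracketed are made up for illustration, and returning null when nothing matches is an assumption, not something the answer prescribes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketMatch {
    // Non-greedy: capture as few characters as possible between '[' and ']'.
    // Each regex backslash is doubled inside the Java string literal.
    private static final Pattern MY_PATTERN = Pattern.compile("\\[(.*?)\\]");

    // Hypothetical helper: text inside the first [...] pair, or null if none.
    public static String firstBracketed(String input) {
        Matcher m = MY_PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstBracketed("FOO[DOG]")); // DOG
        System.out.println(firstBracketed("FOO[CAT]")); // CAT
        System.out.println(firstBracketed("FOO"));      // null
    }
}
```

Compiling the pattern once into a constant avoids recompiling it on every call.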
Answer by zaczap
The non-regex way:
String input = "FOO[BAR]", extracted;
// + 1 skips the opening bracket itself
extracted = input.substring(input.indexOf("[") + 1, input.indexOf("]"));
Alternatively, for slightly better performance/memory usage (thanks Hosam):
String input = "FOO[BAR]", extracted;
// + 1 skips the opening bracket itself
extracted = input.substring(input.indexOf('[') + 1, input.lastIndexOf(']'));
Answer by Kevin Lacquement
I think your regular expression would look like:
/FOO\[(.+)\]/
Assuming that FOO is going to be constant.
So, to put this in Java:
Pattern p = Pattern.compile("FOO\\[(.+)\\]");
Matcher m = p.matcher(inputLine);
Answer by Manu
Assuming that no other closing square bracket is allowed within: /FOO\[([^\]]*)\]/
Answer by Fabian Steeg
I'd define that I want a maximum number of non-] characters between [ and ]. These need to be escaped with backslashes (and in Java, these need to be escaped again), and the definition of non-] is a character class, thus inside [ and ] (i.e. [^\\]]). The result:
FOO\[([^\]]+)\]
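As a rough illustration of the double escaping described above, here is how that expression might look inside a Java string literal. The class name NegatedClassMatch and the helper contentAfterFoo are hypothetical, and returning null on no match is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NegatedClassMatch {
    // The regex FOO\[([^\]]+)\] with every backslash doubled for the Java literal.
    private static final Pattern P = Pattern.compile("FOO\\[([^\\]]+)\\]");

    // Hypothetical helper: one or more non-']' characters following "FOO[".
    public static String contentAfterFoo(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(contentAfterFoo("FOO[BAR]"));  // BAR
        System.out.println(contentAfterFoo("FOO[A][B]")); // A
        System.out.println(contentAfterFoo("FOO[]"));     // null, since + needs at least one character
    }
}
```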
Answer by Renaud Bompuis
If you simply need to get whatever is between [], then you can use \[([^\]]*)\] like this:
Pattern regex = Pattern.compile("\\[([^\\]]*)\\]");
Matcher m = regex.matcher(str);
if (m.find()) {
    // group(1) is the text between the brackets; group() would include the brackets
    result = m.group(1);
}
If you need it to be of the form identifier + [ + content + ], then you can restrict extraction to cases where the identifier is alphanumeric:
[a-zA-Z][a-zA-Z0-9_]*\s*\[([^\]]*)\]
This will validate things like Foo [Bar] or myDevice_123["input"], for instance.
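A small sketch of that validation in Java follows. It assumes the whole input must have the identifier + [ + content + ] form, which is why it uses matches() rather than find(); the class name IdentifierBracket and the helper contentOf are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdentifierBracket {
    // Identifier (letter, then letters/digits/underscores), optional
    // whitespace, then a bracketed group containing no ']'.
    private static final Pattern P =
            Pattern.compile("[a-zA-Z][a-zA-Z0-9_]*\\s*\\[([^\\]]*)\\]");

    // Hypothetical helper: bracket content if the whole input has the
    // expected form, otherwise null.
    public static String contentOf(String input) {
        Matcher m = P.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(contentOf("Foo [Bar]"));               // Bar
        System.out.println(contentOf("myDevice_123[\"input\"]")); // "input"
        System.out.println(contentOf("123[x]"));                  // null, identifier starts with a digit
    }
}
```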
Main issue
The main problem is when you want to extract the content of something like this:
FOO[BAR[CAT[123]]+DOG[FOO]]
The Regex won't work and will return BAR[CAT[123 and FOO.
If we change the Regex to \[(.*)\] then we're OK, but if you're trying to extract the content from more complex things like:
FOO[BAR[CAT[123]]+DOG[FOO]] = myOtherFoo[BAR[5]]
None of the Regexes will work.
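The behaviour described here can be checked with a short program. This is just a sketch: NestedBracketDemo and allGroups are names invented for the demonstration, and it simply collects group 1 of every match for a given expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestedBracketDemo {
    // Hypothetical helper: group 1 of every match of regex in input.
    public static List<String> allGroups(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String nested = "FOO[BAR[CAT[123]]+DOG[FOO]]";
        String two = "FOO[BAR[CAT[123]]+DOG[FOO]] = myOtherFoo[BAR[5]]";

        // Negated class stops at the first ']': two matches, "BAR[CAT[123" and "FOO".
        for (String s : allGroups("\\[([^\\]]*)\\]", nested)) System.out.println(s);

        // Greedy .* runs to the last ']': one match, "BAR[CAT[123]]+DOG[FOO]".
        for (String s : allGroups("\\[(.*)\\]", nested)) System.out.println(s);

        // With two bracketed expressions the greedy match swallows both.
        for (String s : allGroups("\\[(.*)\\]", two)) System.out.println(s);
    }
}
```

In the last case the single match runs from the first [ to the last ] of the whole line, so the two bracketed expressions are merged into one result.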
The most accurate Regex to extract the proper content in all cases would be a lot more complex, as it would need to balance [] pairs and give you their content.
A simpler solution
If your problem is getting complex and the content of the [] is arbitrary, you could instead balance the pairs of [] and extract the string using plain old code rather than a Regex:
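int brackets = 0;
StringBuilder result = new StringBuilder();
// Walk forward from the first '[', tracking the nesting depth.
// Assumes input contains a '[' (indexOf returns -1 otherwise).
for (int i = input.indexOf('['); i < input.length(); i++) {
    char c = input.charAt(i);
    if (c == '[') {
        brackets++;
        if (brackets == 1)
            continue; // skip the outermost opening bracket itself
    } else if (c == ']') {
        brackets--;
        if (brackets <= 0)
            break; // found the bracket that closes the outermost pair
    }
    result.append(c);
}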
This is more pseudo-code than real code; I'm not a Java coder, so I don't know if the syntax is correct, but it should be easy enough to improve upon.
What counts is that this code should work and allow you to extract the content of the [], however complex it is.
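One possible way to package the bracket-balancing idea as a reusable method is sketched below. The class name BalancedBrackets, the method extract, and the choice to return null for a missing or unbalanced bracket are all assumptions made for this example.

```java
public class BalancedBrackets {
    // Hypothetical helper: content of the first balanced [...] group,
    // or null if there is no '[' or the brackets never balance.
    public static String extract(String input) {
        int start = input.indexOf('[');
        if (start < 0) {
            return null; // no opening bracket at all
        }
        int depth = 0;
        for (int i = start; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']' && --depth == 0) {
                // i is the bracket that closes the one at start
                return input.substring(start + 1, i);
            }
        }
        return null; // ran out of input before the brackets balanced
    }

    public static void main(String[] args) {
        System.out.println(extract("FOO[BAR[CAT[123]]+DOG[FOO]] = myOtherFoo[BAR[5]]"));
        // prints BAR[CAT[123]]+DOG[FOO]
        System.out.println(extract("FOO[BAR")); // null
    }
}
```

Taking a substring once the closing bracket is found avoids building the result character by character.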
Answer by amit
String input = "FOO[BAR]";
String result = input.substring(input.indexOf("[")+1,input.lastIndexOf("]"));
This will return the value between the first '[' and the last ']':
Foo[Bar] => Bar
Foo[Bar[test]] => Bar[test]
Note: You should add error checking in case the input string is not well formed.
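The error checking mentioned in the note could look something like the sketch below. The class name SubstringExtract and the method between are hypothetical, and returning null for malformed input is only one option (throwing an exception would be another).

```java
public class SubstringExtract {
    // Hypothetical helper: text between the first '[' and the last ']',
    // or null when the input is not well formed.
    public static String between(String input) {
        int open = input.indexOf('[');
        int close = input.lastIndexOf(']');
        // Reject a missing '[', a missing ']', or a ']' that comes before the '['.
        if (open < 0 || close < open) {
            return null;
        }
        return input.substring(open + 1, close);
    }

    public static void main(String[] args) {
        System.out.println(between("Foo[Bar]"));       // Bar
        System.out.println(between("Foo[Bar[test]]")); // Bar[test]
        System.out.println(between("Foo]Bar["));       // null
    }
}
```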
Answer by Djahid Bekka
This is a working example:
RegexpExample.java
package org.regexp.replace;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexpExample
{
    public static void main(String[] args)
    {
        String string = "var1[value1], var2[value2], var3[value3]";
        // group 1 = "[", group 2 = the content, group 3 = "]"
        Pattern pattern = Pattern.compile("(\\[)(.*?)(\\])");
        Matcher matcher = pattern.matcher(string);
        List<String> listMatches = new ArrayList<String>();
        while (matcher.find())
        {
            listMatches.add(matcher.group(2));
        }
        for (String s : listMatches)
        {
            System.out.println(s);
        }
    }
}
It displays:
value1
value2
value3
Answer by Djahid Bekka
The same approach works if you want to parse a string coming from mYearInDB.toString(), which is [2013] here; it will give 2013:
String extractedYear = null;
Matcher n = MY_PATTERN.matcher("FOO[BAR]" + mYearInDB.toString());
while (n.find()) {
    // each match overwrites the previous one: first "BAR", then "2013"
    extractedYear = n.group(1);
}
System.out.println("Extracted output is : " + extractedYear);
Answer by dansalmo
import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String get_match(String s, String p) {
    // returns first match of p in s for first group in regular expression
    Matcher m = Pattern.compile(p).matcher(s);
    return m.find() ? m.group(1) : "";
}

get_match("FOO[BAR]", "\\[(.*?)\\]") // returns "BAR"

public static List<String> get_matches(String s, String p) {
    // returns all matches of p in s for first group in regular expression
    List<String> matches = new ArrayList<String>();
    Matcher m = Pattern.compile(p).matcher(s);
    while (m.find()) {
        matches.add(m.group(1));
    }
    return matches;
}

get_matches("FOO[BAR] FOO[CAT]", "\\[(.*?)\\]") // returns [BAR, CAT]