Parse string using Java Regex Pattern?
Original source: http://stackoverflow.com/questions/45175606/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Bharath Reddy
I have a Java string in the following format:
String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
Using the Matcher and Pattern classes from the java.util.regex package, I have to produce an output string in the following format:
Output: [NYK:1100][CLT:2300][KTY:3540]
Can you suggest a regex pattern that would help me get the above output format?
Answered by YCF_L
You can use the regex \[name:([A-Z]+)\]\[distance:(\d+)\] with Pattern like this:
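// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
// Each regex backslash must be doubled inside a Java string literal.
String regex = "\\[name:([A-Z]+)\\]\\[distance:(\\d+)\\]";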
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(s);
StringBuilder result = new StringBuilder();
while (matcher.find()) {
result.append("[");
result.append(matcher.group(1));
result.append(":");
result.append(matcher.group(2));
result.append("]");
}
System.out.println(result.toString());
Output:
[NYK:1100][CLT:2300][KTY:3540]
The regex \[name:([A-Z]+)\]\[distance:(\d+)\] captures two groups: the first, in \[name:([A-Z]+)\], captures the uppercase letters after name:, and the second, in \[distance:(\d+)\], captures the digits after distance:.
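To make the two capture groups concrete, here is a minimal sketch that prints the whole match (group 0) next to groups 1 and 2 for every match. The class name GroupDemo and the helper describeMatches are hypothetical names chosen for illustration; they are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Hypothetical helper: one line per match, "whole match -> group 1, group 2".
    static List<String> describeMatches(String input) {
        Pattern pattern = Pattern.compile("\\[name:([A-Z]+)\\]\\[distance:(\\d+)\\]");
        Matcher matcher = pattern.matcher(input);
        List<String> lines = new ArrayList<>();
        while (matcher.find()) {
            // group(0) is the full matched text; group(1) and group(2) are the captures.
            lines.add(matcher.group(0) + " -> " + matcher.group(1) + ", " + matcher.group(2));
        }
        return lines;
    }

    public static void main(String[] args) {
        String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
        describeMatches(s).forEach(System.out::println);
    }
}
```

Text outside the bracketed pairs, such as City: and Price:, is simply skipped by find().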
Another solution, from @tradeJmark: you can use named groups in the regex:
String regex = "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]";
This lets you get the result of each group by its name instead of its index, like this:
while (matcher.find()) {
result.append("[");
result.append(matcher.group("name"));
//----------------------------^^
result.append(":");
result.append(matcher.group("distance"));
//------------------------------^^
result.append("]");
}
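For reference, the named-group loop can be assembled into a self-contained program along these lines. This is only a sketch: the class name NamedGroupDemo and the method reformat are made up for the example, and it assumes Java 7 or later, where named capturing groups are available.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupDemo {
    // Hypothetical helper: rebuilds the input as [name:distance] pairs using named groups.
    static String reformat(String input) {
        Pattern pattern = Pattern.compile(
                "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]");
        Matcher matcher = pattern.matcher(input);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            result.append('[')
                  .append(matcher.group("name"))      // looked up by name, not by index
                  .append(':')
                  .append(matcher.group("distance"))
                  .append(']');
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
        System.out.println(reformat(s)); // [NYK:1100][CLT:2300][KTY:3540]
    }
}
```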
Answered by Wiktor Stribiżew
If the format of the string is fixed, and you always have just 3 [...] groups inside to deal with, you may define a block that matches [name:...][distance:...] and captures the 2 parts into separate groups, and then use quite simple code with .replaceAll:
String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
String matchingBlock = "\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]";
String res = s.replaceAll(String.format(".*%1$s%1$s%1$s.*", matchingBlock),
        "[$1:$2][$3:$4][$5:$6]");
System.out.println(res); // [NYK:1100][CLT:2300][KTY:3540]
The block pattern matches:
- \\s* - 0+ whitespaces
- \\[name: - a literal [name: substring
- ([A-Z]+) - Group n capturing 1 or more uppercase ASCII chars (\\w+ can also be used)
- ]\\[distance: - a literal ][distance: substring
- (\\d+) - Group m capturing 1 or more digits
- ] - a ] symbol.
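The breakdown of the block pattern can be checked in isolation with a small sketch like the one below. The class BlockPatternDemo and the helper parseBlock are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlockPatternDemo {
    // Same block as in the answer; the closing ] needs no escape outside a character class.
    static final Pattern BLOCK = Pattern.compile("\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]");

    // Hypothetical helper: returns "name:distance" if the whole input is exactly one block, else null.
    static String parseBlock(String input) {
        Matcher m = BLOCK.matcher(input);
        return m.matches() ? m.group(1) + ":" + m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(parseBlock(" [name:NYK][distance:1100]")); // leading whitespace is allowed
        System.out.println(parseBlock("[name:nyk][distance:1100]"));  // lowercase name is rejected
    }
}
```

With \\w+ in place of [A-Z]+, the lowercase name in the second call would be accepted as well.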
In the .*%1$s%1$s%1$s.* pattern, the groups are numbered 1 to 6 (referred to with the $1 - $6 backreferences in the replacement pattern), and the leading and trailing .* remove the start and end of the string (add (?s) at the start of the pattern if the string can contain line breaks).
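As a rough illustration of the group numbering and of the (?s) flag, the sketch below compiles the expanded pattern once, checks that it has six groups, and applies it to an input containing line breaks. The class FixedFormatDemo, its fields, and its methods are hypothetical names, and the multi-line input is an invented variation of the question's string.

```java
import java.util.regex.Pattern;

public class FixedFormatDemo {
    static final String BLOCK = "\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]";

    // (?s) lets . match line terminators, so the leading and trailing .* can span lines.
    static final Pattern FULL =
            Pattern.compile(String.format("(?s).*%1$s%1$s%1$s.*", BLOCK));

    // Three repetitions of a two-group block give groups 1 to 6.
    static int groupCount() {
        return FULL.matcher("").groupCount();
    }

    // Replaces the whole input with the six captured parts.
    static String reformat(String input) {
        return FULL.matcher(input).replaceAll("[$1:$2][$3:$4][$5:$6]");
    }

    public static void main(String[] args) {
        String multiLine = "City:\n[name:NYK][distance:1100]\n[name:CLT][distance:2300] [name:KTY][distance:3540]\nPrice:\n";
        System.out.println(groupCount());
        System.out.println(reformat(multiLine));
    }
}
```

This approach only fits input with exactly three blocks; the find() loop from the first answer is the more flexible choice when the number of blocks can vary.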