Find the last match with Java regex matcher
Original source: http://stackoverflow.com/questions/6417435/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by kireol
I'm trying to get the last result of a match without having to cycle through .find()
Here's my code:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num ([0-9]+)");
Matcher m = p.matcher(in);
if (m.find()) {
    in = m.group(1);
}
That will give me the first result. How do I find the LAST match without cycling through a potentially huge list?
Accepted answer by Bart Kiers
You could prepend .* to your regex, which will greedily consume all characters up to the last match:
import java.util.regex.*;
class Test {
    public static void main (String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        Pattern p = Pattern.compile(".*num ([0-9]+)");
        Matcher m = p.matcher(in);
        if(m.find()) {
            System.out.println(m.group(1));
        }
    }
}
Prints:
2134
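One side effect of the .* prefix is worth noting: the overall match now begins at index 0, so m.start() no longer tells you where the last occurrence sits. A minimal sketch, assuming you want the position of the number, reads it from the capturing group instead. The class name LastMatchPosition and the helper lastNumberStart are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastMatchPosition {
    // Hypothetical helper: index where the number of the last "num <digits>" starts, or -1.
    static int lastNumberStart(String in) {
        Matcher m = Pattern.compile(".*num ([0-9]+)").matcher(in);
        // m.start() would be 0 here because the greedy .* is part of the match;
        // start(1) reports where capturing group 1 begins.
        return m.find() ? m.start(1) : -1;
    }

    public static void main(String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        System.out.println(lastNumberStart(in));
    }
}
```

Matcher.end(1) works the same way if you need the end of the number.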
You could also reverse the string as well as change your regex to match the reverse instead:
import java.util.regex.*;
class Test {
    public static void main (String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        Pattern p = Pattern.compile("([0-9]+) mun");
        Matcher m = p.matcher(new StringBuilder(in).reverse());
        if(m.find()) {
            System.out.println(new StringBuilder(m.group(1)).reverse());
        }
    }
}
But neither solution is better than just looping through all matches using while (m.find()), IMO.
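For comparison, the plain loop can be wrapped in a small helper. This is only a sketch: the class LastMatchLoop and the method lastMatch are hypothetical names, and it assumes you want the whole match (groups and positions), which is why it keeps a MatchResult snapshot rather than a single string.

```java
import java.util.Optional;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastMatchLoop {
    // Hypothetical helper: scans all matches and keeps a snapshot of the last one.
    static Optional<MatchResult> lastMatch(Pattern p, CharSequence input) {
        Matcher m = p.matcher(input);
        MatchResult last = null;
        while (m.find()) {
            last = m.toMatchResult(); // snapshot that is unaffected by later find() calls
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        lastMatch(Pattern.compile("num ([0-9]+)"), in)
                .ifPresent(r -> System.out.println(r.group(1)));
    }
}
```

Returning an Optional makes the "no match at all" case explicit instead of relying on an empty string or null.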
Answer by Mark Peters
Java does not provide such a mechanism. The only thing I can suggest would be a binary search for the last index.
It would be something like this:
N = haystack.length();
if ( matcher.find(N/2) ) {
    recursively try right side
} else {
    recursively try left side
}
Edit
And here's code that does it, since I found it to be an interesting problem:
import org.junit.Test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import static org.junit.Assert.assertEquals;
public class RecursiveFind {
    @Test
    public void testFindLastIndexOf() {
        assertEquals(0, findLastIndexOf("abcdddddd", "abc"));
        assertEquals(1, findLastIndexOf("dabcdddddd", "abc"));
        assertEquals(4, findLastIndexOf("aaaaabc", "abc"));
        assertEquals(4, findLastIndexOf("aaaaabc", "a+b"));
        assertEquals(6, findLastIndexOf("aabcaaabc", "a+b"));
        assertEquals(2, findLastIndexOf("abcde", "c"));
        assertEquals(2, findLastIndexOf("abcdef", "c"));
        assertEquals(2, findLastIndexOf("abcd", "c"));
    }
    public static int findLastIndexOf(String haystack, String needle) {
        return findLastIndexOf(0, haystack.length(), Pattern.compile(needle).matcher(haystack));
    }
    private static int findLastIndexOf(int start, int end, Matcher m) {
        if ( start > end ) {
            return -1;
        }
        int pivot = ((end-start) / 2) + start;
        if ( m.find(pivot) ) {
            //recurse on right side
            return findLastIndexOfRecurse(end, m);
        } else if (m.find(start)) {
            //recurse on left side
            return findLastIndexOfRecurse(pivot, m);
        } else {
            //not found at all between start and end
            return -1;
        }
    }
    private static int findLastIndexOfRecurse(int end, Matcher m) {
        int foundIndex = m.start();
        int recurseIndex = findLastIndexOf(foundIndex + 1, end, m);
        if ( recurseIndex == -1 ) {
            return foundIndex;
        } else {
            return recurseIndex;
        }
    }
}
I haven't found a breaking test case yet.
Answer by Garrett Hall
Why not keep it simple?
in.replaceAll(".*[^\\d](\\d+).*", "$1")
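A minimal sketch of the replaceAll approach, assuming the replacement string is "$1" (a reference to capturing group 1) and that the backslashes are doubled as Java string literals require. The class ReplaceAllLastNumber and the method lastNumber are hypothetical names. It also shows one thing to watch for: when the pattern does not match, replaceAll hands the input back unchanged rather than signalling failure.

```java
class ReplaceAllLastNumber {
    // Hypothetical helper around String.replaceAll.
    static String lastNumber(String in) {
        // The pattern consumes the whole line; "$1" keeps only capturing group 1.
        return in.replaceAll(".*[^\\d](\\d+).*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(lastNumber("num 123 num 1 num 698 num 19238 num 2134"));
        // No digits: nothing matches, so the input comes back unchanged.
        System.out.println(lastNumber("no digits here"));
    }
}
```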
Answer by krico
Java patterns are greedy by default, so the following should do it.
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile( ".*num ([0-9]+).*$" );
Matcher m = p.matcher( in );
if ( m.matches() )
{
    System.out.println( m.group( 1 ));
}
Answer by yingted
Regular expressions are greedy:
Matcher m = Pattern.compile(".*num ([0-9]+)", Pattern.DOTALL).matcher("num 123 num 1 num 698 num 19238 num 2134");
will give you a Matcher that finds the last match on its first find() call, and you can apply it to most regexes by prepending ".*". Of course, if you can't use DOTALL, you might want to use (?:\d|\D) or something similar as your wildcard.
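The reason DOTALL matters only shows up with multi-line input, which the sample string does not have. The sketch below uses a made-up three-line input to illustrate the difference; the class DotAllLastMatch and the helper firstFind are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotAllLastMatch {
    // Hypothetical helper: group 1 of the first find() for ".*num ([0-9]+)" under the given flags.
    static String firstFind(String in, int flags) {
        Matcher m = Pattern.compile(".*num ([0-9]+)", flags).matcher(in);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String in = "num 1\nnum 2\nnum 3";
        // Without DOTALL, '.' stops at the line break, so the match stays on the first line.
        System.out.println(firstFind(in, 0));
        // With DOTALL, '.*' can run across line breaks up to the last occurrence.
        System.out.println(firstFind(in, Pattern.DOTALL));
    }
}
```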
Answer by Bradley M Handy
This seems like an equally plausible approach.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class LastMatchTest {
    public static void main(String[] args) throws Exception {
        String target = "num 123 num 1 num 698 num 19238 num 2134";
        Pattern regex = Pattern.compile("(?:.*?num.*?(\\d+))+");
        Matcher regexMatcher = regex.matcher(target);
        if (regexMatcher.find()) {
            System.out.println(regexMatcher.group(1));
        }
    }
}
The .*? is a reluctant match, so it won't gobble up everything. The ?: makes the outer group non-capturing, so the inner group is group 1. Repeating the outer group greedily causes it to match across the entire string until all matches are exhausted, leaving group 1 with the value of your last match.
Answer by Norman Se?ler
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num ([0-9]+)");
Matcher m = p.matcher(in);
String result = "";
// Each iteration overwrites result, so the last match wins.
while (m.find())
{
    result = m.group(1);
}
Answer by araut
This also seems to work for getting the last match; I'm not sure why it wasn't mentioned earlier:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num ([0-9]+)");
Matcher m = p.matcher(in);
if (m.find()) {
    // groupCount() is the number of capturing groups in the pattern (1 here),
    // so this reads the last group of the match that find() just located.
    in = m.group(m.groupCount());
}
Answer by dhalsim2
Use negative lookahead:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("num (\\d+)(?!.*num \\d+)");
Matcher m = p.matcher(in);
if (m.find()) {
    in = m.group(1);
}
The regular expression reads as "num followed by one space and at least one digit without any (num followed by one space and at least one digit) at any point after it".
You can get even fancier by combining it with positive lookbehind:
String in = "num 123 num 1 num 698 num 19238 num 2134";
Pattern p = Pattern.compile("(?<=num )\\d+(?!.*num \\d+)");
Matcher m = p.matcher(in);
if (m.find()) {
    in = m.group();
}
That one reads as "at least one digit preceded by (num and one space) and not followed by (num followed by one space and at least one digit) at any point after it".
That way you don't have to mess with grouping and worry about the potential IndexOutOfBoundsException thrown from Matcher.group(int).
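To make the point about IndexOutOfBoundsException concrete, here is a small sketch using the lookbehind pattern, which has no capturing groups. The class GroupIndexDemo and the helper groupOneThrows are hypothetical names added for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupIndexDemo {
    // Hypothetical helper: true if asking for group 1 throws IndexOutOfBoundsException.
    static boolean groupOneThrows(Matcher m) {
        try {
            m.group(1);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String in = "num 123 num 1 num 698 num 19238 num 2134";
        Matcher m = Pattern.compile("(?<=num )\\d+(?!.*num \\d+)").matcher(in);
        if (m.find()) {
            System.out.println(m.group());          // whole match, no group index needed
            System.out.println(groupOneThrows(m));  // this pattern defines no group 1
        }
    }
}
```

Because the lookbehind and lookahead are zero-width, group() contains only the digits.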
Answer by necromancer
Compared to the currently accepted answer, this one does not blindly discard elements of the list using the ".*" prefix. Instead, it uses "(element delimiter)*(element)" to pick out the last element using .group(2). See the function magic_last in the code below.
To demonstrate the benefit of this approach, I have also included a function to pick out the n-th element, which is robust enough to accept a list that has fewer than n elements. See the function magic in the code below.
Filtering out the "num " text and only getting the number is left as an exercise for the reader (just add an extra group around the digits pattern: ([0-9]+) and pick out group 4 instead of group 2).
package com.example;
import static java.lang.System.out;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Foo {
    public static void main (String [] args) {
        String element = "num [0-9]+";
        String delimiter = ", ";
        String input;
        input = "here is a num bro: num 001; hope you like it";
        magic_last(input, element, delimiter);
        magic(1, input, element, delimiter);
        magic(2, input, element, delimiter);
        magic(3, input, element, delimiter);
        input = "here are some nums bro: num 001, num 002, num 003, num 004, num 005, num 006; hope you like them";
        magic_last(input, element, delimiter);
        magic(1, input, element, delimiter);
        magic(2, input, element, delimiter);
        magic(3, input, element, delimiter);
        magic(4, input, element, delimiter);
        magic(5, input, element, delimiter);
        magic(6, input, element, delimiter);
        magic(7, input, element, delimiter);
        magic(8, input, element, delimiter);
    }
    public static void magic_last (String input, String element, String delimiter) {
        String regexp = "(" + element + delimiter + ")*(" + element + ")";
        Pattern pattern = Pattern.compile(regexp);
        Matcher matcher = pattern.matcher(input);
        if (matcher.find()) {
            out.println(matcher.group(2));
        }
    }
    public static void magic (int n, String input, String element, String delimiter) {
        String regexp = "(" + element + delimiter + "){0," + (n - 1) + "}(" + element + ")(" + delimiter + element + ")*";
        Pattern pattern = Pattern.compile(regexp);
        Matcher matcher = pattern.matcher(input);
        if (matcher.find()) {
            out.println(matcher.group(2));
        }
    }
}
Output:
num 001
num 001
num 001
num 001
num 006
num 001
num 002
num 003
num 004
num 005
num 006
num 006
num 006
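The exercise mentioned in this answer (an extra group around the digits, then reading group 4) could be sketched roughly as follows. The class LastElementDigits and the method lastDigits are hypothetical names, and the element and delimiter values are the ones from the answer's own example. Like magic_last, it reports the last element of the first delimited run that find() encounters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastElementDigits {
    // Hypothetical helper: digits of the last element in a delimited run, or null if none.
    static String lastDigits(String input) {
        String element = "num ([0-9]+)"; // extra capturing group around the digits
        String delimiter = ", ";
        // Groups: 1 = (element delimiter), 2 = digits inside 1, 3 = (element), 4 = digits inside 3.
        String regexp = "(" + element + delimiter + ")*(" + element + ")";
        Matcher matcher = Pattern.compile(regexp).matcher(input);
        return matcher.find() ? matcher.group(4) : null;
    }

    public static void main(String[] args) {
        System.out.println(lastDigits("here is a num bro: num 001; hope you like it"));
        System.out.println(lastDigits(
                "here are some nums bro: num 001, num 002, num 003, num 004, num 005, num 006; hope you like them"));
    }
}
```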