Java: how to check multiple regex patterns against an input?
Original question: http://stackoverflow.com/questions/42988414/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by SuperCow
(If I'm heading in completely the wrong direction, let me know if there is a better way I should be approaching this.)
I have a Java program that will have multiple patterns that I want to compare against an input. If one of the patterns matches then I want to save that value in a String. I can get it to work with a single pattern but I'd like to be able to check against many.
Right now I have this to check if an input matches one pattern:
Pattern pattern = Pattern.compile("TST\\w{1,}");
Matcher match = pattern.matcher(input);
String ID = match.find() ? match.group() : null;
So, if the input was TST1234 or abcTST1234 then ID = "TST1234"
I want to have multiple patterns like:
Pattern pattern = Pattern.compile("TST\\w{1,}");
Pattern pattern2 = Pattern.compile("TWT\\w{1,}");
...
and then add them to a collection and check each one against the input:
List<Pattern> rxs = new ArrayList<Pattern>();
rxs.add(pattern);
rxs.add(pattern2);
String ID = null;
for (Pattern rx : rxs) {
    if (rx.matcher(requestEnt).matches()) {
        ID = //???
    }
}
I'm not sure how to set ID to what I want. I've tried
ID = rx.matcher(requestEnt).group();
and
ID = rx.matcher(requestEnt).find()?rx.matcher(requestEnt).group():null;
Not really sure how to make this work or where to go from here, though. Any help or suggestions are appreciated. Thanks.
EDIT: Yes, the patterns will change over time, so the pattern list will grow.
I just need to get the string of the match, i.e. if the input is abcTWT123 it will first check against "TST\w{1,}", then move on to "TWT\w{1,}", and since that matches, the ID String will be set to "TWT123".
Answered by sprinter
To collect the matched string in the result you may need to create a group in your regexp if you are matching less than the entire string:
List<Pattern> patterns = new ArrayList<>();
patterns.add(Pattern.compile("(TST\\w+)"));
...
Optional<String> result = Optional.empty();
for (Pattern pattern : patterns) {
    Matcher matcher = pattern.matcher(input);
    // matches() succeeds only if the entire input matches the pattern
    if (matcher.matches()) {
        result = Optional.of(matcher.group(1));
        break;
    }
}
Or, if you are familiar with streams:
Optional<String> result = patterns.stream()
        .map(p -> p.matcher(input))
        .filter(Matcher::matches)
        .map(m -> m.group(1))
        .findFirst();
The alternative is to use find() (as in @Raffaele's answer), which searches for the pattern anywhere in the input; group() with no argument then returns the whole match, so no explicit capturing group is needed.
Another alternative you may want to consider is to put all your matches into a single pattern.
Pattern pattern = Pattern.compile("(TST\\w+|TWT\\w+|...)");
Then you can match and group in a single operation. However, this might make it harder to change the patterns over time.
Group 1 is the first matched group (i.e. the match inside the first set of parentheses). Group 0 is the entire match. So if you want the entire match (I wasn't sure from your question) then you could perhaps use group 0.
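To make the difference between matches() and find(), and between group 1 and group 0, more concrete, here is a minimal sketch. The class name GroupDemo and both helper methods are made up for illustration, and the leading .* in the first pattern is an assumption about how one might let matches() accept a prefix such as "abc".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class GroupDemo {
    // matches() succeeds only when the whole input matches, so the
    // pattern needs a leading .* to skip any text before the ID.
    // group(1) is the text captured by the first set of parentheses.
    static String viaMatches(String input) {
        Matcher m = Pattern.compile(".*(TST\\w+)").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // find() looks for the pattern anywhere in the input.
    // group() with no argument is group 0, i.e. the whole matched text.
    static String viaFind(String input) {
        Matcher m = Pattern.compile("TST\\w+").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(viaMatches("abcTST1234"));
        System.out.println(viaFind("abcTST1234"));
    }
}
```

Both helpers should return the same ID for an input like abcTST1234; the find() version simply avoids the extra .* and the explicit group.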
Answered by Raffaele
Maybe you just need to end the loop when the first pattern matches:
// TST\w{1,}
// TWT\w{1,}
private List<Pattern> patterns;

public String findIdOrNull(String input) {
    for (Pattern p : patterns) {
        Matcher m = p.matcher(input);
        // First match. If the whole string must match use .matches()
        if (m.find()) {
            return m.group(0);
        }
    }
    return null; // Or throw an Exception if this should never happen
}
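One detail worth spelling out is why the loop above keeps the Matcher in a variable. Each call to Pattern.matcher() returns a new Matcher with no match state, so calling group() on a second, fresh Matcher (as in the attempts from the question) throws IllegalStateException. The sketch below is only an illustration; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class MatcherStateDemo {
    static final Pattern TWT = Pattern.compile("TWT\\w+");

    // Mirrors the attempt from the question: find() runs on one Matcher,
    // but group() is called on a second Matcher that never attempted a match.
    static String freshMatcher(String input) {
        try {
            return TWT.matcher(input).find() ? TWT.matcher(input).group() : null;
        } catch (IllegalStateException e) {
            return "IllegalStateException";
        }
    }

    // Calls find() and group() on the same Matcher, so the match state is kept.
    static String sameMatcher(String input) {
        Matcher m = TWT.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(freshMatcher("abcTWT123"));
        System.out.println(sameMatcher("abcTWT123"));
    }
}
```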
Answered by Bohemian
Use an alternation | (a regex OR):
Pattern pattern = Pattern.compile("TST\\w+|TWT\\w+|etc");
Then just check the pattern once.
Note also that {1,} can be replaced with +.
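One behavioural difference may be worth checking before switching from a list of patterns to a single alternation: a loop returns the match of the first pattern in the list that matches anywhere, while an alternation with find() returns the leftmost match in the input, whichever alternative produced it. The sketch below illustrates this with a made-up input containing two IDs; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class OrderDemo {
    static final List<Pattern> PATTERNS = Arrays.asList(
            Pattern.compile("TST\\w+"),
            Pattern.compile("TWT\\w+"));

    static final Pattern ALTERNATION = Pattern.compile("TST\\w+|TWT\\w+");

    // Tries each pattern in list order; the first pattern that matches wins.
    static String viaLoop(String input) {
        for (Pattern p : PATTERNS) {
            Matcher m = p.matcher(input);
            if (m.find()) {
                return m.group();
            }
        }
        return null;
    }

    // Scans the input once; the leftmost match in the input wins.
    static String viaAlternation(String input) {
        Matcher m = ALTERNATION.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(viaLoop("TWT1 TST2"));        // TST2
        System.out.println(viaAlternation("TWT1 TST2")); // TWT1
    }
}
```

For inputs that contain only one ID, as in the question's examples, both approaches give the same result.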
Answered by Stephen P
If your patterns are all going to be simple prefixes like your examples TST and TWT, you can define all of those at once and use regex alternation | so you won't need to loop over the patterns.
An example:
String prefixes = "TWT|TST|WHW";
String regex = "(" + prefixes + ")\\w+";
Pattern pattern = Pattern.compile(regex);
String input = "abcTST123";
Matcher match = pattern.matcher(input);
String ID = match.find() ? match.group() : null;
// given this, ID will come out as "TST123"
Now prefixes could be read in from a Java .properties file or a simple text file, or passed as a parameter to the method that does this.
You could also define the prefixes as a comma-separated list or one-per-line in a file, then process that to turn them into one|two|three|etc before passing it on.
You may be looping over several inputs, in which case you would want to create the regex and pattern variables only once, creating only the Matcher for each separate input.
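As a rough sketch of that idea, the class below turns a comma-separated prefix list into one alternation, compiles it once, and creates only a new Matcher per input. The class name PrefixMatcher, its methods, and the use of Pattern.quote (so that a prefix containing regex metacharacters is treated literally) are assumptions for illustration, not part of the original answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer.
class PrefixMatcher {
    // Compiled once and reused for every input.
    private final Pattern pattern;

    PrefixMatcher(List<String> prefixes) {
        // Quote each prefix so it is matched literally, then join with |.
        String alternation = prefixes.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        this.pattern = Pattern.compile("(?:" + alternation + ")\\w+");
    }

    // Builds a matcher from a string such as "TWT, TST, WHW".
    static PrefixMatcher fromCommaSeparated(String csv) {
        List<String> prefixes = Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
        return new PrefixMatcher(prefixes);
    }

    // Only the Matcher is created per input.
    String findIdOrNull(String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        PrefixMatcher matcher = fromCommaSeparated("TWT, TST, WHW");
        for (String input : Arrays.asList("abcTST123", "abcTWT123", "nothing")) {
            System.out.println(input + " -> " + matcher.findIdOrNull(input));
        }
    }
}
```

Reading the comma-separated string from a .properties or text file is left out here; the constructor only assumes it receives a non-empty list of prefixes.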

