Original source: http://stackoverflow.com/questions/42988414/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Java how to check multiple regex patterns against an input?
Asked by SuperCow
(If I'm heading in completely the wrong direction, let me know if there is a better way to approach this.)
I have a Java program that will have multiple patterns that I want to compare against an input. If one of the patterns matches then I want to save that value in a String. I can get it to work with a single pattern but I'd like to be able to check against many.
Right now I have this to check if an input matches one pattern:
Pattern pattern = Pattern.compile("TST\\w{1,}");
Matcher match = pattern.matcher(input);
String ID = match.find() ? match.group() : null;
So, if the input was TST1234 or abcTST1234 then ID = "TST1234"
I want to have multiple patterns like:
Pattern pattern = Pattern.compile("TST\\w{1,}");
Pattern pattern2 = Pattern.compile("TWT\\w{1,}");
...
and then add them to a collection and check each one against the input:
List<Pattern> rxs = new ArrayList<Pattern>();
rxs.add(pattern);
rxs.add(pattern2);
String ID = null;
for (Pattern rx : rxs) {
    if (rx.matcher(requestEnt).matches()) {
        ID = //???
    }
}
I'm not sure how to set ID to what I want. I've tried
ID = rx.matcher(requestEnt).group();
and
ID = rx.matcher(requestEnt).find()?rx.matcher(requestEnt).group():null;
Not really sure how to make this work or where to go from here though. Any help or suggestions are appreciated. Thanks.
EDIT: Yes, the patterns will change over time, so the pattern list will grow.
I just need to get the string of the match, i.e. if the input is abcTWT123 it will first check against "TST\w{1,}", then move on to "TWT\w{1,}", and since that matches, the ID String will be set to "TWT123".
Answered by sprinter
To collect the matched string in the result you may need to create a group in your regexp if you are matching less than the entire string:
List<Pattern> patterns = new ArrayList<>();
patterns.add(Pattern.compile("(TST\\w+)"));
...
Optional<String> result = Optional.empty();
for (Pattern pattern : patterns) {
    Matcher matcher = pattern.matcher(input);
    // matches() succeeds only if the whole input matches the pattern
    if (matcher.matches()) {
        result = Optional.of(matcher.group(1));
        break;
    }
}
Or, if you are familiar with streams:
Optional<String> result = patterns.stream()
    .map(p -> p.matcher(input))
    .filter(Matcher::matches)
    .map(m -> m.group(1))
    .findFirst();
The alternative is to use find (as in @Raffaele's answer), which looks for the pattern anywhere in the input; the matched text is then available as group 0 without adding explicit parentheses.
Another alternative you may want to consider is to put all your matches into a single pattern.
Pattern pattern = Pattern.compile("(TST\\w+|TWT\\w+|...)");
Then you can match and group in a single operation. However, this might make it harder to change the patterns over time.
Group 1 is the first matched group (i.e. the match inside the first set of parentheses). Group 0 is the entire match. So if you want the entire match (I wasn't sure from your question) then you could perhaps use group 0.
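To make the difference between group 0, group 1, and matches() versus find() concrete, here is a minimal sketch. The class name GroupDemo, the helper firstMatch, and the sample pattern "abc(TST\\w+)" are made up for illustration and are not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original question.
class GroupDemo {
    // Returns the text of the requested group from the first match found
    // anywhere in the input, or null if the pattern does not occur.
    static String firstMatch(Pattern pattern, String input, int group) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("abc(TST\\w+)");
        String input = "abcTST1234";
        System.out.println(firstMatch(p, input, 0)); // group 0, the whole match: abcTST1234
        System.out.println(firstMatch(p, input, 1)); // group 1, inside the parentheses: TST1234
        // matches() only succeeds when the entire input matches the pattern
        System.out.println(Pattern.compile("TST\\w+").matcher(input).matches()); // false
    }
}
```

So with find() and no parentheses, group() (the same as group(0)) already gives the matched substring.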
Answered by Raffaele
Maybe you just need to end the loop when the first pattern matches:
// TST\w{1,}
// TWT\w{1,}
private List<Pattern> patterns;

public String findIdOrNull(String input) {
    for (Pattern p : patterns) {
        Matcher m = p.matcher(input);
        // First match. If the whole string must match use .matches()
        if (m.find()) {
            return m.group(0);
        }
    }
    return null; // Or throw an Exception if this should never happen
}
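For completeness, a self-contained sketch of how the method above might be wired up and called. The wrapper class IdFinder and its constructor are assumptions added for illustration; only the loop itself comes from the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; only findIdOrNull comes from the answer.
class IdFinder {
    private final List<Pattern> patterns = new ArrayList<>();

    IdFinder(List<String> regexes) {
        for (String regex : regexes) {
            patterns.add(Pattern.compile(regex)); // compile each regex once
        }
    }

    String findIdOrNull(String input) {
        for (Pattern p : patterns) {
            Matcher m = p.matcher(input);
            if (m.find()) {
                return m.group(0); // text matched by the first pattern that hits
            }
        }
        return null;
    }

    public static void main(String[] args) {
        IdFinder finder = new IdFinder(Arrays.asList("TST\\w{1,}", "TWT\\w{1,}"));
        System.out.println(finder.findIdOrNull("abcTWT123")); // TWT123
        System.out.println(finder.findIdOrNull("TST1234"));   // TST1234
        System.out.println(finder.findIdOrNull("nothing"));   // null
    }
}
```

Patterns are tried in list order, so when the list grows, adding a pattern is one more entry in the list passed to the constructor.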
Answered by Bohemian
Use an alternation | (a regex OR):
Pattern pattern = Pattern.compile("TST\\w+|TWT\\w+|etc");
Then just check the pattern once.
Note also that {1,} can be replaced with +.
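A minimal runnable sketch of the alternation approach, assuming just the two prefixes from the question. The class name AlternationDemo and the method findId are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original question.
class AlternationDemo {
    // One pattern covering both alternatives; + is used in place of {1,}
    private static final Pattern ID = Pattern.compile("TST\\w+|TWT\\w+");

    // Returns the first substring matching either alternative, or null.
    static String findId(String input) {
        Matcher m = ID.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findId("abcTWT123")); // TWT123
        System.out.println(findId("TST1234"));   // TST1234
        System.out.println(findId("xyz"));       // null
    }
}
```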
Answered by Stephen P
If your patterns are all going to be simple prefixes like your examples TST and TWT, you can define all of those at once and use regex alternation |, so you won't need to loop over the patterns.
An example:
String prefixes = "TWT|TST|WHW";
String regex = "(" + prefixes + ")\\w+";
Pattern pattern = Pattern.compile(regex);
String input = "abcTST123";
Matcher match = pattern.matcher(input);
String ID = match.find() ? match.group() : null;
// given this, ID will come out as "TST123"
Now prefixes could be read in from a Java .properties file or a simple text file, or passed as a parameter to the method that does this.
You could also define the prefixes as a comma-separated list, or one per line in a file, then process that to turn them into one|two|three|etc before passing it on.
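As a rough sketch of that processing step, assuming the prefixes arrive as a single comma-separated string: the class PrefixRegex and the method fromCommaSeparated are hypothetical names, and quoting each prefix with Pattern.quote is an extra precaution I've added in case a prefix ever contains a regex metacharacter.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not from the original answer.
class PrefixRegex {
    // Turns "TWT, TST,WHW" into a pattern equivalent to (?:TWT|TST|WHW)\w+
    static Pattern fromCommaSeparated(String prefixList) {
        String alternation = Arrays.stream(prefixList.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(Pattern::quote) // treat each prefix as literal text
                .collect(Collectors.joining("|"));
        if (alternation.isEmpty()) {
            throw new IllegalArgumentException("no prefixes given");
        }
        return Pattern.compile("(?:" + alternation + ")\\w+");
    }

    public static void main(String[] args) {
        Pattern pattern = fromCommaSeparated("TWT, TST,WHW");
        Matcher match = pattern.matcher("abcTST123");
        System.out.println(match.find() ? match.group() : null); // TST123
    }
}
```

The same method would work for a one-per-line file if you join the lines with commas first, or split on line breaks instead.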
You may be looping over several inputs, and then you would want to create the regex and pattern variables only once, creating only the Matcher for each separate input.
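Something along these lines, as a sketch only: the class BatchMatcher and the method findIds are made-up names, and the prefix list is the one from the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original answer.
class BatchMatcher {
    // Compiled once and shared by every call
    private static final Pattern ID = Pattern.compile("(?:TWT|TST|WHW)\\w+");

    // Returns one entry per input: the matched ID, or null when nothing matched.
    static List<String> findIds(List<String> inputs) {
        List<String> ids = new ArrayList<>();
        for (String input : inputs) {
            Matcher m = ID.matcher(input); // new Matcher per input, same Pattern
            ids.add(m.find() ? m.group() : null);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(findIds(Arrays.asList("abcTST123", "xyz", "WHW9")));
        // [TST123, null, WHW9]
    }
}
```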