Source: Stack Overflow, http://stackoverflow.com/questions/672440/
This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Java Pattern to match any sequence of characters except a given list
Asked by Mario
How do I write a Pattern (Java) to match any sequence of characters except a given list of words?
I need to find out whether a given piece of code has any text surrounded by tags like <span>, other than a given list of words. For example, I want to check whether any words besides "one" and "two" are surrounded by the <span> tag.
"This is the first tag <span>one</span> and this is the third <span>three</span>"
The pattern should match the above string because the word "three" is surrounded by the tag and is not part of the list of given words ("one", "two").
Answered by Sarel Botha
Use this:
if (!Pattern.matches(".*(word1|word2|word3).*", "word1")) {
    System.out.println("We're good.");
}
You're checking that the pattern does not match the string.
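Two details of the snippet above are worth spelling out. Pattern.matches requires the whole input to match, which is why the alternation is wrapped in .* on both sides, and by default . does not match line terminators, so that form can fail on multi-line input. The following is a minimal sketch of the same negated check using Matcher.find() instead; the class name ExclusionCheck and the method containsListedWord are hypothetical names chosen for illustration. Note that this only reports whether a listed word occurs at all, not whether some other word is present.

```java
import java.util.regex.Pattern;

public class ExclusionCheck {
    // Hypothetical helper: true if any of the listed words occurs anywhere in the text.
    static boolean containsListedWord(String text) {
        // find() searches for a match anywhere, so no leading or trailing ".*" is needed
        return Pattern.compile("word1|word2|word3").matcher(text).find();
    }

    public static void main(String[] args) {
        if (!containsListedWord("some other text")) {
            System.out.println("We're good.");
        }
    }
}
```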
Answered by Tomalak
Look-ahead can do this:
\b(?!your|given|list|of|exclusions)\w+\b
Matches:
- a word boundary (start-of-word)
- not followed by any of "your", "given", "list", "of", "exclusions"
- followed by one or more word characters
- followed by a word boundary (end-of-word)
In effect, this matches any word that is not excluded.
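As a rough illustration of the look-ahead approach in Java, the sketch below collects every word that is not on the exclusion list. One assumption differs from the pattern as posted: a \b is added inside the look-ahead, because without it a word that merely starts with an excluded term (for example "often", which starts with "of") would be skipped as well. The class name LookaheadDemo and the method allowedWords are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {
    // The inner \b (an addition to the posted pattern) restricts the exclusion to whole words.
    // Backslashes are doubled because this is a Java string literal.
    private static final Pattern NOT_EXCLUDED =
            Pattern.compile("\\b(?!(?:your|given|list|of|exclusions)\\b)\\w+\\b");

    // Hypothetical helper: returns all words in the text that are not excluded.
    static List<String> allowedWords(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = NOT_EXCLUDED.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [often, words]
        System.out.println(allowedWords("your list of often given words"));
    }
}
```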
Answered by Lieven Keersmaekers
This should get you started.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.*;

// Regex: >(?!one<|two<)(\w+)</
//
// >               match the character ">" literally
// (?!one<|two<)   negative lookahead: fail if "one<" or "two<" follows at this position
// (\w+)           capture one or more word characters (greedy) into group 1
// </              match the characters "</" literally

// Sample input taken from the question
String subjectString = "This is the first tag <span>one</span> and this is the third <span>three</span>";
List<String> matchList = new ArrayList<String>();
try {
    // Backslashes must be doubled inside a Java string literal
    Pattern regex = Pattern.compile(">(?!one<|two<)(\\w+)</");
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        // group(1) is the tag content that is not on the exclusion list
        matchList.add(regexMatcher.group(1));
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
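To answer the original yes/no question for an arbitrary word list, the same lookahead idea can be wrapped in a small helper. This is only a sketch under stated assumptions: the tag is a plain <span> with no attributes, each span holds a single run of word characters, and regular expressions are acceptable for this simple markup (a real HTML parser would be more robust). The class name SpanChecker and the method hasUnlistedSpan are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class SpanChecker {
    // Hypothetical helper: true if some <span>...</span> holds a word outside the allowed list.
    static boolean hasUnlistedSpan(String html, List<String> allowed) {
        // Quote each word so any regex metacharacters in it are treated literally
        String alternation = allowed.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // The lookahead rejects spans whose entire content is one of the allowed words
        Pattern p = Pattern.compile(
                "<span>(?!(?:" + alternation + ")</span>)\\w+</span>");
        return p.matcher(html).find();
    }

    public static void main(String[] args) {
        String sample = "This is the first tag <span>one</span> "
                + "and this is the third <span>three</span>";
        // prints true, because "three" is not in the list
        System.out.println(hasUnlistedSpan(sample, Arrays.asList("one", "two")));
    }
}
```

Including the closing tag inside the lookahead means a word such as "ones" is still reported, since only exact matches against the list are excluded.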