Note: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/672440/
Time: 2020-08-11 17:41:42  Source: igfitidea

Java Pattern to match any sequence of characters except a given list

Tags: java, regex

Asked by Mario

How do I write a Pattern (Java) to match any sequence of characters except a given list of words?

I need to find out whether a given piece of code contains any text surrounded by tags like <span>, other than a given list of words. For example, I want to check whether any words besides "one" and "two" are surrounded by the <span> tag.

"This is the first tag <span>one</span> and this is the third <span>three</span>"

The pattern should match the above string because the word "three" is surrounded by the <span> tag and is not part of the list of given words ("one", "two").

Answered by Sarel Botha

Use this:

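import java.util.regex.Pattern;

// Pattern.matches must match the whole input, hence the surrounding ".*".
if (!Pattern.matches(".*(word1|word2|word3).*", "word1")) {
    System.out.println("We're good.");
}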

You're checking that the pattern does not match the string.
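
As a minimal sketch of that check, the class below wraps it in two helpers. The names `WordListCheck`, `containsNone`, and `containsNoneWithFind` are hypothetical and introduced only for illustration; the word list is the placeholder one from the answer. It also shows that `Matcher.find()` gives the same result for single-line input without the surrounding `.*`.

```java
import java.util.regex.Pattern;

class WordListCheck {
    // Hypothetical helper: true when the input contains none of the listed words.
    static boolean containsNone(String input) {
        // matches() must consume the whole input, so the alternation is wrapped in ".*".
        return !Pattern.matches(".*(word1|word2|word3).*", input);
    }

    // Same check for single-line input, using find() to search anywhere in the input.
    static boolean containsNoneWithFind(String input) {
        return !Pattern.compile("word1|word2|word3").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsNone("word1"));             // false: "word1" is in the list
        System.out.println(containsNone("something else"));    // true: none of the words occur
        System.out.println(containsNoneWithFind("a word2 b")); // false: "word2" occurs
    }
}
```

Note that this tells you whether any listed word occurs in the string; it does not by itself look inside tags.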

Answered by Tomalak

Look-ahead can do this:

\b(?!your|given|list|of|exclusions)\w+\b

Matches

  • a word boundary (start-of-word)
  • not followed by any of "your", "given", "list", "of", "exclusions"
  • followed by one or more word characters
  • followed by a word boundary (end-of-word)

In effect, this matches any word that is not excluded.
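
A minimal sketch of how this look-ahead might be used from Java follows. The class `LookaheadDemo` and the method `findAll` are hypothetical names. One caveat worth checking: as written, the look-ahead also rejects words that merely start with an excluded word (for example "office", because of "of"). The second pattern is an assumed variant, not part of the answer, that adds `\b` inside the look-ahead so only whole words are excluded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // The pattern from the answer; backslashes are doubled for the Java string literal.
    static final Pattern PREFIX_EXCLUDING =
            Pattern.compile("\\b(?!your|given|list|of|exclusions)\\w+\\b");

    // Assumed variant (not from the answer): the inner \b makes the look-ahead
    // reject only whole words, so "office" is not rejected because of "of".
    static final Pattern WHOLE_WORD_EXCLUDING =
            Pattern.compile("\\b(?!(?:your|given|list|of|exclusions)\\b)\\w+\\b");

    // Collects every match of the pattern in the text, in order.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> words = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String text = "your office list";
        System.out.println(findAll(PREFIX_EXCLUDING, text));     // prints []
        System.out.println(findAll(WHOLE_WORD_EXCLUDING, text)); // prints [office]
    }
}
```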

Answered by Lieven Keersmaekers

This should get you started.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.*;

// >(?!one<|two<)(\w+)</
//
// Match the character “>” literally «>»
// Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!one<|two<)»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «one<»
//       Match the characters “one<” literally «one<»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «two<»
//       Match the characters “two<” literally «two<»
// Match the regular expression below and capture its match into backreference number 1 «(\w+)»
//    Match a single character that is a “word character” (letters, digits, etc.) «\w+»
//       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
// Match the characters “</” literally «</»

// Example input taken from the question.
String subjectString = "This is the first tag <span>one</span> and this is the third <span>three</span>";
List<String> matchList = new ArrayList<String>();
try {
    // In a Java string literal the backslash must be doubled: \\w
    Pattern regex = Pattern.compile(">(?!one<|two<)(\\w+)</");
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        // Group 1 holds the tagged word that is not in the list.
        matchList.add(regexMatcher.group(1));
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
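
Building on the same negative look-ahead idea, here is a hedged sketch that generates the pattern from a list of allowed words instead of hard-coding them. The class `SpanWordFinder` and its methods are hypothetical names, and the sketch assumes plain `<span>` tags without attributes that wrap a single word; real HTML would usually call for a parser rather than a regex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpanWordFinder {
    // Builds a pattern equivalent to <span>(?!(?:one|two)</span>)(\w+)</span>
    // for the allowed words "one" and "two".
    static Pattern buildPattern(List<String> allowedWords) {
        List<String> quoted = new ArrayList<>();
        for (String word : allowedWords) {
            quoted.add(Pattern.quote(word)); // treat each word as a literal
        }
        String alternation = String.join("|", quoted);
        return Pattern.compile("<span>(?!(?:" + alternation + ")</span>)(\\w+)</span>");
    }

    // Returns every word wrapped in <span>...</span> that is not in allowedWords.
    static List<String> findUnlistedWords(String html, List<String> allowedWords) {
        List<String> found = new ArrayList<>();
        Matcher m = buildPattern(allowedWords).matcher(html);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "This is the first tag <span>one</span> and this is the third <span>three</span>";
        System.out.println(findUnlistedWords(input, Arrays.asList("one", "two"))); // prints [three]
    }
}
```

A non-empty result means the input contains at least one tagged word outside the list, which is the condition the question asks about. Including the closing tag in the look-ahead keeps a word such as "ones" from being rejected just because it starts with "one".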