Regex 'or' operator avoid repetition

Source: Stack Overflow, http://stackoverflow.com/questions/14740661/ (licensed under CC BY-SA 4.0; the content must be attributed to the original authors)

c# .net regex

Asked by Tono Nam

How can I use the or operator while not allowing repetition? In other words, the regex:

(word1|word2|word3)+

will match word1word2 but will also match word1word1, which I don't want because the word word1 is being repeated. How can I avoid repetition?

In summary, I would like the following subjects to match:

word1word2word3
word1
word2
word3word2

Note that all of them match because there is no repetition. I would like the following subjects to fail:

word1word2word1
word2word2
word3word1word2word2

Edit

Thanks to @Mark I now have:

(?xi)

(?:  
        (?<A>word1|word2)(?!  .*  \k<A> )      # match word1 or word2, but fail if the word just captured appears again later
    |   (?<B>word3|word4)(?!  .*  \k<B> )
)+

because I am interested in seeing whether something was captured in group A or B.

Accepted answer by Mark Byers

You could use negative lookaheads:

^(?:word1(?!.*word1)|word2(?!.*word2)|word3(?!.*word3))+$

See it working online: rubular
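
As a quick check of the lookahead pattern, here is a minimal sketch that runs it against the subjects listed in the question. It uses Python's re module rather than .NET, which is an assumption made for illustration (the pattern only uses syntax that both engines share), and the helper name has_no_repeats is hypothetical.

```python
import re

# Mark Byers's pattern: a word may match only if it does not occur again later.
NO_REPEAT = re.compile(
    r"^(?:word1(?!.*word1)|word2(?!.*word2)|word3(?!.*word3))+$"
)

def has_no_repeats(subject: str) -> bool:
    """Return True when the subject is a run of the words with none used twice."""
    return NO_REPEAT.match(subject) is not None

# Subjects taken from the question.
should_match = ["word1word2word3", "word1", "word2", "word3word2"]
should_fail = ["word1word2word1", "word2word2", "word3word1word2word2"]

for subject in should_match + should_fail:
    print(subject, has_no_repeats(subject))
```

Each alternative carries its own lookahead, so the pattern grows by one branch per word.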

Answer by Qtax

The lookahead solutions will not work in several cases. You can solve this properly, without lookarounds, by using a construct like this:

(?:(?(1)(?!))(word1)|(?(2)(?!))(word2)|(?(3)(?!))(word3))+

This works even if some words are substrings of others, and it will also work if you just want to find the matching substrings of a larger string (and not only match the whole string).

Live demo.

It works by failing an alternative if that alternative has already been matched, which is done by (?(1)(?!)). (?(1)foo) is a conditional, and will match foo if group 1 has previously matched. (?!) always fails.
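
The conditional construct can be tried out with a short sketch. This one assumes Python's re module, which accepts the same (?(1)...) and (?!) syntax, instead of .NET; the helper name uses_each_word_once is hypothetical, and fullmatch is used here only to test whole subjects.

```python
import re

# Qtax's construct: (?(N)(?!)) makes an alternative fail once group N has captured,
# so each word can take part in the repetition at most once.
ONCE_EACH = re.compile(
    r"(?:(?(1)(?!))(word1)|(?(2)(?!))(word2)|(?(3)(?!))(word3))+"
)

def uses_each_word_once(subject: str) -> bool:
    """Return True when the whole subject consists of the words, each at most once."""
    return ONCE_EACH.fullmatch(subject) is not None

for subject in ["word1word2word3", "word3word2", "word1word2word1", "word2word2"]:
    print(subject, uses_each_word_once(subject))
```

The approach relies on a capture group keeping its value across iterations of the outer repetition, so it is worth verifying on the target engine before depending on it.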

Answer by ΩmegaMan

Byers' solution is too hard-coded and gets quite cumbersome as the letters increase. Why not simply have the regex look for a duplicate match?

([^\d]+\d)+(?=.*\1)

If that matches, the match signifies that a repetition has been found. If it does not match, you have a valid set of data.
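
The inverted logic (a match means the data is invalid) can be sketched as follows. This assumes the lookahead is meant to contain the backreference \1, so that it looks for a second occurrence of a captured word; it also assumes Python's re module in place of .NET, and the helper name is_valid is hypothetical.

```python
import re

# Assumed form of the pattern: the lookahead holds the backreference \1, so a
# match means some "non-digits followed by a digit" token occurs again later.
HAS_DUPLICATE = re.compile(r"([^\d]+\d)+(?=.*\1)")

def is_valid(subject: str) -> bool:
    """Valid data is data in which the duplicate-finding pattern does NOT match."""
    return HAS_DUPLICATE.search(subject) is None

for subject in ["word1word2word3", "word3word2", "word1word2word1", "word2word2"]:
    print(subject, is_valid(subject))
```

Note that this only detects repetition; it does not check that the subject is built from a particular set of words.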

Answer by MikeM

You could use a negative look-ahead containing a back reference:

^(?:(word1|word2|word3)(?!.*\1))+$

where \1 refers to the match of the capture group (word1|word2|word3).

Note that this assumes word2 cannot be formed by appending characters to word1, and that word3 cannot be formed by appending characters to word1 or word2.
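
To illustrate both the backreference pattern and the caveat about one word extending another, here is a small sketch. It assumes Python's re module rather than .NET; the helper build and the words "ab" and "abc" are hypothetical and chosen only to trigger the caveat.

```python
import re

def build(words):
    # Hypothetical helper: builds ^(?:(w1|w2|...)(?!.*\1))+$ for the given words.
    alternatives = "|".join(re.escape(w) for w in words)
    return re.compile(r"^(?:(" + alternatives + r")(?!.*\1))+$")

# The words from the question behave as intended.
question_words = build(["word1", "word2", "word3"])
print(bool(question_words.match("word3word2")))  # no repetition
print(bool(question_words.match("word2word2")))  # word2 repeated

# Hypothetical words where "abc" is "ab" with a character appended.
prefix_words = build(["abc", "ab"])
# "ab" followed by "abc" repeats no word, yet it is rejected: after capturing
# "ab", the lookahead finds "ab" again inside the later "abc".
print(bool(prefix_words.match("ababc")))
```

In the last case the captured text reappears as the start of a different word, which is exactly the situation the note above rules out.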