C# Regex 'or' operator: avoiding repetition
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/14740661/
Asked by Tono Nam
How can I use the or operator without allowing repetition? In other words, the regex:
(word1|word2|word3)+
will match word1word2, but it will also match word1word1, which I don't want because word1 is repeated. How can I avoid repetition?
In summary, I would like the following subjects to match:
word1word2word3
word1
word2
word3word2
Note that all of them match because there is no repetition. I would like the following subjects to fail:
word1word2word1
word2word2
word3word1word2word2
Edit
Thanks to @Mark I now have:
(?xi)
(?:
(?<A>word1|word2)(?! .* \k<A> ) # match for word1 or word2 but make sure that if you capture it it does not follow what it was just captured
| (?<B>word3|word4)(?! .* \k<B> )
)+
because I am interested in seeing if something was captured in group A or B.
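The pattern above uses .NET named-group syntax. As a rough check of the idea, here is a minimal sketch in Python. Using Python's re module is an assumption made for illustration (it writes the same constructs as (?P<A>...) and (?P=A)), and the helper name captured_groups is hypothetical.

```python
import re

# Python spelling of the pattern above: (?P<A>...) and (?P=A) stand in for
# .NET's (?<A>...) and \k<A>. Flag x ignores pattern whitespace, i ignores case.
PATTERN = re.compile(r"""(?xi)
(?:
    (?P<A>word1|word2)(?! .* (?P=A) )   # text captured in A must not occur later
  | (?P<B>word3|word4)(?! .* (?P=B) )   # same rule for group B
)+
""")


def captured_groups(subject):
    """Return (A, B) for a full match, or None if the subject is rejected."""
    m = PATTERN.fullmatch(subject)
    if m is None:
        return None
    # A group that never took part in the match is reported as None.
    return m.group("A"), m.group("B")


print(captured_groups("word1word3"))
print(captured_groups("word3"))
print(captured_groups("word1word1"))
```

Checking each returned value against None tells you whether group A or group B took part in the match.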
Accepted answer by Mark Byers
You could use negative lookaheads:
^(?:word1(?!.*word1)|word2(?!.*word2)|word3(?!.*word3))+$
See it working online: rubular
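To try the lookahead pattern against the subjects listed in the question, a small script along these lines could be used. It is only a sketch: it runs the pattern with Python's re module rather than .NET (an assumption, though the pattern uses only constructs common to both), and the helper name is_valid is hypothetical.

```python
import re

# Pattern from the answer: a word may only match if it does not occur again later.
PATTERN = re.compile(r"^(?:word1(?!.*word1)|word2(?!.*word2)|word3(?!.*word3))+$")

# Subjects taken from the question.
should_match = ["word1word2word3", "word1", "word2", "word3word2"]
should_fail = ["word1word2word1", "word2word2", "word3word1word2word2"]


def is_valid(subject):
    """True if the subject consists only of the listed words, none repeated."""
    return PATTERN.match(subject) is not None


for s in should_match + should_fail:
    print(f"{s!r}: {is_valid(s)}")
```

With these inputs the first four subjects are accepted and the last three are rejected.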
Answer by Qtax
The lookahead solutions will not work in several cases. You can solve this properly, without lookarounds, by using a construct like this:
(?:(?(1)(?!))(word1)|(?(2)(?!))(word2)|(?(3)(?!))(word3))+
This works even if some words are substrings of others, and it also works if you just want to find the matching substrings of a larger string (rather than only matching the whole string).
Live demo.
It works by failing an alternative if that alternative has already been matched, which is what (?(1)(?!)) does. (?(1)foo) is a conditional that matches foo if group 1 has previously matched, and (?!) always fails.
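The conditional construct can be exercised with a short sketch. It assumes Python's re module as a stand-in for the .NET engine (Python also supports (?(1)...) conditionals). The word list "ab" and "abc" is hypothetical and only there to show the case where one word is a prefix of another.

```python
import re

# The answer's pattern: (?(N)(?!)) fails the alternative once group N has matched.
ONCE_EACH = re.compile(
    r"(?:(?(1)(?!))(word1)|(?(2)(?!))(word2)|(?(3)(?!))(word3))+"
)

# Hypothetical words where one is a prefix of the other.
CONDITIONAL = re.compile(r"(?:(?(1)(?!))(abc)|(?(2)(?!))(ab))+")
LOOKAHEAD = re.compile(r"(?:abc(?!.*abc)|ab(?!.*ab))+")

# Whole-string checks with the words from the question.
print(ONCE_EACH.fullmatch("word1word2word3") is not None)
print(ONCE_EACH.fullmatch("word1word2word1") is not None)

# Finding a non-repeating run inside a larger string.
print(ONCE_EACH.search("xxword1word2word1yy").group(0))

# "ab" followed by "abc" repeats no word, yet the lookahead variant rejects it
# because "ab" also occurs inside "abc".
print(CONDITIONAL.fullmatch("ababc") is not None)
print(LOOKAHEAD.fullmatch("ababc") is not None)
```

Note that in the hypothetical pattern the longer word is listed first, so that "abc" gets a chance to match before "ab" does.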
Answer by ΩmegaMan
Byers' solution is too hard-coded and gets quite cumbersome as the number of words increases. Why not simply have the regex look for a duplicate match?
([^\d]+\d)+(?=.*\1)
If the regex matches, a repetition has been found in the input. If it does not match, you have a valid set of data.
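A sketch of this inverted check follows. It rests on two assumptions: that the lookahead is meant to end in a backreference to group 1, and that every word is a run of non-digits followed by a single digit, as in the question's examples. It uses Python's re module rather than .NET, and the helper name has_repetition is hypothetical.

```python
import re

# Assumed reading of the pattern: the lookahead ends in \1, so a match means
# the text captured by group 1 shows up again later in the subject.
DUPLICATE = re.compile(r"([^\d]+\d)+(?=.*\1)")


def has_repetition(subject):
    """True if some 'letters + digit' chunk occurs again later in the subject."""
    return DUPLICATE.search(subject) is not None


for s in ["word1word2word3", "word3word2", "word1word2word1", "word2word2"]:
    # A subject is valid exactly when no repetition is found.
    print(f"{s!r}: valid={not has_repetition(s)}")
```

Because the logic is inverted, a successful match signals invalid input. The check also does not verify that the subject is made up only of the allowed words.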
Answer by MikeM
You could use a negative look-ahead containing a back reference:
^(?:(word1|word2|word3)(?!.*\1))+$
where \1 refers to the match of the capture group (word1|word2|word3).
Note that this assumes word2 cannot be formed by appending characters to word1, and that word3 cannot be formed by appending characters to word1 or word2.
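The backreference approach and its caveat can be illustrated with a short sketch. It assumes the lookahead contains the backreference \1 described above and runs the pattern with Python's re module instead of .NET. The words "ab" and "abc" are hypothetical, chosen so that one is formed by appending a character to the other.

```python
import re

# Backreference version: \1 is whatever the group captured in this iteration.
BACKREF = re.compile(r"^(?:(word1|word2|word3)(?!.*\1))+$")

# Hypothetical word list that violates the stated assumption.
PREFIXED = re.compile(r"^(?:(ab|abc)(?!.*\1))+$")

print(BACKREF.match("word1word2word3") is not None)
print(BACKREF.match("word3word1word2word2") is not None)

# Neither subject repeats a word, but "ab" followed by "abc" is rejected
# because the lookahead finds "ab" inside the later "abc".
print(PREFIXED.match("abcab") is not None)
print(PREFIXED.match("ababc") is not None)
```

The last two lines show why the assumption matters: the outcome depends on word order even though no word is repeated in either subject.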