C# regex "or" operator: avoiding repetition

Question

Asked by Tono Nam

How can I use the or operator while not allowing repetition? In other words, the regex:

(word1|word2|word3)+

will match word1word2, but it will also match word1word1, which I don't want because word1 is repeated. How can I avoid repetition?

In summary, I would like the following subjects to match:

word1word2word3
word1
word2
word3word2

Note that all of them match because there is no repetition. I would like the following subjects to fail:

word1word2word1
word2word2
word3word1word2word2

Edit

Thanks to @Mark I now have:

(?xi)

(?:  
        (?<A>word1|word2)(?!  .*  \k<A> )      # match word1 or word2, but only if the captured word does not appear again later in the input
    |   (?<B>word3|word4)(?!  .*  \k<B> )
)+

because I am interested in seeing if something was captured in group A or B.
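
For reading those groups from C#, here is a minimal sketch. It assumes the pattern above is used unchanged and unanchored; the class and method names (`GroupCaptureDemo`, `CapturesOf`) are made up for illustration. Each named group keeps every value it captured across iterations in its `Captures` collection.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class GroupCaptureDemo
{
    // Pattern from the edit above; (?x) ignores pattern whitespace, (?i) ignores case.
    private const string Pattern = @"(?xi)
        (?:
                (?<A>word1|word2)(?! .* \k<A> )
            |   (?<B>word3|word4)(?! .* \k<B> )
        )+";

    // Returns every value captured by the named group, in match order
    // (an empty array if the group never captured).
    public static string[] CapturesOf(string input, string groupName)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Groups[groupName].Captures.Cast<Capture>().Select(c => c.Value).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", CapturesOf("word1word3word2", "A"))); // word1,word2
        Console.WriteLine(string.Join(",", CapturesOf("word1word3word2", "B"))); // word3
    }
}
```

`m.Groups["A"].Success` is enough if you only need to know whether group A took part in the match at all.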

Answer 1

Accepted answer by Mark Byers

You could use negative lookaheads:

^(?:word1(?!.*word1)|word2(?!.*word2)|word3(?!.*word3))+$

See it working online: rubular
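
A quick way to check this pattern against the subjects from the question in C# might look like the sketch below. The class name `LookaheadDemo` and the helper `IsValid` are illustrative, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Each word may match only if it does not occur again later in the input.
    private static readonly Regex NoRepeat = new Regex(
        @"^(?:word1(?!.*word1)|word2(?!.*word2)|word3(?!.*word3))+$");

    public static bool IsValid(string input) => NoRepeat.IsMatch(input);

    public static void Main()
    {
        // The first three subjects should print True, the last three False.
        foreach (var s in new[] { "word1word2word3", "word1", "word3word2",
                                  "word1word2word1", "word2word2", "word3word1word2word2" })
            Console.WriteLine($"{s}: {IsValid(s)}");
    }
}
```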

Answer 2

Answer by Qtax

The lookahead solutions will not work in several cases. You can solve this properly, without lookarounds, by using a construct like this:

(?:(?(1)(?!))(word1)|(?(2)(?!))(word2)|(?(3)(?!))(word3))+

This works even if some words are substrings of others, and it also works if you just want to find the matching substrings of a larger string (and not only match the whole string).

Live demo.

It works by failing an alternation branch if its word has been matched previously, which is done by (?(1)(?!)). (?(1)foo) is a conditional, and will match foo if group 1 has previously matched. (?!) always fails.
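
To try the conditional construct from C#, something along these lines should do. This is a sketch: `ConditionalDemo`, `IsValid` and `FirstRun` are names invented here, and the `^...$` anchors in `IsValid` are an addition for whole-string validation, not part of the pattern above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConditionalDemo
{
    // (?(n)(?!)) fails a branch once group n has captured, so each word matches at most once.
    private const string Pattern =
        @"(?:(?(1)(?!))(word1)|(?(2)(?!))(word2)|(?(3)(?!))(word3))+";

    // Whole-string check: anchors added around the pattern (an assumption for this demo).
    public static bool IsValid(string input) => Regex.IsMatch(input, "^" + Pattern + "$");

    // Unanchored search: returns the first run of non-repeating words found in a larger string.
    public static string FirstRun(string input) => Regex.Match(input, Pattern).Value;

    public static void Main()
    {
        Console.WriteLine(IsValid("word1word2word3"));             // True
        Console.WriteLine(IsValid("word1word2word1"));             // False
        Console.WriteLine(FirstRun("see word3word1word3 here"));   // word3word1
    }
}
```

In the last call the run stops before the second word3, because group 3 has already captured by then.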

Answer 3

Answer by ΩmegaMan

Byers' solution is too hard-coded and gets quite cumbersome as the number of words increases. Why not simply have the regex look for a duplicate match?

([^\d]+\d)+(?=.*\1)

If that matches, the match signifies that a repetition has been found in the input. If it does not match, you have a valid set of data.
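
Here is a rough C# sketch of this inverted check. It assumes the lookahead ends with a back-reference `\1` to the captured item, and that every item is a run of non-digits followed by a single digit (as in word1, word2). `DuplicateDetector` and `HasRepetition` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DuplicateDetector
{
    // Assumed form of the pattern: capture an item such as "word1",
    // then require the same text to appear again later via \1.
    private static readonly Regex Duplicate = new Regex(@"([^\d]+\d)+(?=.*\1)");

    // A match means some item occurs again later, so valid input is input with NO match.
    public static bool HasRepetition(string input) => Duplicate.IsMatch(input);

    public static void Main()
    {
        Console.WriteLine(HasRepetition("word1word2word3"));       // False (valid data)
        Console.WriteLine(HasRepetition("word1word2word1"));       // True
        Console.WriteLine(HasRepetition("word3word1word2word2"));  // True
    }
}
```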

Answer 4

Answer by MikeM

You could use a negative look-ahead containing a back reference:

^(?:(word1|word2|word3)(?!.*\1))+$

where \1 refers to the match of the capture group (word1|word2|word3).

Note that this assumes word2 cannot be formed by appending characters to word1, and that word3 cannot be formed by appending characters to word1 or word2.
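
The sketch below shows both the back-reference version and the caveat just mentioned. It assumes the look-ahead contains `\1` as described; the words "ab" and "abc" are made-up examples of one word extending another, and the class and method names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceDemo
{
    // \1 refers to whatever word the capture group matched in the current iteration.
    public static bool IsValid(string input) =>
        Regex.IsMatch(input, @"^(?:(word1|word2|word3)(?!.*\1))+$");

    // Same construct with hypothetical words "ab" and "abc", where one word extends the other.
    public static bool IsValidPrefixWords(string input) =>
        Regex.IsMatch(input, @"^(?:(ab|abc)(?!.*\1))+$");

    public static void Main()
    {
        Console.WriteLine(IsValid("word3word2"));        // True
        Console.WriteLine(IsValid("word1word2word1"));   // False
        // False, although "ab" and "abc" each occur only once:
        // after matching "ab", the look-ahead still finds "ab" inside "abc".
        Console.WriteLine(IsValidPrefixWords("ababc"));
    }
}
```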
