C# Regex Non-Greedy (Lazy)

Disclaimer: This page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/13844168/

Time: 2020-08-10 09:58:17  Source: igfitidea

Regex Non-Greedy (Lazy)

c# regex html-table non-greedy

Asked by steventnorris

I'm attempting to non-greedily parse out TD tags. I'm starting with something like this:

<TD>stuff<TD align="right">More stuff<TD align="right>Other stuff<TD>things<TD>more things

I'm using the below as my regex:

Regex.Split(tempS, @"\<TD[.\s]*?\>");

The records return as below:

""
"stuff<TD align="right">More stuff<TD align="right>Other stuff"
"things"
"more things"

Why is it not splitting that first full result (the one starting with "stuff")? How can I adjust the regex to split on all instances of the TD tag with or without parameters?

Accepted answer by Chris Seymour

The regex you want is <TD[^>]*>:

<     # Match opening tag
TD    # Followed by TD
[^>]* # Followed by anything not a > (zero or more)
>     # Closing tag

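To see the suggested pattern in action, here is a minimal sketch that feeds the sample string from the question through Regex.Split. The class name TdSplitDemo and the helper SplitOnTd are made up for illustration and do not come from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TdSplitDemo
{
    // Hypothetical helper (not from the original post): splits on every
    // TD tag, with or without attributes.
    public static string[] SplitOnTd(string input)
    {
        // [^>]* consumes everything up to, but not including, the next '>'.
        return Regex.Split(input, @"<TD[^>]*>");
    }

    public static void Main()
    {
        // Sample input copied from the question, including the unbalanced quote.
        string tempS = "<TD>stuff<TD align=\"right\">More stuff<TD align=\"right>Other stuff<TD>things<TD>more things";

        foreach (string part in SplitOnTd(tempS))
        {
            Console.WriteLine("\"" + part + "\"");
        }
    }
}
```

This should print six records: an empty string first (the input starts with a tag, so Regex.Split reports the empty text before it), then "stuff", "More stuff", "Other stuff", "things" and "more things".
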
Note: . matches any character (including whitespace, though not a newline by default), so [.\s]*? is both redundant and wrong: inside a character class, [.] matches only a literal dot. Use .*? instead.
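
The difference between a dot inside and outside brackets is easy to check directly. The sketch below is only an illustration: the class name CharClassDemo and the helper CountMatches are invented here, while the first pattern is copied verbatim from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Pattern copied from the question.
    public const string Original = @"\<TD[.\s]*?\>";

    // Hypothetical helper: counts how many times the pattern matches.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        // Inside brackets the dot is a literal dot, not a wildcard.
        Console.WriteLine(Regex.IsMatch("a", @"^[.\s]$")); // False
        Console.WriteLine(Regex.IsMatch(".", @"^[.\s]$")); // True
        Console.WriteLine(Regex.IsMatch(" ", @"^[.\s]$")); // True
        // Outside brackets the dot matches any character except a newline.
        Console.WriteLine(Regex.IsMatch("a", @"^.$"));     // True

        string tempS = "<TD>stuff<TD align=\"right\">More stuff<TD align=\"right>Other stuff<TD>things<TD>more things";
        Console.WriteLine(CountMatches(tempS, Original));     // 3: only the bare <TD> tags
        Console.WriteLine(CountMatches(tempS, @"<TD[^>]*>")); // 5: every TD tag
    }
}
```

The count of 3 explains the output in the question: the two tags that carry attributes are never matched, so the text around them stays in one record.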

Answer by Jason

For a non-greedy match, try <TD.*?>
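
One caveat worth checking, which this answer does not mention: by default the dot does not match a newline, so the lazy pattern and <TD[^>]*> can disagree when a tag is wrapped across lines. The sketch below assumes a made-up two-line input; the class name LazyDotDemo and the helper TagFound are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyDotDemo
{
    // Hypothetical helper: reports whether the pattern finds a tag in the input.
    public static bool TagFound(string input, string pattern, RegexOptions options = RegexOptions.None)
    {
        return Regex.IsMatch(input, pattern, options);
    }

    public static void Main()
    {
        string oneLine = "<TD align=\"right\">More stuff";
        // Assumed input for illustration: the same tag with a line break inside it.
        string twoLines = "<TD\nalign=\"right\">More stuff";

        Console.WriteLine(TagFound(oneLine, @"<TD.*?>"));                           // True
        Console.WriteLine(TagFound(twoLines, @"<TD.*?>"));                          // False: '.' stops at '\n'
        Console.WriteLine(TagFound(twoLines, @"<TD.*?>", RegexOptions.Singleline)); // True
        Console.WriteLine(TagFound(twoLines, @"<TD[^>]*>"));                        // True: [^>] accepts '\n'
    }
}
```

For the single-line sample in the question both patterns split identically, so this only matters if the markup may contain line breaks inside a tag.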

Answer by Bastien Vandamme

From https://regex101.com/

  • * Quantifier: Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  • *? Quantifier: Matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
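
A small comparison makes the two definitions concrete. This is a sketch on a shortened version of the question's input; the class name GreedyLazyDemo and the helper FirstMatch are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyLazyDemo
{
    // Hypothetical helper: returns the text of the first match, or "" if none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string input = "<TD>stuff<TD align=\"right\">More stuff";

        // Greedy: .* grabs as much as it can, then gives back until a '>' fits.
        Console.WriteLine(FirstMatch(input, @"<TD.*>"));  // <TD>stuff<TD align="right">

        // Lazy: .*? takes as little as it can, expanding until a '>' fits.
        Console.WriteLine(FirstMatch(input, @"<TD.*?>")); // <TD>
    }
}
```

The greedy version swallows both tags and the text between them in a single match, which is why a lazy quantifier (or a negated class such as [^>]*) is needed when splitting on each tag.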