Regex to parse out HTML from CDATA with C#
Note: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/812303/
Asked by Little Larry Sellers
I would like to parse out any HTML data that is returned wrapped in CDATA.
As an example <![CDATA[<table><tr><td>Approved</td></tr></table>]]>
Thanks!
Accepted answer by Ron Harlev
The expression to handle your example would be
\<\!\[CDATA\[(?<text>[^\]]*)\]\]\>
Where the group "text" will contain your HTML.
The C# code you need is:
using System.Text.RegularExpressions;

RegexOptions options = RegexOptions.None;
Regex regex = new Regex(@"\<\!\[CDATA\[(?<text>[^\]]*)\]\]\>", options);
string input = @"<![CDATA[<table><tr><td>Approved</td></tr></table>]]>";

// Check for match
bool isMatch = regex.IsMatch(input);
if (isMatch)
{
    Match match = regex.Match(input);
    // The named group "text" holds everything between <![CDATA[ and ]]>
    string HTMLtext = match.Groups["text"].Value;
}
The "input" variable is there only to hold the sample input you provided.
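One limitation worth knowing about: the character class [^\]]* stops at the first ] it meets, so the pattern above finds no match when the wrapped HTML itself contains a ] (in an attribute value or inline script, say). Below is a minimal sketch of that behaviour next to a lazy .*? variant that only stops at the closing ]]>. The class name CdataDemo, the helper Extract and the sample input are made up for illustration, and the lazy pattern is my assumption rather than part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CdataDemo
{
    // Pattern from the answer above: the "text" group cannot contain any ']'.
    public static readonly Regex Strict =
        new Regex(@"\<\!\[CDATA\[(?<text>[^\]]*)\]\]\>");

    // Lazy variant (an assumption, not from the answer): stops at the first "]]>".
    // Singleline lets '.' match newlines, so multi-line HTML is covered too.
    public static readonly Regex Lazy =
        new Regex(@"<!\[CDATA\[(?<text>.*?)\]\]>", RegexOptions.Singleline);

    // Returns the captured HTML, or null when the pattern does not match.
    public static string Extract(Regex regex, string input)
    {
        Match match = regex.Match(input);
        return match.Success ? match.Groups["text"].Value : null;
    }

    public static void Main()
    {
        // Hypothetical input whose HTML contains a ']' character.
        string input = "<![CDATA[<a href=\"x[1]\">Approved</a>]]>";
        Console.WriteLine(Extract(Strict, input) ?? "(no match)");
        Console.WriteLine(Extract(Lazy, input) ?? "(no match)");
    }
}
```

For the sample in the question both patterns give the same result. The difference only shows up once a ] appears inside the wrapped HTML.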
Answer by Scott Anderson
I know this might seem incredibly simple, but have you tried string.Replace()?
string x = "<![CDATA[<table><tr><td>Approved</td></tr></table>]]>";
string y = x.Replace("<![CDATA[", string.Empty).Replace("]]>", string.Empty);
There are probably more efficient ways to handle this, but it might be that you want something that easy...
Answer by Chad Birch
Not much detail, but a very simple regex should match it if there isn't complexity that you didn't describe:
/<!\[CDATA\[(.*?)\]\]>/
Answer by Tomalak
The regex to find CDATA sections would be:
(?:<!\[CDATA\[)(.*?)(?:\]\]>)
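Because the quantifier is lazy, this pattern can also pull several CDATA sections out of one larger string. Here is a rough sketch of using it from C#, assuming RegexOptions.Singleline is wanted so that . also matches line breaks. The class name CdataSections, the method ExtractAll and the sample input are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CdataSections
{
    // Pattern from the answer above; Singleline is an added assumption.
    private static readonly Regex Pattern =
        new Regex(@"(?:<!\[CDATA\[)(.*?)(?:\]\]>)", RegexOptions.Singleline);

    // Returns the content of every CDATA section found in the input, in order.
    public static List<string> ExtractAll(string input)
    {
        var results = new List<string>();
        foreach (Match match in Pattern.Matches(input))
        {
            // Group 1 is the only capturing group; the delimiters are non-capturing.
            results.Add(match.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        string input = "<a><![CDATA[<b>one</b>]]></a><c><![CDATA[<i>two</i>]]></c>";
        foreach (string html in ExtractAll(input))
        {
            Console.WriteLine(html);
        }
    }
}
```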
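Answer by patjbs
// Verbatim string (@) so the regex backslashes are not read as C# escape sequences;
// the lookahead includes the closing '>' so only a full "]]>" ends the match
Regex r = new Regex(@"(?<=<!\[CDATA\[).*?(?=\]\]>)");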
Answer by Adren
Why do you want to use Regex for such a simple task? Try this one:
str = str.Trim().Substring(9);
str = str.Substring(0, str.Length-3);
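The two Substring calls assume the string really does start with <![CDATA[ (9 characters) and end with ]]> (3 characters). On other input they either cut off the wrong text or throw ArgumentOutOfRangeException if the string is too short. A defensive sketch that checks both delimiters first might look like this; the names CdataText and StripCdata are invented for the example, and returning the input unchanged when it is not wrapped is just one possible choice.

```csharp
using System;

public static class CdataText
{
    private const string Prefix = "<![CDATA[";
    private const string Suffix = "]]>";

    // Returns the text between the CDATA delimiters, or the input unchanged
    // when it is not wrapped in a single CDATA section.
    public static string StripCdata(string str)
    {
        string trimmed = str.Trim();
        bool wrapped = trimmed.Length >= Prefix.Length + Suffix.Length
            && trimmed.StartsWith(Prefix, StringComparison.Ordinal)
            && trimmed.EndsWith(Suffix, StringComparison.Ordinal);
        if (!wrapped)
        {
            return str;
        }
        return trimmed.Substring(Prefix.Length,
            trimmed.Length - Prefix.Length - Suffix.Length);
    }

    public static void Main()
    {
        Console.WriteLine(StripCdata("<![CDATA[<table><tr><td>Approved</td></tr></table>]]>"));
        Console.WriteLine(StripCdata("plain text"));
    }
}
```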