C#: Remove an HTML node from an HtmlDocument (HtmlAgilityPack)
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/12106280/
Asked by Priya
In my code, I want to remove the img tags that don't have a src value. I am using HtmlAgilityPack's HtmlDocument object. I find each img without a src value and try to remove it, but I get the error "Collection was modified; enumeration operation may not execute." Can anyone help me with this? The code I have used is:
foreach (HtmlNode node in doc.DocumentNode.DescendantNodes())
{
    if (node.Name.ToLower() == "img")
    {
        string src = node.Attributes["src"].Value;
        if (string.IsNullOrEmpty(src))
        {
            node.ParentNode.RemoveChild(node, false);
        }
    }
    else
    {
        ..........// i am performing other operations on document
    }
}
Accepted answer by Priya
What I have done is:
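List<string> xpaths = new List<string>();
foreach (HtmlNode node in doc.DocumentNode.DescendantNodes())
{
    if (node.Name.ToLower() == "img")
    {
        // GetAttributeValue returns the default when the attribute is missing,
        // so an <img> with no src attribute at all does not throw.
        string src = node.GetAttributeValue("src", string.Empty);
        if (string.IsNullOrEmpty(src))
        {
            // Only record the node here; removal happens after the enumeration.
            xpaths.Add(node.XPath);
        }
    }
}
// Remove in reverse document order: XPath values are positional (e.g. img[2]),
// so removing an earlier sibling first would shift the later paths.
for (int i = xpaths.Count - 1; i >= 0; i--)
{
    HtmlNode target = doc.DocumentNode.SelectSingleNode(xpaths[i]);
    if (target != null)
    {
        target.Remove();
    }
}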
Answer by Alex
It seems you're modifying the collection during the enumeration by calling the HtmlNode.RemoveChild method.
To fix this, copy your nodes to a separate list/array first, e.g. by calling Enumerable.ToList<T>() or Enumerable.ToArray<T>(), and then remove them while iterating over the copy.
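// SelectNodes returns null when nothing matches, so guard before copying.
var matches = doc.DocumentNode
    .SelectNodes("//img[not(string-length(normalize-space(@src)))]");
if (matches != null)
{
    // ToList() (System.Linq) snapshots the nodes, so removal does not disturb the loop.
    foreach (var node in matches.ToList())
        node.Remove();
}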
If I'm right, the problem will disappear.
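The copy-then-remove idea above can be reproduced without HtmlAgilityPack. The following is a minimal sketch that assumes a plain List<string> standing in for the node collection; the class name SnapshotRemovalDemo and the helper RemoveEmpty are hypothetical names chosen for illustration, not part of any library. It first triggers the same InvalidOperationException by removing from the list being enumerated, then shows that enumerating a ToList() copy avoids it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotRemovalDemo
{
    // Hypothetical helper: removes empty entries from 'items' and returns how many were removed.
    public static int RemoveEmpty(List<string> items)
    {
        int removed = 0;
        // ToList() copies the current contents, so the loop enumerates the copy
        // while Remove mutates only the original list.
        foreach (string item in items.ToList())
        {
            if (string.IsNullOrEmpty(item))
            {
                items.Remove(item);
                removed++;
            }
        }
        return removed;
    }

    public static void Main()
    {
        // Stand-in for src attribute values; empty strings play the role of missing src.
        var direct = new List<string> { "a.png", "", "b.png", "" };
        try
        {
            // Removing from the list that is being enumerated invalidates its enumerator.
            foreach (string s in direct)
            {
                if (string.IsNullOrEmpty(s)) direct.Remove(s);
            }
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("Direct removal failed: " + ex.Message);
        }

        var viaSnapshot = new List<string> { "a.png", "", "b.png", "" };
        int removed = RemoveEmpty(viaSnapshot);
        Console.WriteLine("Removed " + removed + ", remaining: " + string.Join(", ", viaSnapshot));
    }
}
```

The same shape applies to the HtmlAgilityPack case: the snapshot holds references to the nodes, and removal acts on the live document rather than on the sequence being enumerated.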
Answer by Krzysztof Radzimski
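var emptyImages = doc.DocumentNode
    .Descendants("img")
    .Where(x => x.Attributes["src"] == null || x.Attributes["src"].Value == String.Empty)
    .Select(x => x.XPath)
    // XPath values are positional (e.g. img[2]); reversing removes later nodes first,
    // so the remaining paths still point at the intended nodes.
    .Reverse()
    .ToList();
emptyImages.ForEach(xpath =>
{
    var node = doc.DocumentNode.SelectSingleNode(xpath);
    if (node != null) { node.Remove(); }
});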
Answer by MOHAMMAD026
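// Note: as written, this answer selects <a> elements; pass "img" to match the question.
var emptyElements = doc.DocumentNode
    .Descendants("a")
    .Where(x => x.Attributes["src"] == null || x.Attributes["src"].Value == String.Empty)
    .ToList(); // snapshot of the nodes themselves, so no XPath lookup is needed
emptyElements.ForEach(node =>
{
    if (node != null) { node.Remove(); }
});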

