C# Html Agility Pack: SelectNodes from a node

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/10583926/

Date: 2020-08-09 14:22:57  Source: igfitidea

Html Agility Pack, SelectNodes from a node

Tags: c#, .net, html-agility-pack

Asked by thatsIT

Why does this pick all of the <li> elements in my document?

HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(url);

var travelList = new List<Page>();
var liOfTravels = doc.DocumentNode.SelectSingleNode("//div[@id='myTrips']")
                     .SelectNodes("//li");

What I want is to get all <li> elements in the <div> with an id of "myTrips".

Accepted answer by ChristiaanV

It's a bit confusing, because you expect SelectNodes to run only on the div with the id "myTrips". However, calling SelectNodes("//li") on that node performs another search from the top of the document.

I fixed this by combining the two statements into one query, but that only works on a page that has a single div with the id "myTrips". The query looks like this:

doc.DocumentNode.SelectNodes("//div[@id='myTrips'] //li");
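
As a minimal sketch of the combined query in use, the helper below runs it against an HTML string and returns the text of each match. The class name `CombinedQuerySketch`, the method `TripNames`, and the choice to return an empty list are assumptions for illustration, not part of the original answer. One detail worth guarding against: with default settings, `SelectNodes` returns null rather than an empty collection when nothing matches.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not from the original answer.
public static class CombinedQuerySketch
{
    // Returns the trimmed text of every <li> under the div with id "myTrips".
    public static List<string> TripNames(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // One XPath: find the div first, then any <li> descendant of it.
        var items = doc.DocumentNode.SelectNodes("//div[@id='myTrips']//li");

        // SelectNodes yields null when there are no matches, so check before enumerating.
        if (items == null)
        {
            return new List<string>();
        }
        return items.Select(li => li.InnerText.Trim()).ToList();
    }
}
```

Returning an empty list instead of null is only one possible design; it lets callers use a plain foreach without a null check.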

Answer by vfportero

You can do this with a LINQ query:

HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(url);

var travelList = new List<HtmlNode>();
foreach (var matchingDiv in doc.DocumentNode.DescendantNodes().Where(n=>n.Name == "div" && n.Id == "myTrips"))
{
    travelList.AddRange(matchingDiv.DescendantNodes().Where(n=> n.Name == "li"));
}
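
The same idea can be sketched a little more compactly with `Descendants(name)`, which filters by element name, and `SelectMany` in place of the loop. The class and method names here are hypothetical, and the document is passed in rather than downloaded, so the sketch makes no assumption about the URL.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper showing a LINQ-only variant of the loop above.
public static class LinqTripsSketch
{
    // Collects every <li> that sits anywhere below a div whose id is "myTrips".
    public static List<HtmlNode> FindTripItems(HtmlDocument doc)
    {
        return doc.DocumentNode
            .Descendants("div")                        // only <div> elements
            .Where(div => div.Id == "myTrips")         // keep the matching div(s)
            .SelectMany(div => div.Descendants("li"))  // all <li> below each match
            .ToList();
    }
}
```

Unlike `SelectNodes`, this returns an empty list when nothing matches, because LINQ never hands back null here.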

I hope it helps.

Answer by Paul

This seems counterintuitive to me as well. If you run the SelectNodes method on a particular node, I would expect it to search only underneath that node, not the document in general.

Anyway, OP, if you change this line:

var liOfTravels = 
doc.DocumentNode.SelectSingleNode("//div[@id='myTrips']").SelectNodes("//li");

to:

var liOfTravels = 
doc.DocumentNode.SelectSingleNode("//div[@id='myTrips']").SelectNodes("li");

I think you'll be OK; I've just had the same issue and that fixed it for me. I'm not sure, though, whether the li would have to be a direct child of the node you have.
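
On the direct-child question: in XPath, a bare step such as "li" is shorthand for child::li, so it matches only direct children of the context node. The sketch below illustrates this on a made-up fragment in which the <li> elements are wrapped in a <ul>. The class name, method name and sample HTML are assumptions for illustration.

```csharp
using HtmlAgilityPack;

// Hypothetical helper comparing relative XPath forms from the myTrips div.
public static class ChildStepSketch
{
    // Sample page (made up): the <li> elements are grandchildren of the div.
    public const string SampleHtml =
        "<html><body><div id='myTrips'><ul><li>Paris</li><li>Rome</li></ul></div></body></html>";

    // Evaluates a relative XPath from the myTrips div and returns the match count.
    public static int CountFromTripsDiv(string relativeXPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);
        var div = doc.DocumentNode.SelectSingleNode("//div[@id='myTrips']");

        // SelectNodes returns null when nothing matches; treat that as zero.
        var nodes = div.SelectNodes(relativeXPath);
        return nodes == null ? 0 : nodes.Count;
    }
}
```

With this fragment, `CountFromTripsDiv("li")` should find nothing, because the div's only child is the <ul>, while `CountFromTripsDiv("ul/li")` and `CountFromTripsDiv(".//li")` should each find both items.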

Answer by greenoldman

var liOfTravels = doc.DocumentNode.SelectSingleNode("//div[@id='myTrips']")
                 .SelectNodes(".//li");

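Note the dot in the second line. In this regard Html Agility Pack relies entirely on XPath syntax: a path that starts with // is absolute, so it is evaluated from the document root no matter which node you call it on. The result is non-intuitive, because these two queries are effectively the same: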

doc.DocumentNode.SelectNodes("//li");
some_deeper_node.SelectNodes("//li");
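
A small sketch can make the equivalence visible by counting matches from the div with and without the leading dot. The sample HTML (one <li> outside the div, two inside) and the names `DotPrefixSketch` and `CountFromTripsDiv` are made up for illustration.

```csharp
using HtmlAgilityPack;

// Hypothetical helper: same context node, different XPath prefixes.
public static class DotPrefixSketch
{
    // Sample page (made up): one <li> outside the myTrips div, two inside it.
    public const string SampleHtml =
        "<html><body>" +
        "<ul><li>outside</li></ul>" +
        "<div id='myTrips'><ul><li>Paris</li><li>Rome</li></ul></div>" +
        "</body></html>";

    // Runs the given XPath with the myTrips div as context node and counts the matches.
    public static int CountFromTripsDiv(string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);
        var div = doc.DocumentNode.SelectSingleNode("//div[@id='myTrips']");

        var nodes = div.SelectNodes(xpath);
        return nodes == null ? 0 : nodes.Count;  // null means no matches
    }
}
```

Here `CountFromTripsDiv("//li")` should report 3, the same as a search from the document root, whereas `CountFromTripsDiv(".//li")` should report only the 2 items inside the div.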

Answer by Rob

Creating a new node can be beneficial in some situations and lets you use XPath queries more intuitively. I've found this useful in a couple of places.

var myTripsDiv = doc.DocumentNode.SelectSingleNode("//div[@id='myTrips']");
var myTripsNode = HtmlNode.CreateNode(myTripsDiv.InnerHtml);
var liOfTravels = myTripsNode.SelectNodes("//li");
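
A variant of the same idea, swapped in here for illustration rather than taken from the answer, is to load the div's OuterHtml into a fresh HtmlDocument instead of calling CreateNode. An absolute query such as "//li" then only sees the copied subtree. The class and method names are hypothetical, and the fragment is re-parsed, which costs some time on large pages.

```csharp
using HtmlAgilityPack;

// Hypothetical helper: isolate a subtree in its own document.
public static class FragmentDocumentSketch
{
    // Copies the myTrips div into a new document so "//" queries are scoped to it.
    public static HtmlDocument IsolateTrips(HtmlDocument source)
    {
        var div = source.DocumentNode.SelectSingleNode("//div[@id='myTrips']");
        var fragment = new HtmlDocument();

        // If the div is missing, the returned document simply stays empty.
        if (div != null)
        {
            fragment.LoadHtml(div.OuterHtml);
        }
        return fragment;
    }
}
```

Calling `IsolateTrips(doc).DocumentNode.SelectNodes("//li")` then behaves like a scoped search, with the usual caveat that the result is null when nothing matches.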