How to get node value / innerHTML with XPath?

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/10898035/

Time: 2020-09-06 13:30:55  Source: igfitidea

Tags: xml, parsing, xpath, html-parsing

Asked by Tomasz Smykowski

I have an XPath that selects the class I want: //div[@class='myclass']. But it returns the whole div (including the <div class='myclass'> tag itself), and I would like to return only the contents of this tag, without the tag itself. How can I do it?

Answered by Nikola Bogdanović

node() = innerXml

text() = innerText

Both are arrays (node-sets), so text()[1] is the first child text node...
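As a rough illustration of the innerXml / innerText distinction, here is a minimal Python sketch. It assumes the standard library's xml.etree.ElementTree, which supports only a subset of XPath (no node() or text() steps), so both results are emulated from the selected element. The sample document and the helper names inner_xml and inner_text are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the question.
DOC = "<root><div class='myclass'>Hello <b>big</b> world</div></root>"


def inner_xml(elem):
    """Rough equivalent of serializing the nodes selected by elem/node()."""
    # elem.text is the text before the first child element; ET.tostring()
    # on a child also emits the text that follows it (its tail).
    parts = [elem.text or ""]
    parts.extend(ET.tostring(child, encoding="unicode") for child in elem)
    return "".join(parts)


def inner_text(elem):
    """Rough equivalent of elem/text(): the direct child text nodes only."""
    texts = [elem.text] + [child.tail for child in elem]
    return [t for t in texts if t]


root = ET.fromstring(DOC)
div = root.find(".//div[@class='myclass']")

print(inner_xml(div))      # markup and text inside the div, without the div tag
print(inner_text(div))     # list of direct child text nodes
print(inner_text(div)[0])  # Python index 0 corresponds to XPath text()[1]
```

With a full XPath engine the two expressions could be evaluated directly; the emulation above only approximates what they select.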

Answered by jos

With XPath, what you get returned is the last thing in the path that is not a condition. What does that mean? Well, conditions are the stuff between []'s (but you already knew that), and yours reads like pathElement[that has a 'class' attribute with value 'myclass']. The pathElement comes directly before the [.

All the stuff outside of the []'s, then, is the path, so in //a/b/c[@blah='bleh']/d, a, b, c and d are all path elements, blah is an attribute and bleh is a literal value. If this path matches, it will return you a d, the last non-condition thing.
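To make the "last non-condition thing" rule concrete, here is a small Python sketch. It assumes the standard library's xml.etree.ElementTree, whose XPath subset is enough for this path, and a made-up document wrapped in a root element (the leading // is written as .// because ElementTree searches relative to the element it is called on).

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two c elements, only one satisfies the condition.
DOC = """<root><a><b>
<c blah="bleh"><d>one</d></c>
<c blah="other"><d>two</d></c>
</b></a></root>"""

root = ET.fromstring(DOC)

# The predicate [@blah='bleh'] only filters the c step; what comes back
# is the last step of the path, the d elements.
matches = root.findall(".//a/b/c[@blah='bleh']/d")

print([m.tag for m in matches])   # tags of the returned elements
print([m.text for m in matches])  # their text content
```

Only the d under the matching c is returned; the c element and its attribute never appear in the result.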

Your particular path returns a (series of) div, being the last thing in your XPath's path. This return value thus includes the top-level node(s), div in your case, and underneath it (them) all its (their) children. Nodes can be elements or text (or comments, processing instructions, ...).

Underneath a node there can be multiple text nodes, hence the array pOcHa talks about. x/text() returns all text nodes that are direct children of x; x/node() returns all child nodes, including text.

Hope this helps.

Answered by kjhughes

New answer to an old, frequently asked question:

For this XML

<div class="myclass">content</div>

you can use XPath to select just content in one of two ways:

  1. Text Node Selection

    This XPath,

    //div[@class='myclass']/text()
    

    will select the text node children of the targeted div element, content, as requested.

  2. String Value of an Element

    This XPath,

    string(//div[@class='myclass'])
    

    will return the string-value of the targeted div element, content, again as requested.

    Further information: Here's a note explaining the string-values of elements:

    The string-value of an element node is the concatenation of the string-values of all text node descendants of the element node in document order.
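The two approaches give the same result for the simple div above but differ once the element contains nested markup. The following Python sketch shows that difference under stated assumptions: it uses the standard library's xml.etree.ElementTree, which does not implement text() or string(), so both are emulated; the second div and the helper names direct_text_nodes and string_value are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the second div has nested markup.
DOC = """<root>
<div class="myclass">content</div>
<div class="nested">con<b>te</b>nt</div>
</root>"""


def direct_text_nodes(elem):
    """Emulates elem/text(): text nodes that are direct children of elem."""
    texts = [elem.text] + [child.tail for child in elem]
    return [t for t in texts if t]


def string_value(elem):
    """Emulates string(elem): all descendant text joined in document order."""
    return "".join(elem.itertext())


root = ET.fromstring(DOC)
simple = root.find(".//div[@class='myclass']")
nested = root.find(".//div[@class='nested']")

print(direct_text_nodes(simple), string_value(simple))
print(direct_text_nodes(nested), string_value(nested))
```

For the nested div, the text() emulation yields two separate text nodes and skips the text inside b, while the string-value is the full concatenated text.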

Answered by sajith

You can try

//div[@class='myclass']/child::*

child::* selects all element children of the context node. Note that it selects elements only, so text nodes directly inside the div are not included.
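A short Python sketch of what this selects, assuming the standard library's xml.etree.ElementTree and a made-up document with mixed content. ElementTree does not accept the unabbreviated child:: axis, so the example uses the abbreviated form /*, which is equivalent to child::* in XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with text and elements mixed inside the div.
DOC = "<root><div class='myclass'>intro <p>one</p> and <span>two</span></div></root>"

root = ET.fromstring(DOC)

# A trailing "/*" selects all child elements of each matched div,
# the abbreviated spelling of the XPath step child::*.
children = root.findall(".//div[@class='myclass']/*")

print([child.tag for child in children])   # element children only
print([child.text for child in children])  # the text inside each child
```

The loose text "intro " and " and " is not part of the result, so for a div that contains only text, such as the one in the question, this expression selects nothing.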