HTML: retrieve XPath content from div id

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/9289579/

Time: 2020-08-28 22:33:54  Source: igfitidea

retrieve xpath content from div id

html xpath

Asked by shadow

How do I retrieve the text inside article-field1?

<title>Testing</title>
  <link>http://example.org</link>
  <description>Description</description>
  <language>en-us</language>
  <lastBuildDate>Mon, 13 Feb 2012 00:00:00 +0000</lastBuildDate>

  <item>
    <title>Title Here</title>
    <link>http://example.org/2012/03/27/</link>
    <description><![CDATA[
        <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
        <div id="article-field2">123</div>
    <pubDate>Tue, 2 Mar 2012 00:00:00 +0000</pubDate>
  </item>

I've tried to use:

//description/div[@id="article-field1"]/text()

Any advice?

Thanks

Answered by Olivier.Roger

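From what I can see, your data is inside a CDATA section. The XML parser treats everything in a CDATA section as plain character data rather than markup, so the div elements never become nodes that XPath can select.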

See "How do I retrieve element text inside CDATA markup via XPath?" for more details.
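A minimal sketch of that behaviour, using Python's standard xml.etree.ElementTree module. The XML below is an assumed, well-formed variant of the snippet in the question (the CDATA section and the description element are closed here), so it only illustrates the point and is not the asker's actual feed.

```python
import xml.etree.ElementTree as ET

# Assumption: a well-formed variant of the question's item, with the
# CDATA section and <description> closed (the question's paste is not).
ITEM_XML = """<item>
  <title>Title Here</title>
  <description><![CDATA[
    <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
    <div id="article-field2">123</div>
  ]]></description>
</item>"""

item = ET.fromstring(ITEM_XML)
description = item.find("description")

# CDATA content is character data, so <description> has no child elements.
child_count = len(list(description))

# The path from the question therefore selects nothing.
matches = item.findall(".//description/div[@id='article-field1']")

# The markup survives only as a string inside the text node.
raw_text = description.text

print(child_count)
print(matches)
print(raw_text.strip())
```

The div markup is present in raw_text, but only as characters, which is why the query finds no elements.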

Answered by ingyhere

//description/div[@id="article-field1"]/a/text() 

This expression works if the malformed CDATA tag is removed, a root element is added, and the corresponding 'description' tag is closed. It assumes the original XML was only partially pasted, which is the only reading under which the expression in the question makes sense. Basically, the original query was missing the a element.

This can be verified at http://www.xpathtester.com/.

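The same check can be sketched locally in Python with xml.etree.ElementTree. The repaired XML below is an assumption based on this answer (CDATA wrapper removed, description closed, and a root element added; the name channel is hypothetical). ElementTree's limited XPath support has no text() step, so the sketch selects the a element and reads its .text attribute instead.

```python
import xml.etree.ElementTree as ET

# Assumed repair of the question's XML: CDATA wrapper removed,
# <description> closed, and a root element added. The root name
# "channel" is hypothetical.
FIXED_XML = """<channel>
  <title>Testing</title>
  <item>
    <title>Title Here</title>
    <description>
      <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
      <div id="article-field2">123</div>
    </description>
  </item>
</channel>"""

root = ET.fromstring(FIXED_XML)

# ElementTree does not support text(), so select <a> and read .text.
link = root.find(".//description/div[@id='article-field1']/a")
link_text = link.text

# The question's query stopped at the div, which has no text of its own
# because the text sits inside the nested <a> element.
div = root.find(".//description/div[@id='article-field1']")
div_text = div.text

print(link_text)
print(div_text)
```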

Answered by Sean B. Durkin

You can't do it with a single call to a plain-vanilla XPath processor.

You have two choices:

  1. Use a specific XPath processor that implements the dyn:evaluate() function (which raises the question: what processor and version are you using?); OR
  2. Use two calls. The first gets the text value of the /title/item/description node. The second, after loading the result of the first as a new XML document (with a few tweaks to convert the XML fragment into a proper XML document), is div[@id="article-field1"].
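The second option can be sketched in Python with xml.etree.ElementTree. This is only an illustration under stated assumptions: the feed below is a well-formed variant of the question's XML with the CDATA section closed and a hypothetical channel root, extract_field is a made-up helper name, and the embedded fragment is assumed to be well-formed XML once wrapped in a single root element (real-world HTML often is not).

```python
import xml.etree.ElementTree as ET

# Assumption: well-formed variant of the question's feed, with the CDATA
# section closed and a hypothetical <channel> root element.
FEED_XML = """<channel>
  <title>Testing</title>
  <item>
    <title>Title Here</title>
    <description><![CDATA[
      <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
      <div id="article-field2">123</div>
    ]]></description>
  </item>
</channel>"""


def extract_field(feed_xml, field_id):
    """Hypothetical helper: two-pass lookup of a div inside CDATA text."""
    # Call 1: get the text value of the description node.
    outer = ET.fromstring(feed_xml)
    fragment = outer.findtext(".//item/description")

    # Tweak: wrap the fragment in a single root so it parses as a document.
    inner = ET.fromstring("<root>" + fragment + "</root>")

    # Call 2: query the newly parsed document for the div.
    div = inner.find("div[@id='%s']" % field_id)
    if div is None:
        return None
    # Collect text from the div and any nested elements such as <a>.
    return "".join(div.itertext()).strip()


print(extract_field(FEED_XML, "article-field1"))
print(extract_field(FEED_XML, "article-field2"))
```

Wrapping the fragment in a root element is the "tweak" mentioned above: the CDATA text holds two sibling div elements, which would not parse as a document on their own.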