xml - XPath: locating cells that contain specific text when parsing an HTML table

Question

Asked by David Brown

Hope someone out there can quickly point me in the right direction with my XPath difficulties.

Currently I've got to the point where I'm identifying the correct table I need in my HTML source, but then I need to process only the rows that have the text 'Chapter' somewhere in their DOM.

My last attempt was to do this :

// get the correct table
HtmlTable table = page.getFirstByXPath("//table[2]");

// now the failing bit....
def rows = table.getByXPath("*/td[contains(text(),'Chapter')]")

I thought the XPath above would mean: get me all elements that have a child 'td' element that somewhere in its DOM contains the text 'Chapter'.

An example of a matching row from my source is :

<tr valign="top">
  <td nowrap="" align="Right">
   <font face="Verdana">
   <a href="index.cfm?a=1">Chapter 1</a>
   </font>
  </td>
  <td class="ChapterT">
    <font face="Verdana">DEFINITIONS</font>
  </td>
  <td>&nbsp;</td>
</tr>

Any help / pointers greatly appreciated.

Thanks,

Answer 1

Answered by Kirill Polishchuk

Use this XPath:

//td[contains(., 'Chapter')]
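
As a rough illustration of why the predicate in the question matches nothing while this one does, here is a minimal sketch that uses the JDK's javax.xml.xpath instead of HtmlUnit. The class name TextVersusDot and the helper count are made up for this example, and &nbsp; is written as &#160; because a plain XML parser does not know the HTML entity. In XPath 1.0, contains(text(), ...) only looks at the first text-node child of the td, which is whitespace in the sample row, whereas contains(., ...) looks at the whole string value of the td.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextVersusDot {
    // The sample row from the question, wrapped in a table (assumption: well-formed XML).
    static final String ROW =
        "<table><tr valign=\"top\">"
      + "<td nowrap=\"\" align=\"Right\">\n <font face=\"Verdana\">\n"
      + " <a href=\"index.cfm?a=1\">Chapter 1</a>\n </font>\n</td>"
      + "<td class=\"ChapterT\"><font face=\"Verdana\">DEFINITIONS</font></td>"
      + "<td>&#160;</td>"
      + "</tr></table>";

    // Hypothetical helper: parse the XML and count the nodes an XPath selects.
    static int count(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // First text child of each td is whitespace or missing, so nothing matches.
        System.out.println(count(ROW, "//td[contains(text(), 'Chapter')]")); // 0
        // The string value of the first td includes "Chapter 1".
        System.out.println(count(ROW, "//td[contains(., 'Chapter')]"));      // 1
        // Selecting the row rather than the cell, since the question asks for rows.
        System.out.println(count(ROW, "//tr[td[contains(., 'Chapter')]]"));  // 1
    }
}
```

Note that the class="ChapterT" attribute on the second td does not count as a match, because attribute values are not part of an element's string value.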

Answer 2

Answered by Dimitre Novatchev

You want all tds under your current node, not all tds in the document, which is what the currently accepted answer selects.

Use:

.//td[.//text()[contains(., 'Chapter')]]

This selects all descendants of the current node that are named td and that have at least one text node descendant whose string value contains the string "Chapter".

If it is known in advance that any td under this table has only a single text node, this can be simplified to just:

.//td[contains(., 'Chapter')]
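
To make the difference between the absolute and the relative form concrete, here is a small sketch using the JDK's javax.xml.xpath rather than HtmlUnit. The two-table page, the class name ContextNodeScope and the helper countFromSecondTable are all invented for illustration. Each expression is evaluated with the second table as the context node, much like calling getByXPath on the table in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContextNodeScope {
    // Hypothetical page: both tables contain a cell mentioning "Chapter".
    static final String PAGE =
        "<html><body>"
      + "<table><tr><td>Chapter A</td></tr></table>"
      + "<table><tr><td><a>Chapter 1</a></td><td>DEFINITIONS</td></tr></table>"
      + "</body></html>";

    // Evaluate an XPath with the second table as the context node and count the hits.
    static int countFromSecondTable(String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(PAGE)));
            XPath xp = XPathFactory.newInstance().newXPath();
            Node table = (Node) xp.evaluate("//table[2]", doc, XPathConstants.NODE);
            NodeList hits = (NodeList) xp.evaluate(xpath, table, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A leading // starts at the document root, so the first table's cell is included.
        System.out.println(countFromSecondTable("//td[contains(., 'Chapter')]"));  // 2
        // A leading .// stays below the context node (the second table).
        System.out.println(countFromSecondTable(".//td[contains(., 'Chapter')]")); // 1
        System.out.println(
            countFromSecondTable(".//td[.//text()[contains(., 'Chapter')]]"));     // 1
    }
}
```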

Answer 3

Answered by William Walseth

You're on the right "path".
The contains() function is limited to a specific element, not text in any of the children. Try this XPath, which you could read as follows: get every tr/td with any sub-element that contains the text 'Chapter'

tr/td[contains(*,"Chapter")]
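
Two caveats may be worth checking before relying on this expression; the sketch below illustrates them with the JDK's javax.xml.xpath. The table markup, the class name DirectChildVersusDescendant and the helper countFromTable are assumptions made for this example. First, tr/td only finds rows that are direct children of the context node, so it finds nothing if the parsed DOM has a tbody between table and tr (HTML parsers often insert one). Second, in XPath 1.0 contains(*, ...) tests only the first child element of each td, not every sub-element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DirectChildVersusDescendant {
    // Hypothetical table with a tbody; the second row's first child element is <b>.
    static final String TABLE =
        "<table><tbody>"
      + "<tr><td><font><a>Chapter 1</a></font></td><td><font>DEFINITIONS</font></td></tr>"
      + "<tr><td><b>See</b> <a>Chapter 2</a></td></tr>"
      + "</tbody></table>";

    // Evaluate an XPath with the table element as the context node and count the hits.
    static int countFromTable(String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(TABLE)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc.getDocumentElement(), XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // tr is a child of tbody here, not of table, so the relative path finds nothing.
        System.out.println(countFromTable("tr/td[contains(*, 'Chapter')]"));    // 0
        // Descending first finds the rows, but only the first child element is tested.
        System.out.println(countFromTable(".//tr/td[contains(*, 'Chapter')]")); // 1
        // Testing the whole string value of the td matches both rows.
        System.out.println(countFromTable(".//tr/td[contains(., 'Chapter')]")); // 2
    }
}
```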

Good luck
