xml: Using XPath, how do I select a node based on its text content and the value of an attribute?

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/1982624/

Time: 2020-09-06 12:54:14  Source: igfitidea

Using XPath, how do I select a node based on its text content and the value of an attribute?

Tags: xml, xpath, xquery

Asked by marc esher

Given this XML:

<DocText>
<WithQuads>
    <Page pageNumber="3">
        <Word>
            July
            <Quad>
                <P1 X="84" Y="711.25" />
                <P2 X="102.062" Y="711.25" />
                <P3 X="102.062" Y="723.658" />
                <P4 X="84.0" Y="723.658" />
            </Quad>
        </Word>
        <Word>
        </Word>
        <Word>
            30,
            <Quad>
                <P1 X="104.812" Y="711.25" />
                <P2 X="118.562" Y="711.25" />
                <P3 X="118.562" Y="723.658" />
                <P4 X="104.812" Y="723.658" />
            </Quad>
        </Word>
    </Page>
</WithQuads>
</DocText>

I'd like to find the nodes that have text of 'July' and a Quad/P1/X attribute greater than 90. In this case, then, it should not return any matches. However, if I use greater-than (>) or less-than (<), I get a match on the first Word element. If I use equals (=), I get no match.

So:

//Word[text()='July' and //P1[@X < 90]]

will return true, as will

//Word[text()='July' and //P1[@X > 90]]

How do I constrain this properly on the P1@X attribute?

In addition, imagine I have multiple Page elements for different page numbers. How would I further constrain the above search to find nodes with text()='July', P1@X < 90, and Page@pageNumber=3?

Answered by AnthonyWJones

Generally I would consider the use of an unprefixed // as a bad smell in an XPath.

Try this:

/DocText/WithQuads/Page/Word[text()='July' and Quad/P1/@X > 90]

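Your problem is that you use //P1[@X < 90], which starts back at the beginning of the document and hunts for any P1 anywhere. It is therefore true whenever any P1 in the document satisfies the test, regardless of which Word is being examined. Similarly, //P1[@X > 90] is true for every Word in this document.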
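
To make the difference concrete, here is a minimal sketch in Python. It assumes the standard-library xml.etree.ElementTree module, which supports only a subset of XPath and cannot evaluate numeric comparisons such as @X > 90, so the two predicates are emulated by hand. The helper names any_p1_in_document and own_p1 are made up for this illustration, and the XML is a trimmed copy of the sample from the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's sample (only the P1 points are kept).
XML = """<DocText>
<WithQuads>
    <Page pageNumber="3">
        <Word>
            July
            <Quad>
                <P1 X="84" Y="711.25" />
            </Quad>
        </Word>
        <Word>
            30,
            <Quad>
                <P1 X="104.812" Y="711.25" />
            </Quad>
        </Word>
    </Page>
</WithQuads>
</DocText>"""

root = ET.fromstring(XML)


def any_p1_in_document(root, test):
    # Emulates //P1[...]: scans every P1 in the whole document,
    # no matter which Word is currently being filtered.
    return any(test(float(p1.get("X"))) for p1 in root.iter("P1"))


def own_p1(word, test):
    # Emulates the relative path Quad/P1/@X: only P1 elements
    # under this particular Word are considered.
    return any(test(float(p1.get("X"))) for p1 in word.findall("Quad/P1"))


july = root.findall("WithQuads/Page/Word")[0]  # the Word holding "July"

doc_lt = any_p1_in_document(root, lambda x: x < 90)  # satisfied by X="84"
doc_gt = any_p1_in_document(root, lambda x: x > 90)  # satisfied by X="104.812"
july_lt = own_p1(july, lambda x: x < 90)
july_gt = own_p1(july, lambda x: x > 90)

print(doc_lt, doc_gt, july_lt, july_gt)
```

Both document-wide checks succeed because some P1 somewhere satisfies each comparison, while the relative check on the "July" Word succeeds only for X < 90.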

Answered by Michael Kay

Apart from the "//" issue, this XML is a very weird use of mixed content. The predicate text()='July' will match the element if any child text node is exactly equal to July, which isn't true in your example because of the surrounding whitespace. Depending on the exact definition of the source XML, I would go for [text()[normalize-space(.)='July'] and Quad/P1/@X > 90]
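
The whitespace point can be checked with a small sketch, again assuming Python's standard-library xml.etree.ElementTree and emulating the predicate by hand. The helpers child_text_nodes, normalize_space and select_words are hypothetical names for this illustration. The Page[@pageNumber='3'] step is one possible way to add the page constraint asked about in the question, not something either answer spelled out.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's sample (only the P1 points are kept).
XML = """<DocText>
<WithQuads>
    <Page pageNumber="3">
        <Word>
            July
            <Quad>
                <P1 X="84" Y="711.25" />
            </Quad>
        </Word>
        <Word>
            30,
            <Quad>
                <P1 X="104.812" Y="711.25" />
            </Quad>
        </Word>
    </Page>
</WithQuads>
</DocText>"""

root = ET.fromstring(XML)


def child_text_nodes(elem):
    # ElementTree stores the text before the first child in elem.text
    # and the text after each child in that child's .tail.
    nodes = [elem.text] + [child.tail for child in elem]
    return [t for t in nodes if t is not None]


def normalize_space(s):
    # Approximates XPath normalize-space(): trim the ends and collapse
    # whitespace runs. (str.split() treats a few more characters as
    # whitespace than XPath does.)
    return " ".join(s.split())


def select_words(root, page_number, wanted, x_test):
    # Rough equivalent of:
    # /DocText/WithQuads/Page[@pageNumber='3']/Word[
    #     text()[normalize-space(.)='July'] and Quad/P1/@X > 90]
    words = root.findall(f"WithQuads/Page[@pageNumber='{page_number}']/Word")
    return [
        w for w in words
        if any(normalize_space(t) == wanted for t in child_text_nodes(w))
        and any(x_test(float(p1.get("X"))) for p1 in w.findall("Quad/P1"))
    ]


july = root.find("WithQuads/Page/Word")
exact = any(t == "July" for t in child_text_nodes(july))  # raw comparison
gt90 = select_words(root, "3", "July", lambda x: x > 90)
lt90 = select_words(root, "3", "July", lambda x: x < 90)

print(exact, len(gt90), len(lt90))
```

The raw comparison fails because the text node is "July" surrounded by newlines and indentation. After normalizing, the "July" Word is selected for X < 90 and nothing is selected for X > 90, which is the result the question expected.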
