Python: How to find an element by its text with lxml?

Question

Asked by user1973386

Assume we have the following html:

<html>
    <body>
        <a href="/1234.html">TEXT A</a>
        <a href="/3243.html">TEXT B</a>
        <a href="/7445.html">TEXT C</a>
    </body>
</html>

How do I make it find the element "a", which contains "TEXT A"?

So far I've got:

root = lxml.html.document_fromstring(the_html_above)
e = root.find('.//a')

I've tried:

e = root.find('.//a[@text="TEXT A"]')

but that didn't work, as the "a" tags have no attribute "text".

Is there any way I can solve this in a similar fashion to what I've tried?

Answer 1

Accepted answer by unutbu

You are very close. Use text()= rather than @text (which indicates an attribute), and call xpath() rather than find(), because find() only supports a limited subset of XPath.

e = root.xpath('.//a[text()="TEXT A"]')

Or, if you know only that the text contains "TEXT A",

e = root.xpath('.//a[contains(text(),"TEXT A")]')

Or, if you know only that the text starts with "TEXT A",

e = root.xpath('.//a[starts-with(text(),"TEXT A")]')

See the docs for more on the available string functions.

For example,

import lxml.html as LH

text = '''\
<html>
    <body>
        <a href="/1234.html">TEXT A</a>
        <a href="/3243.html">TEXT B</a>
        <a href="/7445.html">TEXT C</a>
    </body>
</html>'''

root = LH.fromstring(text)
e = root.xpath('.//a[text()="TEXT A"]')
print(e)

yields

[<Element a at 0xb746d2cc>]
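
If the link text may carry stray whitespace, an exact text()= comparison can miss it. The following is a minimal sketch, not part of the original answer, that compares the exact match with normalize-space() and contains(). The extra spaces around "TEXT A" in the sample markup are an assumption added for illustration.

```python
import lxml.html as LH

# Sample markup modelled on the question. The padding around "TEXT A"
# is an assumption, added to show how whitespace affects matching.
html = '''\
<html>
    <body>
        <a href="/1234.html">  TEXT A </a>
        <a href="/3243.html">TEXT B</a>
        <a href="/7445.html">TEXT C</a>
    </body>
</html>'''

root = LH.fromstring(html)

# Exact comparison: fails here because of the surrounding spaces.
exact = root.xpath('.//a[text()="TEXT A"]')

# normalize-space() trims and collapses whitespace before comparing.
trimmed = root.xpath('.//a[normalize-space(text())="TEXT A"]')

# contains() does a substring test, so it matches all three links.
partial = root.xpath('.//a[contains(text(), "TEXT")]')

print(exact)
print([a.get('href') for a in trimmed])
print([a.get('href') for a in partial])
```

Note that xpath() always returns a list, so take the first item (after checking that the list is not empty) when a single element is wanted.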

Answer 2

Answer by ToonAlfrink

Another way that looks more straightforward to me:

import lxml.html

results = []
root = lxml.html.fromstring(the_html_above)
for tag in root.iter():
    # tag.text is None for elements that have no leading text
    if tag.text and "TEXT A" in tag.text:
        results.append(tag)
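
One caveat with the loop above: tag.text holds only the text that comes before an element's first child, so a link whose text is partly wrapped in another tag would be missed. Below is a hedged sketch of two alternatives, text_content() and an XPath test against the element's full string value. The nested <b> markup is a made-up example, not taken from the question.

```python
import lxml.html

# Hypothetical markup: the first link splits its text across a <b> child.
html = ('<div>'
        '<a href="/1.html"><b>TEXT</b> A</a>'
        '<a href="/2.html">TEXT B</a>'
        '</div>')
root = lxml.html.fromstring(html)

# a.text is None for the first link, because its text starts inside <b>.
by_text = [a for a in root.iter('a') if a.text and 'TEXT A' in a.text]

# text_content() joins the text of the element and all its descendants.
by_content = [a for a in root.iter('a') if 'TEXT A' in a.text_content()]

# In XPath, "." inside a string function is the element's full string value.
by_xpath = root.xpath('.//a[contains(., "TEXT A")]')

print(by_text)
print([a.get('href') for a in by_content])
print([a.get('href') for a in by_xpath])
```

Passing a tag name to iter(), as in root.iter('a'), also limits the search to links instead of visiting every element in the tree.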
