Python: Using Beautiful Soup to find the next occurrence of a tag and its enclosed text

Question

Asked by PSeUdocode

I'm trying to parse the text between <blockquote> tags. When I call soup.blockquote.get_text(), I get the result I want for the first blockquote in the HTML file. How do I find the following <blockquote> tags in the file, one after another? Maybe I'm just tired and can't find it in the documentation.

Example HTML file:

<html>
<head>header
</head>
<blockquote>I can get this text
</blockquote>
<p>eiaoiefj</p>
<blockquote>trying to capture this next
</blockquote>
<p></p><strong>do not capture this</strong>
<blockquote>
capture this too but separately after "capture this next"
</blockquote>
</html>

The simple Python code:

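from bs4 import BeautifulSoup

# Name the parser explicitly so the result does not depend on which parsers are installed.
with open("example.html") as html_doc:
    soup = BeautifulSoup(html_doc, "html.parser")

# Prints the text of the first <blockquote> only.
print(soup.blockquote.get_text())
# how to get the next blockquote???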

Answer 1

Accepted answer by falsetru

Use find_next_sibling. (If the tag you want is not a sibling, use find_next instead.)

>>> html = '''
... <html>
... <head>header
... </head>
... <blockquote>blah blah
... </blockquote>
... <p>eiaoiefj</p>
... <blockquote>capture this next
... </blockquote>
... <p></p><strong>don't capture this</strong>
... <blockquote>
... capture this too but separately after "capture this next"
... </blockquote>
... </html>
... '''

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(html, 'html.parser')
>>> quote1 = soup.blockquote
>>> quote1.text
'blah blah\n'
>>> quote2 = quote1.find_next_sibling('blockquote')
>>> quote2.text
'capture this next\n'
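
To collect every blockquote rather than just the second one, the same idea can be repeated in a loop. The following is a minimal sketch, not part of the original answer. It assumes the example HTML from the question and the built-in "html.parser" parser, and the variable names quotes and quote are only illustrative. It moves from one blockquote to the next with find_next until none is left, then checks that find_all returns the same texts in a single call.

```python
from bs4 import BeautifulSoup

# Example document taken from the question.
html = """
<html>
<head>header
</head>
<blockquote>I can get this text
</blockquote>
<p>eiaoiefj</p>
<blockquote>trying to capture this next
</blockquote>
<p></p><strong>do not capture this</strong>
<blockquote>
capture this too but separately after "capture this next"
</blockquote>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# Step from one <blockquote> to the next until find_next returns None.
quotes = []
quote = soup.blockquote
while quote is not None:
    quotes.append(quote.get_text().strip())
    quote = quote.find_next("blockquote")

# find_all gathers every match in document order in one call.
all_at_once = [q.get_text().strip() for q in soup.find_all("blockquote")]

for text in quotes:
    print(text)
```

find_next_sibling only looks at tags that share the same parent, while find_next searches everything that comes later in the document, so the loop above should also work if the blockquotes are nested inside different containers. When all that is needed is the full list, find_all is probably the simpler choice.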
