Python: how to find all <li> elements within a <ul> of a specific class?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/17246963/

Time: 2020-08-19 00:52:36  Source: igfitidea

How to find all <li>'s within a specific <ul> class?

python python-2.7 beautifulsoup

Asked by user1063287

Environment:

Beautiful Soup 4

Python 2.7.5

Logic:

Use find_all to get the <li> instances that are within a <ul> with a class of my_class, e.g.:

<ul class='my_class'>
<li>thing one</li>
<li>thing two</li>
</ul>

Clarification: I just want to get the text between the <li> tags.

Python Code:

(The find_all call below is not correct; I am just including it for context.)

from bs4 import BeautifulSoup, Comment
import re

# open original file
fo = open('file.php', 'r')
# convert to string
fo_string = fo.read()
# close original file
fo.close()
# create beautiful soup object from fo_string
bs_fo_string = BeautifulSoup(fo_string, "lxml")
# get rid of html comments
my_comments = bs_fo_string.findAll(text=lambda text:isinstance(text, Comment))
[my_comment.extract() for my_comment in my_comments]

my_li_list = bs_fo_string.find_all('ul', 'my_class')

print my_li_list

Accepted answer by TerryA

This?

>>> html = """<ul class='my_class'>
... <li>thing one</li>
... <li>thing two</li>
... </ul>"""
>>> from bs4 import BeautifulSoup as BS
>>> soup = BS(html)
>>> for ultag in soup.find_all('ul', {'class': 'my_class'}):
...     for litag in ultag.find_all('li'):
...             print litag.text
... 
thing one
thing two

Explanation:

soup.find_all('ul', {'class': 'my_class'}) finds all the ul tags with a class of my_class.

We then find all the li tags inside each of those ul tags, and print the text content of each li tag.
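For readers on Python 3, the same two-step search can be sketched as follows. This is a minimal sketch, not the answer's original code: it assumes Beautiful Soup 4 is installed, uses the built-in "html.parser" so lxml is not required, and adds a second, hypothetical <ul class='other_class'> only to show that the class filter excludes it.

```python
from bs4 import BeautifulSoup

# Sample markup from the question, plus a hypothetical second list
# (other_class) added here only to show that it is filtered out.
html = """<ul class='my_class'>
<li>thing one</li>
<li>thing two</li>
</ul>
<ul class='other_class'>
<li>not wanted</li>
</ul>"""

# "html.parser" ships with Python, so no extra parser install is needed.
soup = BeautifulSoup(html, "html.parser")

li_texts = []
# Step 1: find every <ul> whose class is my_class.
# class_ (with a trailing underscore) is the keyword form of {'class': ...}.
for ul_tag in soup.find_all("ul", class_="my_class"):
    # Step 2: find every <li> inside that <ul> and keep only its text.
    for li_tag in ul_tag.find_all("li"):
        li_texts.append(li_tag.get_text(strip=True))

print(li_texts)  # ['thing one', 'thing two']
```

Collecting the text into a list rather than printing inside the loop makes the result easier to reuse later in a script.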

Answer by sberry

This does the trick with BeautifulSoup 3; I don't have version 4 on this machine.

>>> [li.string for li in bs_fo_string.find('ul', {'class': 'my_class'}).findAll('li')]
[u'thing one', u'thing two']

The idea is to search first for the ul with 'my_class' class, and then findAll of the li's within that ul.

If you had additional ul's with the same class, you might want to use findAll for the ul search as well, and change the list comprehension to a nested one.
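The nested list comprehension mentioned above might look like this. It is a sketch under a few assumptions: it is written for Python 3 with Beautiful Soup 4 (so it uses find_all rather than BeautifulSoup 3's findAll), it uses the built-in "html.parser", and the sample markup with two lists of the same class is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two separate <ul> elements sharing the same class.
html = """
<ul class="my_class"><li>thing one</li><li>thing two</li></ul>
<ul class="my_class"><li>thing three</li></ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Outer loop: every <ul> with class my_class.
# Inner loop: every <li> inside that particular <ul>.
all_li_texts = [
    li.get_text()
    for ul in soup.find_all("ul", {"class": "my_class"})
    for li in ul.find_all("li")
]

print(all_li_texts)  # ['thing one', 'thing two', 'thing three']
```

Unlike the single find call, this version does not raise an AttributeError when no matching ul exists; it simply produces an empty list.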