Python BeautifulSoup 4, findNext() function

Disclaimer: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original question: http://stackoverflow.com/questions/15771424/

Time: 2020-08-18 20:57:10  Source: igfitidea

BeautifulSoup 4, findNext() function

Tags: python, python-2.7, beautifulsoup

Asked by nutship

I'm playing with BeautifulSoup 4 and I have this html code:

</tr>
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>

I want to match both values between the <td> tags, so here 14 and 7.

I tried this:

giraffe = soup.find(text='Giraffe').findNext('td').text

but this only matches 14. How can I match both values with this function?

Accepted answer by unutbu

Use find_all instead of findNext:

import bs4 as bs
content = '''\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>'''
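soup = bs.BeautifulSoup(content, 'html.parser')  # name the parser explicitly to avoid a warning

# Find the <td> containing 'Giraffe', go up to its <tr>, then loop over every <td> in that row
for td in soup.find('td', text='Giraffe').parent.find_all('td'):
    print(td.text)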

yields

Giraffe
14
7
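
Since find_all on the row also returns the label cell, here is a minimal sketch of keeping only the two numbers. It assumes the label cell is always the first <td> in the row and that the remaining cells hold integers; the lookup by id="freistoesse" is a substitute for the text search above, not part of the original answer.

```python
from bs4 import BeautifulSoup

content = '''\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>'''
soup = BeautifulSoup(content, 'html.parser')

# Locate the label cell by its id, then climb to the enclosing <tr>.
row = soup.find('td', id='freistoesse').parent

# find_all('td') returns every cell in the row; the slice drops the label cell.
values = [int(td.text) for td in row.find_all('td')[1:]]
print(values)  # [14, 7]
```

The slice is the only thing separating the label from the data, so this breaks if the row layout changes.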


Or, you could use find_next_siblings (also known as fetchNextSiblings):

for td in soup.find(text='Giraffe').parent.find_next_siblings():
    print(td.text)

yields

14
7
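
The sibling approach can be wrapped in a small helper. This is only a sketch: values_after is a hypothetical name, and it uses the string= keyword, which newer Beautiful Soup releases (4.4.0 and later) accept in place of text=.

```python
from bs4 import BeautifulSoup

content = '''\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>'''
soup = BeautifulSoup(content, 'html.parser')


def values_after(soup, label):
    """Return the text of every <td> that follows the cell whose text is `label`."""
    cell = soup.find('td', string=label)
    if cell is None:
        # No cell with that exact text: report nothing instead of raising.
        return []
    # Passing 'td' restricts the siblings to table cells.
    return [td.get_text(strip=True) for td in cell.find_next_siblings('td')]


print(values_after(soup, 'Giraffe'))  # ['14', '7']
print(values_after(soup, 'Zebra'))    # []
```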


Explanation:

Note that soup.find(text='Giraffe') returns a NavigableString.

In [30]: soup.find(text='Giraffe')
Out[30]: u'Giraffe'

To get the associated td tag, use

In [31]: soup.find('td', text='Giraffe')
Out[31]: <td id="freistoesse">Giraffe</td>

or

In [32]: soup.find(text='Giraffe').parent
Out[32]: <td id="freistoesse">Giraffe</td>

Once you have the td tag, you could use find_next_siblings:

In [35]: soup.find(text='Giraffe').parent.find_next_siblings()
Out[35]: [<td>14</td>, <td>7</td>]
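
To make the difference between the lookups concrete, the sketch below compares them on a slightly larger table. The second row (Zebra) is invented for illustration, and string= is the newer spelling of text= (Beautiful Soup 4.4.0 and later).

```python
from bs4 import BeautifulSoup, NavigableString, Tag

content = '''\
<table>
<tr><td id="freistoesse">Giraffe</td><td>14</td><td>7</td></tr>
<tr><td>Zebra</td><td>3</td><td>9</td></tr>
</table>'''
soup = BeautifulSoup(content, 'html.parser')

text_node = soup.find(string='Giraffe')  # the text inside the cell
cell = text_node.parent                  # the <td> that contains it

print(isinstance(text_node, NavigableString))  # True
print(isinstance(cell, Tag))                   # True

# find_next returns only the first match after the starting point.
first = text_node.find_next('td').text
print(first)  # 14

# find_all_next keeps walking the rest of the document, past the end of the row.
following = [td.text for td in text_node.find_all_next('td')]
print(following)  # ['14', '7', 'Zebra', '3', '9']

# find_next_siblings stays inside the same parent, here the <tr>.
siblings = [td.text for td in cell.find_next_siblings('td')]
print(siblings)  # ['14', '7']
```

This is why the sibling lookup suits the question: it stops at the end of the row, while find_all_next would also pick up cells from later rows.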


PS. BeautifulSoup has added method names that use underscores instead of CamelCase. They do the same thing but conform to the PEP 8 style guide recommendations. Thus, prefer find_next_siblings over fetchNextSiblings.
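
A quick way to check that the two spellings behave the same is sketched below. It assumes the installed Beautiful Soup still ships the old fetchNextSiblings name; newer releases may emit a DeprecationWarning for it, so the call is wrapped to silence warnings.

```python
import warnings

from bs4 import BeautifulSoup

content = '''\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>'''
soup = BeautifulSoup(content, 'html.parser')
cell = soup.find('td', id='freistoesse')

with warnings.catch_warnings():
    # Assumption: the old name may trigger a DeprecationWarning in newer releases.
    warnings.simplefilter('ignore')
    old_style = cell.fetchNextSiblings('td')

new_style = cell.find_next_siblings('td')

# Both calls should return the same two cells.
print(old_style == new_style)
```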