Python: Deleting a div with a specific class using BeautifulSoup

Disclaimer: This page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/32063985/

Date: 2020-08-19 10:56:54  Source: igfitidea

Deleting a div with a particular class using BeautifulSoup

Tags: python, python-2.7, beautifulsoup

Asked by Riken Shah

I want to delete a specific div from a soup object.
I am using Python 2.7 and bs4.

According to the documentation, we can use div.decompose().

But that would delete all the divs. How can I delete a div with a specific class?

Accepted answer by lemonhead

Sure, you can just select, find, or find_all the divs of interest in the usual way, and then call decompose() on those divs.

For instance, if you want to remove all divs with class sidebar, you could do that with

# replace with `soup.findAll` if you are using BeautifulSoup3
for div in soup.find_all("div", {'class':'sidebar'}): 
    div.decompose()

If you want to remove a div with a specific id, say main-content, you can do that with

soup.find('div', id="main-content").decompose()
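
As a minimal, self-contained sketch of the two approaches above (the sample HTML and the id no-such-id are made up for illustration, and bs4 with the built-in html.parser is assumed), note that find() returns None when nothing matches, so a guard before decompose() may be worth adding:

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, used only for illustration
html = """
<body>
  <div id="main-content">main</div>
  <div class="sidebar">remove me</div>
  <div class="sidebar wide">remove me too</div>
  <div class="content">keep me</div>
</body>
"""
soup = BeautifulSoup(html, "html.parser")

# class_="sidebar" matches any div whose class list contains "sidebar",
# including <div class="sidebar wide">
for div in soup.find_all("div", class_="sidebar"):
    div.decompose()

# find() returns None when nothing matches, so guard before calling decompose()
target = soup.find("div", id="no-such-id")
if target is not None:
    target.decompose()

# Label each remaining div by its id, or by its first class if it has no id
remaining = [d.get("id") or d["class"][0] for d in soup.find_all("div")]
print(remaining)  # ['main-content', 'content']
```

Without the guard, calling decompose() on the result of a find() that matched nothing raises an AttributeError.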

Answer by 3ppps

    >>> from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3; with bs4 use: from bs4 import BeautifulSoup
    >>> soup = BeautifulSoup('<body><div>1</div><div class="comment"><strong>2</strong></div></body>')
    >>> for div in soup.findAll('div', 'comment'):
    ...   div.extract()
    ... 
    <div class="comment"><strong>2</strong></div>
    >>> soup
    <body><div>1</div></body>
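
The session above uses extract() rather than decompose(). As a rough bs4 equivalent (assuming bs4 and html.parser; the variable name removed is only illustrative), extract() detaches the tag from the tree and returns it, which is useful if the removed markup is still needed, whereas decompose() destroys the tag:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    '<body><div>1</div><div class="comment"><strong>2</strong></div></body>',
    "html.parser",
)

# extract() removes each matching tag from the tree and returns it,
# so the removed elements can still be inspected or reused afterwards
removed = [div.extract() for div in soup.find_all("div", class_="comment")]

print(soup)        # <body><div>1</div></body>
print(removed[0])  # <div class="comment"><strong>2</strong></div>
```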

Answer by Vineet Kumar Doshi

This will help you:

from bs4 import BeautifulSoup

markup = '<a>This is not div <div class="1">This is div 1</div><div class="2">This is div 2</div></a>'
soup = BeautifulSoup(markup, "html.parser")

# Remove the first div whose class is "2"
soup.find('div', class_='2').decompose()

# print() with a single argument works on both Python 2.7 and Python 3
print(soup)

Output:

<a>This is not div <div class="1">This is div 1</div></a>

Let me know if it helps

Answer by david euler

Hope it helps:

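from bs4 import BeautifulSoup

markup = '<a>This is not div <div class="1">This is div 1</div><div class="2">This is div 2</div></a>'
soup = BeautifulSoup(markup, "html.parser")

# A class name that starts with a digit is not a valid CSS identifier, so newer
# bs4 releases may reject 'div.1'; an attribute selector avoids the problem.
for tag in soup.select('div[class~="1"]'):
    tag.decompose()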

print(soup)
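
If the same pattern is needed in several places, the select-then-decompose loop could be wrapped in a small function. The sketch below is one possible shape, not part of the original answer: remove_by_selector is a hypothetical helper, and the markup and the class names ad and note are invented for the example.

```python
from bs4 import BeautifulSoup


def remove_by_selector(soup, selector):
    """Hypothetical helper: decompose every element matching a CSS selector.

    Returns the number of elements that were matched and removed.
    """
    matches = soup.select(selector)
    for tag in matches:
        tag.decompose()
    return len(matches)


# Invented sample markup, for illustration only
markup = (
    '<a>This is not div <div class="ad">ad 1</div>'
    '<div class="note">note</div><div class="ad big">ad 2</div></a>'
)
soup = BeautifulSoup(markup, "html.parser")

removed_count = remove_by_selector(soup, "div.ad")
print(removed_count)  # 2
print(soup)           # <a>This is not div <div class="note">note</div></a>
```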