Python Beautiful Soup: get just the value inside the tag

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/22003302/

Time: 2020-08-19 00:03:56  Source: igfitidea

beautiful soup just get the value inside the tag

python beautifulsoup

Asked by user1357015

The following command:

volume = soup.findAll("span", {"id": "volume"})[0]

gives:

<span class="gr_text1" id="volume">16,103.3</span>

when I issue a print(volume).

How do I get just the number?

Accepted answer by isedev

Extract the string from the element:

volume = soup.findAll("span", {"id": "volume"})[0].string
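
As a minimal sketch of the same idea, the snippet below rebuilds the span from the question and reads its `.string`. The final conversion to `float` is an addition, not part of the original answer, and it assumes the comma is a thousands separator.

```python
from bs4 import BeautifulSoup

# The element from the question, used here as sample input.
html = '<span class="gr_text1" id="volume">16,103.3</span>'
soup = BeautifulSoup(html, "html.parser")

# find() returns the first match directly, so no [0] indexing is needed.
volume_tag = soup.find("span", {"id": "volume"})

# .string is the tag's single text child (a NavigableString); str() makes it a plain string.
volume_text = str(volume_tag.string)

# Assumption: "," is a thousands separator, so it can be dropped before parsing.
volume = float(volume_text.replace(",", ""))

print(volume_text)
print(volume)
```

Note that `find()` returns `None` when nothing matches, so real code may want to check for that before reading `.string`.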

Answer by falsetru

Using a CSS selector:

>>> soup.select('span#volume')[0].text
u'16,103.3'
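
A script version of the selector approach might look like the sketch below. It uses `select_one()`, which returns only the first match, instead of `select(...)[0]`. The `u'...'` prefix in the session above comes from Python 2; on Python 3 the result should be a plain `str`.

```python
from bs4 import BeautifulSoup

# The element from the question, used here as sample input.
html = '<span class="gr_text1" id="volume">16,103.3</span>'
soup = BeautifulSoup(html, "html.parser")

# select_one() returns the first element matching the CSS selector, or None.
tag = soup.select_one("span#volume")

# get_text(strip=True) returns the tag's text with surrounding whitespace removed.
text = tag.get_text(strip=True) if tag is not None else None

print(text)
```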

Answer by Sanjay

Just to add: I also found that .string doesn't do well when there is a <br> in the text.

For example:

<div class="Lines">
  <span> First Line <br> Second Line <br> Third Line </span>
</div>

If we do soup.find("div", attrs={"class": "Lines"}).span.string, we get None.

But with soup.find("div", attrs={"class": "Lines"}).span.text, we get:

First Line
Second Line
Third Line

I think .string gives a NavigableString object and .text gives a unicode object.
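
The behaviour described in this answer can be reproduced with the sketch below, which parses the same markup with the built-in `html.parser`. The calls to `get_text(separator="\n", strip=True)` and `stripped_strings` are additions to the answer. They are one way, under these assumptions, to get one text fragment per line.

```python
from bs4 import BeautifulSoup

# The markup from the answer above.
html = """
<div class="Lines">
  <span> First Line <br> Second Line <br> Third Line </span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
span = soup.find("div", attrs={"class": "Lines"}).span

# .string is None here because the span has several children
# (three text nodes separated by two <br> tags), not a single string.
print(span.string)

# get_text() joins every text node; separator and strip are optional arguments.
lines = span.get_text(separator="\n", strip=True)
print(lines)

# stripped_strings yields each text fragment with surrounding whitespace removed.
parts = list(span.stripped_strings)
print(parts)
```

With no arguments, `.text` and `get_text()` join the fragments with an empty separator, so passing `separator` is what puts each fragment on its own line.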