Python: Beautiful Soup, get just the value inside the tag
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/22003302/
beautiful soup just get the value inside the tag
Asked by user1357015
The following command:
volume = soup.findAll("span", {"id": "volume"})[0]
gives:
<span class="gr_text1" id="volume">16,103.3</span>
when I issue a print(volume).
How do I get just the number?
Accepted answer by isedev
Extract the string from the element:
volume = soup.findAll("span", {"id": "volume"})[0].string
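A minimal, self-contained sketch of the same idea, assuming the page contains exactly the span shown in the question. The `html` string, the use of `find` instead of `findAll(...)[0]`, and the conversion to `float` are illustrative additions, not part of the original answer:

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; a real page would be fetched elsewhere.
html = '<span class="gr_text1" id="volume">16,103.3</span>'
soup = BeautifulSoup(html, "html.parser")

# find() returns the first match, equivalent to findAll(...)[0] here.
volume_tag = soup.find("span", {"id": "volume"})

# .string is the single text node inside the tag.
volume_text = volume_tag.string

# Assumption: the value uses "," as a thousands separator.
volume_number = float(volume_text.replace(",", ""))

print(volume_text)
print(volume_number)
```

Note that `find` returns `None` when nothing matches, so real code would probably want to check `volume_tag` before reading `.string`.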
Answer by falsetru
Answer by cinv3
Answer by Sanjay
Just to add, I also found that .string doesn't work well when there is a <br> in the text.
For example:
<div class = "Lines">
<span> First Line <br> Second Line <br> Third Line </span>
</div>
If we do soup.find("div", attrs={"class": "Lines"}).span.string, we get None.
But with soup.find("div", attrs={"class": "Lines"}).span.text, we get:
First Line Second Line Third Line
I think .string gives a NavigableString object and .text gives a unicode object. As far as I can tell, .string is None here because the <br> tags leave the span with more than one child.
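The difference can be reproduced with a short sketch, assuming the same markup as above and the built-in "html.parser" backend. The variable names and the use of `stripped_strings` to recover the individual lines are illustrative additions:

```python
from bs4 import BeautifulSoup

html = """
<div class="Lines">
<span> First Line <br> Second Line <br> Third Line </span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
span = soup.find("div", attrs={"class": "Lines"}).span

# The span has several children (text nodes and <br> tags), so .string is None.
print(span.string)

# .text concatenates every text node under the tag, whitespace included.
print(span.text)

# stripped_strings yields each text node with surrounding whitespace removed.
lines = list(span.stripped_strings)
print(lines)
```

If the pieces are needed separately, `stripped_strings` may be more convenient than splitting the result of `.text` by hand.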

