Python: AttributeError: 'NoneType' object has no attribute 'findNext'
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/21421417/
Asked by user3247140
I am trying to scrape a website with BeautifulSoup but am having a problem. I was following a tutorial done in Python 2.7; it had exactly the same code and no problems.
import urllib.request
from bs4 import *
htmlfile = urllib.request.urlopen("http://en.wikipedia.org/wiki/Steve_Jobs")
htmltext = htmlfile.read()
soup = BeautifulSoup(htmltext)
title = (soup.title.text)
body = soup.find("Born").findNext('td')
print (body.text)
If I try to run the program, I get:
Traceback (most recent call last):
  File "C:\Users\USER\Documents\Python Programs\World Population.py", line 13, in <module>
    body = soup.find("Born").findNext('p')
AttributeError: 'NoneType' object has no attribute 'findNext'
Is this a problem with Python 3, or am I just too naive?
Accepted answer by paxdiablo
The find and find_all methods do not search for arbitrary text in the document; they search for HTML tags. The documentation makes that clear (my italics):
Pass in a value for name and you'll tell Beautiful Soup to only consider tags with certain names. Text strings will be ignored, as will tags whose names that don't match. This is the simplest usage:
soup.find_all("title")
# [<title>The Dormouse's story</title>]
That's why your soup.find("Born") is returning None, and hence why it complains about NoneType (the type of None) having no findNext() method.
That page you reference contains (at the time this answer was written) eight copies of the word "born", none of which are tags.
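The difference between a tag name and a piece of text can be seen in a small sketch. The HTML fragment here is a hypothetical stand-in modelled on the page's infobox row, not the real Wikipedia markup, and "html.parser" is simply the parser bundled with Python. The string= argument assumes Beautiful Soup 4.4.0 or later (older releases call it text=).

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the infobox row discussed in this answer.
html = """
<table>
  <tr>
    <th scope="row">Born</th>
    <td><span class="nickname">Steven Paul Jobs</span></td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# "Born" is text, not a tag name, so there is no <Born> element to find.
no_such_tag = soup.find("Born")
print(no_such_tag)

# Searching by tag name returns a Tag object.
header = soup.find("th")
print(header.name)

# To match on the text inside a tag, pass string= alongside the tag name.
born_header = soup.find("th", string="Born")
print(born_header)
```

Calling a method such as findNext on no_such_tag would raise the same AttributeError as in the question, because the lookup has already returned None.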
Looking at the HTML source for that page, you'll find the best option may be to look for the correct span:
<th scope="row" style="text-align:left;">Born</th>
<td><span class="nickname">Steven Paul Jobs</span><br />
<span style="display:none">(<span class="bday">1955-02-24</span>)</span>February 24, 1955<br />
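One way to reach those spans is to find the header cell by its text and then step to the data cell that follows it. This is only a sketch: the fragment is a hypothetical copy of the lines quoted above, with the closing tags and the surrounding table added by me so that it parses cleanly, and the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment based on the quoted lines; the <table>, <tr> and
# closing </td> tags are assumptions added for this sketch.
html = """
<table><tr>
<th scope="row" style="text-align:left;">Born</th>
<td><span class="nickname">Steven Paul Jobs</span><br />
<span style="display:none">(<span class="bday">1955-02-24</span>)</span>February 24, 1955<br />
</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the header cell by its text, then move to the next <td> in the document.
born_header = soup.find("th", string="Born")
born_cell = born_header.find_next("td") if born_header is not None else None

# Inside that cell, pick the spans by class name.
nickname = born_cell.find("span", class_="nickname") if born_cell is not None else None
bday = born_cell.find("span", class_="bday") if born_cell is not None else None

print(nickname.text if nickname is not None else "nickname not found")
print(bday.text if bday is not None else "bday not found")
```

find_next is the current spelling of findNext; the explicit None checks keep a missing element from turning into an AttributeError.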
Answer by Steinar Lima
The find method looks for tags, not text. To find the name, birthday and birthplace, you would have to look up the span elements with the corresponding class name, and access the text attribute of each item:
import urllib.request
from bs4 import BeautifulSoup
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(urllib.request.urlopen("http://en.wikipedia.org/wiki/Steve_Jobs"), "html.parser")
title = soup.title.text
name = soup.find('span', {'class': 'nickname'}).text
bday = soup.find('span', {'class': 'bday'}).text
birthplace = soup.find('span', {'class': 'birthplace'}).text
print(name)
print(bday)
print(birthplace)
Output:
Steven Paul Jobs
1955-02-24
San Francisco, California, US
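Page markup changes over time, so any of these class names may stop matching, and find would then return None again. A defensive variant might look like the sketch below. span_text is a hypothetical helper, not part of Beautiful Soup, and the fragment is made up for illustration.

```python
from bs4 import BeautifulSoup

def span_text(soup, class_name):
    """Return the text of the first <span> with this class, or None if absent.

    Hypothetical helper for this sketch; it is not a bs4 function.
    """
    tag = soup.find("span", class_=class_name)
    return tag.get_text(strip=True) if tag is not None else None

# Hypothetical fragment: it has a bday span but no birthplace span.
html = '<td><span class="bday">1955-02-24</span></td>'
soup = BeautifulSoup(html, "html.parser")

print(span_text(soup, "bday"))
print(span_text(soup, "birthplace"))
```

Returning None for a missing element lets the caller decide what to do, instead of failing with the AttributeError from the question.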
PS: You don't have to call read on the result of urlopen; BeautifulSoup accepts file-like objects.
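A quick way to check this without a network connection is to hand BeautifulSoup an in-memory file object. In this sketch io.BytesIO stands in for the response that urlopen returns (an assumption made for illustration); both expose a read() method.

```python
import io
from bs4 import BeautifulSoup

markup = b"<html><head><title>Steve Jobs</title></head></html>"

# io.BytesIO is a stand-in for the urlopen response: an object with read().
fake_response = io.BytesIO(markup)

# Passing the file-like object directly...
soup_from_file = BeautifulSoup(fake_response, "html.parser")
# ...parses the same document as passing the bytes themselves.
soup_from_bytes = BeautifulSoup(markup, "html.parser")

print(soup_from_file.title.text)
print(soup_from_bytes.title.text)
```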

