Python - AttributeError: 'NoneType' object has no attribute 'findAll'

Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/18065768/

Time: 2020-08-19 09:50:41  Source: igfitidea

Tags: python, attributes, findall, nonetype

Asked by Seymour Ducats

I have written my first bit of python code to scrape a website.

import csv
import urllib2
from BeautifulSoup import BeautifulSoup

c = csv.writer(open("data.csv", "wb"))
soup = BeautifulSoup(urllib2.urlopen('http://www.kitco.com/kitco-gold-index.html').read())
table = soup.find('table', id="datatable_main")
rows = table.findAll('tr')[1:]

for tr in rows:
   cols = tr.findAll('td')
   text = []
   for td in cols:
       text.append(td.find(text=True))
   c.writerow(text)

When I test it locally in my IDE, PyCharm, it works fine, but when I run it on my server, which runs CentOS, I get the following error:

domainname.com [~/public_html/livegold]# python scraper.py
Traceback (most recent call last):
  File "scraper.py", line 8, in <module>
    rows = table.findAll('tr')[:]
AttributeError: 'NoneType' object has no attribute 'findAll'

I'm guessing a module is missing on the server. I've been stuck on this for two days, so any help would be greatly appreciated! :)

Accepted answer by Wessie

You are ignoring any errors that could occur in urllib2.urlopen. If for some reason fetching that page fails on your server but not when you test locally, you are effectively passing an empty string ('') or a page you don't expect (such as a 404 page) to BeautifulSoup.

That in turn makes soup.find('table', id="datatable_main") return None, since the document is not what you expect.

You should either make sure you can get the page you are trying to get on your server, or handle exceptions properly.
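
A minimal sketch of what handling these failures could look like, assuming Python 3 (where urllib2 became urllib.request). The helper names fetch_page and require are hypothetical and not part of any library, and the opener parameter is there only so the function can be tried without network access.

```python
import urllib.error
import urllib.request


def fetch_page(url, opener=urllib.request.urlopen, timeout=10):
    """Return the page body as bytes, or None if the request failed.

    `opener` is a hypothetical hook (defaults to urlopen) for offline testing.
    """
    try:
        response = opener(url, timeout=timeout)
        return response.read()
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 403.
        print("HTTP error %s while fetching %s" % (exc.code, url))
    except urllib.error.URLError as exc:
        # No usable answer: DNS failure, refused or blocked connection.
        print("Could not reach %s: %s" % (url, exc.reason))
    return None


def require(found, description):
    """Raise a readable error instead of letting None reach .findAll()."""
    if found is None:
        raise ValueError("%s not found; the page is not what was expected"
                         % description)
    return found


# Intended use with the question's code (BeautifulSoup import assumed):
# html = fetch_page('http://www.kitco.com/kitco-gold-index.html')
# if html is not None:
#     soup = BeautifulSoup(html)
#     table = require(soup.find('table', id="datatable_main"),
#                     'table with id "datatable_main"')
#     rows = table.findAll('tr')[1:]
```

With a guard like this, a blocked request or an unexpected page produces a message that names the actual problem rather than an AttributeError on None.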

Answer by RichieHindle

There is no table with id datatable_main in the page that the script read.

Try printing the returned page to the terminal - perhaps your script is failing to contact the web server? Sometimes hosting services prevent outgoing HTTP connections.
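
One way to do that check, sketched under the assumption of Python 3: rather than dumping the whole page, print its length, whether the expected id appears at all, and a short preview. The function name summarize_page and its parameters are hypothetical, chosen for illustration.

```python
def summarize_page(body, marker='datatable_main', preview_chars=300):
    """Build a short diagnostic report for a fetched page body (bytes or str)."""
    if isinstance(body, bytes):
        # Decode leniently; the goal is a readable preview, not exact text.
        body = body.decode('utf-8', errors='replace')
    lines = [
        "length: %d characters" % len(body),
        "contains %r: %s" % (marker, marker in body),
        "preview: %s" % body[:preview_chars],
    ]
    return "\n".join(lines)


# Stand-in body for demonstration; on the server this would be the result
# of urllib.request.urlopen(url).read().
print(summarize_page(b'<html><body>404 Not Found</body></html>'))
```

If the report shows an empty body, an error page, or no occurrence of the id, the problem lies in what the server received, not in a missing module.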
