Python - AttributeError: 'NoneType' object has no attribute 'findAll'

Question

Asked by Seymour Ducats

I have written my first bit of Python code to scrape a website.

import csv
import urllib2
from BeautifulSoup import BeautifulSoup

c = csv.writer(open("data.csv", "wb"))
soup = BeautifulSoup(urllib2.urlopen('http://www.kitco.com/kitco-gold-index.html').read())
table = soup.find('table', id="datatable_main")
rows = table.findAll('tr')[1:]

for tr in rows:
   cols = tr.findAll('td')
   text = []
   for td in cols:
       text.append(td.find(text=True))
   c.writerow(text)

When I test it locally in my IDE, PyCharm, it works fine, but when I run it on my server, which runs CentOS, I get the following error:

domainname.com [~/public_html/livegold]# python scraper.py
Traceback (most recent call last):
  File "scraper.py", line 8, in <module>
    rows = table.findAll('tr')[:]
AttributeError: 'NoneType' object has no attribute 'findAll'

I'm guessing a module isn't installed on the remote server. I've been stuck on this for two days, so any help would be greatly appreciated! :)

Answer 1

Accepted answer by Wessie

You are ignoring any errors that could occur in urllib2.urlopen. If for some reason the request for that page fails on your server but not when you test locally, you are effectively passing an empty string ('') or a page you don't expect (such as a 404 page) to BeautifulSoup.

This in turn makes soup.find('table', id="datatable_main") return None, since the document is not what you expect.
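
As a rough illustration of that point, the lookup can be wrapped so that a missing table produces a clear message instead of the AttributeError. This is only a sketch: table_rows is a hypothetical helper name, and soup is assumed to be the BeautifulSoup object built in the question.

```python
def table_rows(soup, table_id="datatable_main"):
    """Return the data rows of the table, or fail with a readable error.

    `soup` is assumed to be the parsed page from the question's script.
    """
    table = soup.find("table", id=table_id)
    if table is None:
        # find() returns None when nothing matches, which is what
        # triggers "'NoneType' object has no attribute 'findAll'".
        raise LookupError("no <table id=%r> in the parsed page" % table_id)
    # Skip the header row, as the original script does.
    return table.findAll("tr")[1:]
```

The loop in the question could then iterate over table_rows(soup), and a LookupError on the server would point directly at the page content rather than at BeautifulSoup.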

You should either make sure you can get the page you are trying to get on your server, or handle exceptions properly.
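
One possible way to handle the exceptions is sketched below. It is written for Python 3, where urllib2 has become urllib.request and urllib.error, so it is an adaptation rather than a drop-in patch for the question's Python 2 script. The name fetch_page and the 10 second timeout are assumptions made for the example.

```python
import urllib.error
import urllib.request


def fetch_page(url, timeout=10):
    """Return the response body as bytes, or None if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 403 or 404.
        print("HTTP error %s while fetching %s" % (exc.code, url))
    except OSError as exc:
        # URLError (DNS failure, refused or blocked connection) and
        # socket timeouts are all subclasses of OSError.
        print("Could not fetch %s: %s" % (url, exc))
    return None
```

The script would call fetch_page with the Kitco URL and stop with a message when the result is None, before handing anything to BeautifulSoup. A page that arrives with a normal status but different content raises no exception, so the result of soup.find still needs a None check.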

Answer 2

Answer by RichieHindle

There is no table with id datatable_main in the page that the script read.

Try printing the returned page to the terminal - perhaps your script is failing to contact the web server? Sometimes hosting services prevent outgoing HTTP connections.
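
A small diagnostic along these lines might help when comparing what the server receives with what you get locally. It is a sketch only: summarize_page is a hypothetical helper, the 300 character preview length is arbitrary, and the page is assumed to decode as UTF-8 (undecodable bytes are replaced rather than raising).

```python
def summarize_page(body, marker="datatable_main", preview_chars=300):
    """Describe a fetched page body (bytes or None) for printing."""
    if not body:
        return "empty response: the request probably failed or was blocked"
    text = body.decode("utf-8", errors="replace")
    # A plain substring check on the raw page, before any parsing.
    status = "present" if marker in text else "missing"
    header = "%d bytes received, marker %r %s" % (len(body), marker, status)
    # Show the start of the page so an error or block page is easy to spot.
    return header + "\n" + text[:preview_chars]
```

Printing summarize_page(...) on the raw bytes read from the URL, on both machines, should show whether the server is getting the real page, an error page, or nothing at all.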
