Python urllib2 URLError exception?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute the original authors (not me). Original question: http://stackoverflow.com/questions/1290142/

Python urllib2 URLError exception?

Tags: python, networking, urllib2

Asked by Donal.Lynch.Msc

I installed Python 2.6.2 earlier on a Windows XP machine and ran the following code:

import urllib2
import urllib

page = urllib2.Request('http://www.python.org/fish.html')
urllib2.urlopen( page )

I get the following error.

Traceback (most recent call last):
  File "C:\Python26\test3.py", line 6, in <module>
    urllib2.urlopen( page )
  File "C:\Python26\lib\urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python26\lib\urllib2.py", line 383, in open
    response = self._open(req, data)
  File "C:\Python26\lib\urllib2.py", line 401, in _open
    '_open', req)
  File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
    result = func(*args)
  File "C:\Python26\lib\urllib2.py", line 1130, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "C:\Python26\lib\urllib2.py", line 1105, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 11001] getaddrinfo failed>

Answered by mcandre

import urllib2
response = urllib2.urlopen('http://www.python.org/fish.html')
html = response.read()

You're doing it wrong.

Answered by krawyoti

Have a look in the urllib2 source, at the line specified by the traceback:

File "C:\Python26\lib\urllib2.py", line 1105, in do_open
raise URLError(err)

There you'll see the following fragment:

    try:
        h.request(req.get_method(), req.get_selector(), req.data, headers)
        r = h.getresponse()
    except socket.error, err: # XXX what error?
        raise URLError(err)

So it looks like the source is a socket error, not an HTTP-protocol-related error. Possible reasons: you are not online, you are behind a restrictive firewall, your DNS is down, ...

All of this is aside from the fact that, as mcandre pointed out, your code is wrong.

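For example, a minimal sketch (Python 2, using the same urllib2 API as above) of how you might catch the URLError and inspect the wrapped socket error:

import urllib2

try:
    response = urllib2.urlopen('http://www.python.org/')
except urllib2.URLError, err:
    # err.reason carries the underlying socket error,
    # e.g. "[Errno 11001] getaddrinfo failed" on Windows
    print 'Failed to reach the server:', err.reason
else:
    print response.read(100)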

Answered by sleblanc

Name resolution error.

getaddrinfo is used to resolve the hostname (python.org) in your request. If it fails, it means that the name could not be resolved because:

  1. It does not exist, or the records are outdated (unlikely; python.org is a well-established domain name)
  2. Your DNS server is down (unlikely; if you can browse other sites, you should be able to fetch that page through Python)
  3. A firewall is blocking Python or your script from accessing the Internet (most likely; Windows Firewall sometimes does not ask you if you want to allow an application)
  4. You live on an ancient voodoo cemetery. (unlikely; if that is the case, you should move out)
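
To check name resolution on its own, here is a minimal sketch (Python 2) that calls getaddrinfo directly, roughly the way urllib2 does before opening the connection; if it fails with the same [Errno 11001], the problem is name resolution itself, not urllib2:

import socket

try:
    # resolve the host roughly as urllib2 would before connecting
    info = socket.getaddrinfo('www.python.org', 80)
except socket.gaierror, err:
    print 'Name resolution failed:', err
else:
    print 'Resolved to:', [entry[4][0] for entry in info]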

Answered by hughdbrown

Windows Vista, python 2.6.2

It's a 404 page, right?

>>> import urllib2
>>> import urllib
>>>
>>> page = urllib2.Request('http://www.python.org/fish.html')
>>> urllib2.urlopen( page )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python26\lib\urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python26\lib\urllib2.py", line 389, in open
    response = meth(req, response)
  File "C:\Python26\lib\urllib2.py", line 502, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python26\lib\urllib2.py", line 427, in error
    return self._call_chain(*args)
  File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
    result = func(*args)
  File "C:\Python26\lib\urllib2.py", line 510, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
>>>
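
If you want to handle that 404 in code, a sketch along these lines (Python 2) catches HTTPError before the more general URLError; the HTTPError object also behaves like a response, so you can read the status code and the error page body:

import urllib2

try:
    response = urllib2.urlopen('http://www.python.org/fish.html')
except urllib2.HTTPError, err:
    print 'Server returned HTTP status', err.code  # e.g. 404
    body = err.read()  # the error page itself is still readable
except urllib2.URLError, err:
    print 'Could not reach the server:', err.reason
else:
    print response.read(100)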

Answered by Jim Dennis

DJ

First, I see no reason to import urllib; I've only ever seen urllib2 used to replace urllib entirely and I know of no functionality that's useful from urllib and yet is missing from urllib2.

Next, I notice that http://www.python.org/fish.html gives a 404 error for me. (That doesn't explain the traceback/exception you're seeing; I get urllib2.HTTPError: HTTP Error 404: Not Found.)

Normally, if you just want to do a default fetch of a web page (without adding special HTTP headers, doing any sort of POST, etc.), then the following suffices:

req = urllib2.urlopen('http://www.python.org/')
html = req.read()
# and req.close() if you want to be pedantic