Python urllib2.urlopen: "Name or service not known" persists when starting script without internet connection

Disclaimer: This page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/21356781/


urllib2.urlopen: "Name or service not known" persists when starting script without internet connection

Tags: python, urllib2

Asked by Ben Ruijl

I have this simple minimal 'working' example below that opens a connection to google every two seconds. When I run this script when I have a working internet connection, I get the Success message, and when I then disconnect, I get the Fail message and when I reconnect again I get the Success again. So far, so good.


However, when I start the script when the internet is disconnected, I get the Fail messages, and when I connect later, I never get the Success message. I keep getting the error:


urlopen error [Errno -2] Name or service not known


What is going on?


import urllib2, time

while True:
    try:
        print('Trying')
        response = urllib2.urlopen('http://www.google.com')
        print('Success')
        time.sleep(2)
    except Exception, e:
        print('Fail ' + str(e))
        time.sleep(2)

Accepted answer by insecure

This happens because the DNS name "www.google.com" cannot be resolved. If there is no internet connection, the DNS server is probably not reachable to resolve this entry.

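To confirm that the failure really comes from name resolution rather than the HTTP request itself, you can call the resolver directly. This is only an illustrative sketch: urllib2 ultimately resolves the hostname through socket.getaddrinfo(), which is what raises the error you are seeing.

import socket

try:
    # With no reachable DNS server this raises socket.gaierror,
    # e.g. "[Errno -2] Name or service not known".
    socket.getaddrinfo('www.google.com', 80)
    print('Resolved')
except socket.gaierror as e:
    print('DNS lookup failed: ' + str(e))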

It seems I misread your question the first time. The behaviour you describe is, on Linux, a peculiarity of glibc. It only reads "/etc/resolv.conf" once, when loading. glibc can be forced to re-read "/etc/resolv.conf" via the res_init() function.


One solution would be to wrap the res_init() function and call it before calling getaddrinfo() (which is used indirectly by urllib2.urlopen()).


You might try the following (still assuming you're using Linux):


import ctypes
import urllib2

# Load glibc and look up its (internal) __res_init symbol, which forces
# the resolver to re-read /etc/resolv.conf.
libc = ctypes.cdll.LoadLibrary('libc.so.6')
res_init = libc.__res_init
# ...
res_init()  # re-read /etc/resolv.conf before the lookup
response = urllib2.urlopen('http://www.google.com')

This might of course be optimized by waiting until "/etc/resolv.conf" is modified before calling res_init().

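A minimal sketch of that optimization, assuming the same Linux/ctypes setup as above: the loop checks the modification time of /etc/resolv.conf on each pass and only calls res_init() when the file has changed (for example, when the network comes up and rewrites it).

import ctypes
import os
import time
import urllib2

libc = ctypes.cdll.LoadLibrary('libc.so.6')
res_init = libc.__res_init

RESOLV_CONF = '/etc/resolv.conf'
last_mtime = None

while True:
    try:
        # Only ask glibc to re-read resolv.conf when the file actually changed.
        mtime = os.stat(RESOLV_CONF).st_mtime
        if mtime != last_mtime:
            last_mtime = mtime
            res_init()
        response = urllib2.urlopen('http://www.google.com')
        print('Success')
    except Exception, e:
        print('Fail ' + str(e))
    time.sleep(2)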

Another solution would be to install e.g. nscd (name service cache daemon).


Answer by Sridhar Thiagarajan

For me, it was a proxy problem. Running the following before importing urllib.request helped:


import os
os.environ['http_proxy'] = ''  # clear any proxy inherited from the environment

import urllib.request
response = urllib.request.urlopen('http://www.google.com')