Notice: this page is a side-by-side Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2236498/

Date: 2020-11-04 00:08:22  Source: igfitidea

Tell urllib2 to use custom DNS

Tags: python, dns, urllib2, dnspython, urlopen

Asked by Attila O.

I'd like to tell urllib2.urlopen (or a custom opener) to use 127.0.0.1 (or ::1) to resolve addresses. I don't want to change my /etc/resolv.conf, however.

One possible solution is to use a tool like dnspython to query addresses and httplib to build a custom URL opener. I'd prefer to tell urlopen to use a custom nameserver, though. Any suggestions?

Answered by MattH

Looks like name resolution is ultimately handled by socket.create_connection.

-> urllib2.urlopen
-> httplib.HTTPConnection
-> socket.create_connection

However, once the "Host:" header has been set, you can resolve the host yourself and pass the IP address down to the opener.

I'd suggest that you subclass httplib.HTTPConnection and override the connect method so that it substitutes the resolved address for self.host when calling socket.create_connection.

Then subclass HTTPHandler (and HTTPSHandler) to replace the http_open method with one that passes your HTTPConnection, instead of httplib's own, to do_open.

Like this:

import httplib
import socket
import ssl  # needed by MyHTTPSConnection below
import urllib2

from lxml import etree

def MyResolver(host):
  # Map selected hostnames to fixed IP addresses; pass everything else through.
  if host == 'news.bbc.co.uk':
    return '66.102.9.104' # Google IP
  else:
    return host

class MyHTTPConnection(httplib.HTTPConnection):
  def connect(self):
    # Connect to the resolved address; self.host itself is left unchanged.
    self.sock = socket.create_connection((MyResolver(self.host), self.port), self.timeout)

class MyHTTPSConnection(httplib.HTTPSConnection):
  def connect(self):
    sock = socket.create_connection((MyResolver(self.host), self.port), self.timeout)
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)

class MyHTTPHandler(urllib2.HTTPHandler):
  def http_open(self, req):
    return self.do_open(MyHTTPConnection, req)

class MyHTTPSHandler(urllib2.HTTPSHandler):
  def https_open(self, req):
    return self.do_open(MyHTTPSConnection, req)

# Install the custom handlers so that plain urllib2.urlopen uses them.
opener = urllib2.build_opener(MyHTTPHandler, MyHTTPSHandler)
urllib2.install_opener(opener)

f = urllib2.urlopen('http://news.bbc.co.uk')
data = f.read()
doc = etree.HTML(data)

>>> print doc.xpath('//title/text()')
['Google']

Obviously, there are certificate issues if you use HTTPS, and you'll need to fill out MyResolver...
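
For readers on Python 3, where urllib2 became urllib.request and httplib became http.client, the same idea might look roughly like the sketch below. The names HOST_OVERRIDES, resolve_host, CustomDNSHTTPConnection, CustomDNSHTTPHandler and build_custom_opener are made up for illustration, the override table is a placeholder, and the sketch covers plain HTTP only (no HTTPS and no proxy tunnelling).

```python
import http.client
import socket
import urllib.request

# Hypothetical override table: hostname -> IP address to connect to.
HOST_OVERRIDES = {"news.bbc.co.uk": "127.0.0.1"}


def resolve_host(host):
    """Return the address to connect to; unknown hosts pass through unchanged."""
    return HOST_OVERRIDES.get(host, host)


class CustomDNSHTTPConnection(http.client.HTTPConnection):
    def connect(self):
        # self.host is left untouched, so the Host: header still names the
        # original site; only the TCP connection target changes.
        self.sock = socket.create_connection(
            (resolve_host(self.host), self.port),
            self.timeout,
            self.source_address,
        )


class CustomDNSHTTPHandler(urllib.request.HTTPHandler):
    def http_open(self, req):
        return self.do_open(CustomDNSHTTPConnection, req)


def build_custom_opener():
    # An empty ProxyHandler keeps environment proxy settings from routing
    # requests around the custom connection class.
    return urllib.request.build_opener(
        urllib.request.ProxyHandler({}), CustomDNSHTTPHandler
    )
```

Calling build_custom_opener().open('http://news.bbc.co.uk/') would then connect to whatever address the table gives for that name, while still sending the original hostname in the request.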

Answered by Taha Jahangir

Another (dirty) way is monkey-patching socket.getaddrinfo.

For example, this code adds an (unbounded) cache for DNS lookups.

import socket

prv_getaddrinfo = socket.getaddrinfo
dns_cache = {}  # unbounded: entries are never evicted

def new_getaddrinfo(*args):
    # Cache results keyed by the exact positional arguments of the call.
    try:
        return dns_cache[args]
    except KeyError:
        res = prv_getaddrinfo(*args)
        dns_cache[args] = res
        return res

socket.getaddrinfo = new_getaddrinfo
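
The same monkey-patching trick can also redirect lookups instead of caching them, which is closer to what the question asks for. The following is a minimal Python 3 sketch under that assumption; override_hosts and its overrides mapping are hypothetical names, and the patch is wrapped in a context manager so that it is undone afterwards.

```python
import contextlib
import socket


@contextlib.contextmanager
def override_hosts(overrides):
    """Temporarily route lookups for the given hostnames to fixed addresses."""
    original = socket.getaddrinfo

    def patched_getaddrinfo(host, *args, **kwargs):
        # Swap in the override (if any), then defer to the real resolver.
        return original(overrides.get(host, host), *args, **kwargs)

    socket.getaddrinfo = patched_getaddrinfo
    try:
        yield
    finally:
        socket.getaddrinfo = original  # always undo the patch
```

Because socket.create_connection looks up getaddrinfo in the socket module at call time, code that connects inside a `with override_hosts({...}):` block should pick up the overrides. The patch is process-wide while active, so it affects every thread.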

Answered by speakman

You will need to implement your own DNS lookup client (or use dnspython, as you said). The name lookup procedure in glibc is fairly complex because it has to stay compatible with other, non-DNS name systems. For example, there is no way at all to specify a particular DNS server in the glibc library.
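
To give a rough idea of what a hand-rolled lookup client involves, here is a minimal Python 3 sketch that uses only the standard library. It sends one UDP query for A records to a chosen nameserver and parses the reply. The function names (build_query, parse_a_records, resolve_via) are made up for illustration, and the sketch ignores retries, truncated replies, TCP fallback and AAAA records, so a library such as dnspython is likely the better choice for real use.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Encode a recursive DNS query for the A record of `hostname`."""
    # Header: id, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = hostname.rstrip(".").encode("ascii").split(b".")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(packet, offset):
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # compression pointer: 2 bytes, ends the name
            return offset + 2
        offset += 1 + length


def parse_a_records(packet):
    """Extract the IPv4 addresses from the answer section of a DNS reply."""
    _id, flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    if flags & 0x000F:
        raise OSError("DNS server returned rcode %d" % (flags & 0x000F))
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = _skip_name(packet, offset)
        rtype, rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rclass == 1 and rdlength == 4:
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return addresses


def resolve_via(nameserver, hostname, port=53, timeout=3.0):
    """Ask `nameserver` (an IP address) for the A records of `hostname`."""
    family = socket.AF_INET6 if ":" in nameserver else socket.AF_INET
    query = build_query(hostname)
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (nameserver, port))
        packet, _ = sock.recvfrom(512)
    if packet[:2] != query[:2]:
        raise OSError("DNS reply does not match the query id")
    return parse_a_records(packet)
```

With a nameserver listening on 127.0.0.1, a call such as resolve_via("127.0.0.1", "news.bbc.co.uk") could then supply the address inside a resolver function like MyResolver from the earlier answer.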