Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2537726/

Date: 2020-11-04 00:52:34  Source: igfitidea

Using urllib2 with SOCKS proxy

Tags: python, urllib2, socks

Asked by Fluffy

Is it possible to fetch pages with urllib2 through a SOCKS proxy on a one-SOCKS-server-per-opener basis? I've seen the solution using the setdefaultproxy method, but I need different SOCKS proxies in different openers.

There is the SocksiPy library, which works great, but it has to be used this way:

import socks
import socket
socket.socket = socks.socksocket
import urllib2
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "x.x.x.x", y)

That is, it sets the same proxy for ALL urllib2 requests. How can I have different proxies for different openers?
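
To illustrate why the setup above affects every request: replacing `socket.socket` changes what every caller in the process receives, and the default proxy is a single shared setting. Below is a minimal sketch in Python 3 that creates sockets but sends no traffic. `StubProxySocket` and `proxies_seen_by_two_callers` are hypothetical stand-ins written for this illustration; they are not SocksiPy code.

```python
import socket


class StubProxySocket(socket.socket):
    """Hypothetical stand-in for socks.socksocket: copies one shared default."""

    default_proxy = None  # plays the role of the value set by setdefaultproxy()

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.proxy = StubProxySocket.default_proxy


def proxies_seen_by_two_callers(proxy):
    """Patch socket.socket, then create sockets from two unrelated call sites."""
    original = socket.socket
    socket.socket = StubProxySocket  # process-wide monkey patch
    StubProxySocket.default_proxy = proxy
    try:
        with socket.socket() as first, socket.socket() as second:
            return first.proxy, second.proxy
    finally:
        socket.socket = original  # undo the patch
```

Both sockets report the same proxy because neither call site has any say in which proxy it gets; that is the limitation the question is about.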

Answered by systempuntoout

Try with pycurl:

import pycurl
c1 = pycurl.Curl()
c1.setopt(pycurl.URL, 'http://www.google.com')
c1.setopt(pycurl.PROXY, 'localhost')
c1.setopt(pycurl.PROXYPORT, 8080)
c1.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

c2 = pycurl.Curl()
c2.setopt(pycurl.URL, 'http://www.yahoo.com')
c2.setopt(pycurl.PROXY, 'localhost')
c2.setopt(pycurl.PROXYPORT, 8081)
c2.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

c1.perform() 
c2.perform() 

Answered by sw.

Yes, you can. I'll repeat my answer from "How can I use a SOCKS 4/5 proxy with urllib2?": you need to create an opener for every proxy, just as you would with an HTTP proxy. The code that adds this feature to SocksiPy is available on GitHub at https://gist.github.com/869791 and is as simple as:

import socks  # SocksiPy; SocksiPyHandler itself comes from the gist linked above
import urllib2
opener = urllib2.build_opener(SocksiPyHandler(socks.PROXY_TYPE_SOCKS4, 'localhost', 9999))
print opener.open('http://www.whatismyip.com/automation/n09230945.asp').read()
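
The idea behind a per-opener handler can be sketched without the gist itself: the handler builds its own connection objects, and each connection takes its socket from a factory that belongs to that handler. What follows is not the gist's code. It is a rough sketch for Python 3 (where urllib2 became `urllib.request`), it covers plain `http://` URLs only, and the names `FactoryConnection`, `FactoryHandler`, and `socks5_factory` are hypothetical. The `socks` calls assume SocksiPy or PySocks is installed, and the proxy addresses are placeholders.

```python
import http.client
import urllib.request


class FactoryConnection(http.client.HTTPConnection):
    """HTTPConnection that gets its socket from a caller-supplied factory."""

    def __init__(self, socket_factory, *args, **kwargs):
        self._socket_factory = socket_factory
        super().__init__(*args, **kwargs)

    def connect(self):
        # Use the factory's socket instead of a plain one.
        self.sock = self._socket_factory()
        if isinstance(self.timeout, (int, float)):
            self.sock.settimeout(self.timeout)
        self.sock.connect((self.host, self.port))


class FactoryHandler(urllib.request.HTTPHandler):
    """Per-opener handler: each instance carries its own socket factory."""

    def __init__(self, socket_factory):
        super().__init__()
        self.socket_factory = socket_factory

    def http_open(self, req):
        def build(host, **kwargs):
            return FactoryConnection(self.socket_factory, host, **kwargs)
        return self.do_open(build, req)


def socks5_factory(proxy_host, proxy_port):
    """Return a factory whose sockets all go through one SOCKS5 proxy."""
    def make_socket():
        import socks  # third-party SocksiPy/PySocks, imported only when used
        sock = socks.socksocket()
        sock.setproxy(socks.PROXY_TYPE_SOCKS5, proxy_host, proxy_port)
        return sock
    return make_socket


# Two openers, two different proxies (placeholder addresses):
opener_a = urllib.request.build_opener(FactoryHandler(socks5_factory("localhost", 9050)))
opener_b = urllib.request.build_opener(FactoryHandler(socks5_factory("localhost", 9051)))
```

Because the proxy settings live on the handler instance rather than in module-level state, `opener_a.open(...)` and `opener_b.open(...)` can be used side by side without touching `socket.socket`.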

For more information I've written an example running multiple Tor instances to behave like a rotating proxy: Distributed Scraping With Multiple Tor Circuits

Answered by ccpizza

A cumbersome but working solution for using a SOCKS proxy is to set up privoxy with proxy chaining, and then point HTTP_PROXY at the HTTP proxy that privoxy provides, via an environment variable or any other way.
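
With one Privoxy instance per SOCKS proxy, each opener can point at a different instance through an ordinary HTTP `ProxyHandler`. Here is a sketch in Python 3 (`urllib.request` is the successor to urllib2). It assumes two Privoxy instances listen on 127.0.0.1:8118 and 127.0.0.1:8119, and that each instance's config forwards to its own SOCKS server with a line such as `forward-socks5 / 127.0.0.1:9050 .`. The ports and the helper name `opener_via_privoxy` are assumptions for illustration.

```python
import urllib.request


def opener_via_privoxy(http_proxy_url):
    """Build an opener that sends http:// requests through one Privoxy instance."""
    handler = urllib.request.ProxyHandler({"http": http_proxy_url})
    return urllib.request.build_opener(handler)


# One opener per Privoxy instance, hence one per SOCKS proxy behind it.
opener_a = opener_via_privoxy("http://127.0.0.1:8118")
opener_b = opener_via_privoxy("http://127.0.0.1:8119")
# opener_a.open("http://example.com/") would leave via the first SOCKS server.
```

Passing the proxy to the opener directly avoids the process-wide environment variable, which would again give every opener the same proxy.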

Answered by Shirkrin

== EDIT == (old HTTP-Proxy example was here..)

My fault.. urllib2 has no builtin support for SOCKS proxying..

There are some 'hacks' adding SOCKS to urllib2 (or the socket object in general) here.
But I doubt that this will work with multiple proxies as you require.

Unless you want to hook or subclass urllib2.ProxyHandler, I would suggest going with pycurl.

Answered by Andrew

All openers share a single socket class, and SOCKS is implemented at the socket level. So, you can't.
I suggest you use the pycurl library; it is much more flexible.

Answered by cryo

If you need access from multiple threads and there aren't too many connections being made at once, you might be able to use a threading lock:

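import socket
import threading
import urllib2

import socks

# Route every new socket through SocksiPy (process-wide monkey patch).
socket.socket = socks.socksocket
lock = threading.Lock()

def GetConn(url, proxy_host, proxy_port):
    # The default proxy is global state, so hold the lock while it is set
    # and while urlopen() creates the connection.
    with lock:
        socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, proxy_host, proxy_port)
        return urllib2.urlopen(url)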

You might also be able to use something like this every time you need to get a connection:

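import imp
# Load a private copy of urllib2 (execfile() returns None, so it cannot hand back a module).
urllib2_copy = imp.load_module('urllib2_copy', *imp.find_module('urllib2'))
urllib2_copy.socket = dummy_class()  # dummy_class (not defined here) needs the socket module's methods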

These are obviously not fantastic solutions, but I've put in my 2¢ anyway :-)

Answered by Dmitry Kochkin

You could do it by setting the environment variable HTTP_PROXY in the following format:

user:pass@proxy:port

or if you use bat/cmd, add before calling script:

set HTTP_PROXY=user:pass@proxy:port

I use a cmd file like this to make easy_install work behind a proxy.
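
As a rough check of what the variable does, the sketch below (Python 3, where urllib2's proxy detection lives in `urllib.request.getproxies`) sets `http_proxy` temporarily and reads back the proxy map the library detects. The proxy URL, including the `http://` scheme prefix, and the helper name `proxies_with_env` are assumptions for illustration. Note that this configures an HTTP proxy for the whole process, not a SOCKS proxy per opener.

```python
import os
import urllib.request


def proxies_with_env(value):
    """Temporarily set http_proxy and return the proxy map urllib detects."""
    saved = os.environ.get("http_proxy")
    os.environ["http_proxy"] = value
    try:
        return urllib.request.getproxies()
    finally:
        # Restore the previous environment so the change does not leak.
        if saved is None:
            del os.environ["http_proxy"]
        else:
            os.environ["http_proxy"] = saved


detected = proxies_with_env("http://user:pass@proxy.invalid:3128")
print(detected.get("http"))
```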