How do I set a proxy for phantomjs/ghostdriver in python webdriver?
Source: http://stackoverflow.com/questions/14699718/
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow.
Asked by erikcw
I'm trying to figure out how to route my requests through an HTTP proxy.
I'm initializing webdriver like this:
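from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
user_agent = 'my user agent 1.0'
DesiredCapabilities.PHANTOMJS['phantomjs.page.settings.userAgent'] = user_agent
driver = webdriver.PhantomJS()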
I've gone through the docs and the source and can't seem to find a way to use a proxy server with phantomjs through webdriver.
Any suggestions?
Answered by Pykler
I dug a little and I found that the functionality is there, but it is not exposed. So it requires a handy monkey wrench to patch it up. Here is the solution that works for me until this functionality is fully exposed in the webdriver call.
EDIT: it seems service_args is now exposed, so you no longer need to monkey patch selenium to use the proxy. See @alex-czech's answer for how to use it.
from selenium import webdriver
from selenium.webdriver.phantomjs.service import Service as PhantomJSService
phantomjs_path = '/usr/lib/node_modules/phantomjs/lib/phantom/bin/phantomjs'
# monkey patch Service temporarily to include desired args
class NewService(PhantomJSService):
    def __init__(self, *args, **kwargs):
        service_args = kwargs.setdefault('service_args', [])
        service_args += [
            '--proxy=localhost:8080',
            '--proxy-type=http',
        ]
        super(NewService, self).__init__(*args, **kwargs)
webdriver.phantomjs.webdriver.Service = NewService
# init the webdriver
self.driver = webdriver.PhantomJS(phantomjs_path)
# undo monkey patch
webdriver.phantomjs.webdriver.Service = PhantomJSService
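One caveat with patching and then undoing by hand: if the constructor call raises, the undo line never runs and the patched class stays in place. A minimal sketch of a way around that follows, assuming a context manager is acceptable here. `temporarily_patched` is a hypothetical helper name, not part of selenium; the standard library's `unittest.mock.patch.object` offers similar behaviour.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_patched(obj, name, replacement):
    """Set obj.<name> to replacement and restore the original on exit (hypothetical helper)."""
    original = getattr(obj, name)
    setattr(obj, name, replacement)
    try:
        yield
    finally:
        # Runs even if the body raises, so the patch never leaks.
        setattr(obj, name, original)


# Intended use with the names from the answer (requires selenium and PhantomJS):
# with temporarily_patched(webdriver.phantomjs.webdriver, 'Service', NewService):
#     driver = webdriver.PhantomJS(phantomjs_path)
```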
Also useful are the following settings, especially when using a proxy that may take a very long time to load.
max_wait = 60
self.driver.set_window_size(1024, 768)
self.driver.set_page_load_timeout(max_wait)
self.driver.set_script_timeout(max_wait)
Answered by Alex Nik
Below is an example of how to set a proxy for PhantomJS in Python. You can change the proxy type to socks5 or http.
from selenium import webdriver
service_args = [
    '--proxy=127.0.0.1:9999',
    '--proxy-type=socks5',
]
browser = webdriver.PhantomJS('../path_to/phantomjs', service_args=service_args)
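If the proxy address changes between runs, the argument list can be assembled by a small function. This is only a sketch, not part of the original answer: `build_proxy_args` is a hypothetical name, and it emits nothing beyond the `--proxy` and `--proxy-type` flags shown here.

```python
def build_proxy_args(host, port, proxy_type="http"):
    """Return PhantomJS command-line flags for the given proxy (hypothetical helper)."""
    allowed = ("http", "socks5")  # the two types mentioned in this answer
    if proxy_type not in allowed:
        raise ValueError("unsupported proxy type: {!r}".format(proxy_type))
    return [
        "--proxy={}:{}".format(host, port),
        "--proxy-type={}".format(proxy_type),
    ]


service_args = build_proxy_args("127.0.0.1", 9999, "socks5")
# Pass the result to the driver as in the answer:
# browser = webdriver.PhantomJS('../path_to/phantomjs', service_args=service_args)
```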
Answered by Chiedo
The following is how to do the same with the Webdriver in Ruby. I couldn't find this anywhere online until I dug into the source code:
phantomjs_args = [ '--proxy=127.0.0.1:9999', '--proxy-type=socks5']
phantomjs_caps = { "phantomjs.cli.args" => phantomjs_args }
driver = Selenium::WebDriver.for(:phantomjs, :desired_capabilities => phantomjs_caps)
Answered by Tom
I ended up needing to pass the credentials both in the service_args and as a Proxy-Authorization header. I don't believe phantomjs passes the proxy auth onwards correctly.
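import base64
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
service_args = [
    "--ignore-ssl-errors=true",
    "--ssl-protocol=any",
    "--proxy={}".format(proxy),
    "--proxy-type=http",
]
# copy the defaults so the shared DesiredCapabilities.PHANTOMJS dict is not modified
caps = DesiredCapabilities.PHANTOMJS.copy()
# b64encode takes and returns bytes on Python 3, so encode first and decode afterwards
credentials = '{}:{}'.format(username, password).encode('utf-8')
authentication_token = "Basic " + base64.b64encode(credentials).decode('ascii')
caps['phantomjs.page.customHeaders.Proxy-Authorization'] = authentication_token
self.driver = webdriver.PhantomJS(
    service_args=service_args,
    desired_capabilities=caps,
    executable_path="./phantomjs-2.1.1-linux-x86_64/bin/phantomjs")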
Here, proxy has the form http://username:password@domain:port
I'd hazard a guess that the first auth-parameters aren't passed as a header to the proxy, so you need to do both manually.
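Since the credentials already sit inside a proxy URL of the form http://username:password@domain:port, the header value can be derived from that URL instead of being kept in separate variables. The following is a rough sketch for Python 3, not something from the original answer: `proxy_settings` is a hypothetical helper, and the example URL is made up.

```python
import base64
from urllib.parse import urlsplit


def proxy_settings(proxy_url):
    """Split a proxy URL into (host:port, Proxy-Authorization value). Hypothetical helper."""
    parts = urlsplit(proxy_url)
    credentials = "{}:{}".format(parts.username, parts.password)
    # b64encode works on bytes and returns bytes, so encode first and decode after.
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "{}:{}".format(parts.hostname, parts.port), "Basic " + token


# Made-up example URL in the username:password@domain:port form.
address, auth_header = proxy_settings("http://user:pass@proxy.example.com:8080")
# auth_header is the string to store under
# caps['phantomjs.page.customHeaders.Proxy-Authorization']
```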

