Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/35666067/

Time: 2020-08-19 16:46:52  Source: igfitidea

Selenium PhantomJS custom headers in Python

Tags: python, selenium, phantomjs, custom-headers

Asked by Sumit Jha

I want to add custom headers to Selenium PhantomJS in Python. These are the headers I want to add:

headers = { 'Accept':'*/*',
            'Accept-Encoding':'gzip, deflate, sdch',
            'Accept-Language':'en-US,en;q=0.8',
            'Cache-Control':'max-age=0',
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'
          }

This is the code I am working with:

from selenium import webdriver

service_args = [
    '--proxy=127.0.0.1:9999',
    '--proxy-type=socks5',
    ]
driver = webdriver.PhantomJS(service_args=service_args)


driver.set_window_size(1120, 550)
driver.get("https://duckduckgo.com/")
driver.find_element_by_id('search_form_input_homepage').send_keys("realpython")
driver.find_element_by_id("search_button_homepage").click()
print(driver.current_url)
driver.quit()

How do I modify the code to incorporate those custom headers?

Please help.

Accepted answer by Andriy Ivaneyko

Set up the headers as follows:

from selenium import webdriver


headers = { 'Accept':'*/*',
    'Accept-Encoding':'gzip, deflate, sdch',
    'Accept-Language':'en-US,en;q=0.8',
    'Cache-Control':'max-age=0',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'
}

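# Caution: enumerate(headers) yields (index, header_name) pairs, so this loop
# creates keys such as 'phantomjs.page.customHeaders.0' whose values are header
# names. Iterating over headers.items() yields (name, value) pairs instead.
for key, value in enumerate(headers):
    capability_key = 'phantomjs.page.customHeaders.{}'.format(key)
    webdriver.DesiredCapabilities.PHANTOMJS[capability_key] = value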

Then start working with your driver:

service_args = [
    '--proxy=127.0.0.1:9999',
    '--proxy-type=socks5',
]
driver = webdriver.PhantomJS(service_args=service_args)
# ............... 

Answer by Simon482

from selenium import webdriver

headers = { 'Accept':'*/*',
    'Accept-Encoding':'gzip, deflate, sdch',
    'Accept-Language':'en-US,en;q=0.8',
    'Cache-Control':'max-age=0',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36' }

for key in headers:
    webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.customHeaders.{}'.format(key)] = headers[key]

Answer by Mithril

Andriy Ivaneyko's method did not work for me (PhantomJS 2.1.1 and Selenium 2.48.0).

I wrote a full example that sets all headers, the window size, and a proxy in Selenium PhantomJS:

from selenium import webdriver

def init_phantomjs_driver(*args, **kwargs):

    headers = { 'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language':'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0',
        'Connection': 'keep-alive'
    }

    # items() works on both Python 2 and Python 3; iteritems() is Python 2 only
    for key, value in headers.items():
        webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.customHeaders.{}'.format(key)] = value

    webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.settings.userAgent'] = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'

    driver =  webdriver.PhantomJS(*args, **kwargs)
    driver.set_window_size(1400,1000)

    return driver


def main():
    service_args = [
        '--proxy=127.0.0.1:9999',
        '--proxy-type=http',
        '--ignore-ssl-errors=true'
        ]

    driver = init_phantomjs_driver(service_args=service_args)

    driver.get('http://cn.bing.com')
    driver.quit()  # shut down the PhantomJS process when done


if __name__ == '__main__':
    main()

Note 1:

The user agent is set in phantomjs.page.settings.userAgent instead of phantomjs.page.customHeaders.
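
One way to apply Note 1 mechanically is to route the User-Agent entry to the settings key and every other header to customHeaders. The helper name to_phantomjs_capabilities below is hypothetical and not part of Selenium; this is only a sketch that builds a plain dict using the capability key names from the example above.

```python
def to_phantomjs_capabilities(headers):
    """Map a dict of HTTP headers onto PhantomJS capability keys.

    Hypothetical helper: User-Agent goes to the settings key,
    all other headers go under customHeaders.<name>.
    """
    caps = {}
    for name, value in headers.items():
        if name.lower() == 'user-agent':
            caps['phantomjs.page.settings.userAgent'] = value
        else:
            caps['phantomjs.page.customHeaders.{}'.format(name)] = value
    return caps


example = to_phantomjs_capabilities({
    'Accept-Language': 'en-US,en;q=0.8',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)',
})

# Print the generated capability entries in a stable order
for key in sorted(example):
    print(key, '=', example[key])
```

The resulting dict could then be merged into a copy of DesiredCapabilities.PHANTOMJS before the driver is created.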

Note 2:

Andriy Ivaneyko uses enumerate to build DesiredCapabilities.PHANTOMJS, so each key is the loop index and each value is a header name. The data becomes:

{
 'browserName': 'phantomjs',
 'javascriptEnabled': True,
 'phantomjs.page.customHeaders.0': 'Accept-Language',
 'phantomjs.page.customHeaders.1': 'Accept-Encoding',
 'phantomjs.page.customHeaders.2': 'Accept',
 'phantomjs.page.customHeaders.3': 'User-Agent',
 'phantomjs.page.customHeaders.4': 'Connection',
 'phantomjs.page.customHeaders.5': 'Cache-Control',
 'platform': 'ANY',
 'version': ''
}

None of the headers is set correctly.
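
The difference described in Note 2 can be reproduced without Selenium. The sketch below uses plain dicts and two illustrative functions (build_with_enumerate and build_with_items are made-up names) to compare iterating with enumerate against iterating with items().

```python
headers = {
    'Accept': '*/*',
    'Cache-Control': 'max-age=0',
}


def build_with_enumerate(headers):
    """Reproduce the loop from the accepted answer."""
    caps = {}
    # enumerate() over a dict yields (index, key): the header values are lost
    for key, value in enumerate(headers):
        caps['phantomjs.page.customHeaders.{}'.format(key)] = value
    return caps


def build_with_items(headers):
    """Use items() so each header name is paired with its value."""
    caps = {}
    for name, value in headers.items():
        caps['phantomjs.page.customHeaders.{}'.format(name)] = value
    return caps


print(build_with_enumerate(headers))
print(build_with_items(headers))
```

The first dict has numeric suffixes with header names as values, while the second has header names as suffixes with the intended values.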

Answer by Berk Baytar

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/53 "
    "(KHTML, like Gecko) Chrome/15.0.87")

driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.get("http://www.google.com")
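
This last answer copies the default capabilities with dict(...) rather than modifying DesiredCapabilities.PHANTOMJS in place. A minimal sketch of why the copy can matter, using a plain dict as a stand-in for Selenium's defaults (PHANTOMJS_DEFAULTS and capabilities_with_user_agent are illustrative names, not Selenium APIs):

```python
# Stand-in for selenium's DesiredCapabilities.PHANTOMJS; the values are
# taken from the capability dump shown in Note 2 of an earlier answer.
PHANTOMJS_DEFAULTS = {
    'browserName': 'phantomjs',
    'javascriptEnabled': True,
    'platform': 'ANY',
    'version': '',
}


def capabilities_with_user_agent(user_agent, defaults=PHANTOMJS_DEFAULTS):
    """Return a new capabilities dict; the shared defaults stay untouched."""
    caps = dict(defaults)  # shallow copy, as in dict(DesiredCapabilities.PHANTOMJS)
    caps['phantomjs.page.settings.userAgent'] = user_agent
    return caps


caps = capabilities_with_user_agent('Mozilla/5.0 (X11; Linux x86_64)')

# The copy carries the override, while the shared defaults do not
print('phantomjs.page.settings.userAgent' in caps)
print('phantomjs.page.settings.userAgent' in PHANTOMJS_DEFAULTS)
```

With the copy, a driver created later from the untouched defaults would not inherit the user agent override.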