Running Selenium with Headless Chrome Webdriver

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/53657215/

Date: 2020-08-19 20:20:38  Source: igfitidea

python selenium google-chrome selenium-chromedriver google-chrome-headless

Asked by Rhynden

So I'm trying some stuff out with selenium and I really want it to be quick.

So my thought is that running it with headless chrome would make my script faster.

First, is that assumption correct, or does it not matter whether I run my script with a headless driver?

Anyway, I still want to get it running headless, but somehow I can't. I tried different things, and most sources suggested it would work as described in the October update here:

How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium?

But when I try that, I get weird console output and it still doesn't seem to work.

Any tips appreciated.

Answered by CONvid19

To run chrome-headless, just add --headless via chrome_options.add_argument, i.e.:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
#chrome_options.add_argument("--disable-extensions")
#chrome_options.add_argument("--disable-gpu")
#chrome_options.add_argument("--no-sandbox") # linux only
chrome_options.add_argument("--headless")
# chrome_options.headless = True # also works
driver = webdriver.Chrome(options=chrome_options)
start_url = "https://duckgo.com"
driver.get(start_url)
print(driver.page_source.encode("utf-8"))
driver.quit()
# b'<!DOCTYPE html><html xmlns="http://www....


So my thought is that running it with headless chrome would make my script faster.

Try using Chrome options like --disable-extensions or --disable-gpu and benchmark it, but I wouldn't count on much improvement.
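One way to do that benchmarking is to time the same scripted run several times per configuration and compare the summaries. The helper below is only a minimal sketch using the standard library; the name `benchmark` and its return format are illustrative assumptions, not part of Selenium.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(run_once: Callable[[], None], repeats: int = 5) -> Dict[str, float]:
    """Call run_once() `repeats` times and summarise the wall-clock durations."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)
    return {
        "runs": float(repeats),
        "mean": statistics.mean(durations),
        "min": min(durations),
        "max": max(durations),
    }
```

To compare configurations, `run_once` would be a function that starts the driver with one set of flags, loads the page, and quits. Running it once with and once without --headless (or the other flags) gives numbers that can be compared directly.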



References: headless-chrome

Note: As of today, when running Chrome headless on Windows, you should include the --disable-gpu flag. See crbug.com/737678

Answered by Devdun

If you are using a Linux environment, you may have to add --no-sandbox as well, along with specific window size settings. The --no-sandbox flag is not needed on Windows if you set up the user container properly.

Use --disable-gpu only on Windows. Other platforms no longer require it. The --disable-gpu flag is a temporary workaround for a few bugs.

// Configure and start a headless Chrome browser
WebDriverManager.chromedriver().setup();
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.addArguments("--no-sandbox");
chromeOptions.addArguments("--headless");
chromeOptions.addArguments("--disable-gpu");
// chromeOptions.addArguments("--window-size=1400,2100"); // enable this on Linux
driver = new ChromeDriver(chromeOptions);
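The platform advice in this answer can also be sketched in Python as a small function that picks the flags for the current operating system. This is a hedged illustration of the answer's rules only: the function name `headless_arguments` and the default window size are assumptions, and whether --no-sandbox is actually needed depends on your environment.

```python
import platform
from typing import List, Optional, Tuple


def headless_arguments(
    system: Optional[str] = None,
    window_size: Tuple[int, int] = (1400, 2100),
) -> List[str]:
    """Return Chrome flags following the per-platform advice in the answer."""
    # platform.system() returns e.g. "Linux", "Windows" or "Darwin"
    system = system or platform.system()
    arguments = ["--headless"]
    if system == "Linux":
        # The answer suggests these may be needed on Linux
        arguments.append("--no-sandbox")
        arguments.append(f"--window-size={window_size[0]},{window_size[1]}")
    elif system == "Windows":
        # The answer describes this flag as a temporary, Windows-only workaround
        arguments.append("--disable-gpu")
    return arguments
```

Each returned string can then be passed to the options object one at a time with add_argument.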

Answered by Serhii

from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument("--headless")

# executable_path points at a chromedriver binary next to this script
driver = webdriver.Chrome(executable_path="./chromedriver", options=chrome_options)
url = "https://stackoverflow.com/questions/53657215/running-selenium-with-headless-chrome-webdriver"
driver.get(url)

# crude fixed wait to give the page time to finish loading
sleep(5)

h1 = driver.find_element(By.XPATH, "//h1[@itemprop='name']").text
print(h1)
driver.quit()

Then I run the script on my local machine:

$ python script.py
Running Selenium with Headless Chrome Webdriver

It works, and it runs with headless Chrome.
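The fixed sleep(5) in the script always costs five seconds, even when the page is ready sooner. Since the question is about speed, a polling wait may be worth trying instead. The helper below is a minimal standard-library sketch; `wait_until` is a hypothetical name, and Selenium ships its own WebDriverWait class for the same purpose.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    condition: Callable[[], Optional[T]],
    timeout: float = 5.0,
    interval: float = 0.1,
) -> T:
    """Poll condition() until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

With the script above, the condition could be a lambda that looks up the h1 elements and returns the (possibly empty) list, so the wait ends as soon as the heading exists.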

Answered by Basj

To do (tested on a headless Debian Linux 9.4 server):

  1. Do this:

    # install chrome
    curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
    echo "deb [arch=amd64]  http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list
    apt-get -y update
    apt-get -y install google-chrome-stable
    
    # install chrome driver
    wget https://chromedriver.storage.googleapis.com/77.0.3865.40/chromedriver_linux64.zip
    unzip chromedriver_linux64.zip
    mv chromedriver /usr/bin/chromedriver
    chown root:root /usr/bin/chromedriver
    chmod +x /usr/bin/chromedriver
    
  2. Install selenium:

    pip install selenium
    

    and run this Python code:

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    options = Options()
    options.add_argument("no-sandbox")
    options.add_argument("headless")
    options.add_argument("start-maximized")
    options.add_argument("window-size=1900,1080")
    driver = webdriver.Chrome(options=options, executable_path="/usr/bin/chromedriver")
    driver.get("https://www.example.com")
    html = driver.page_source
    print(html)
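The install steps above pin ChromeDriver to 77.0.3865.40, while apt installs whatever google-chrome-stable is current, so the two can drift apart. ChromeDriver generally has to match the major version of Chrome, and a quick check can catch a mismatch before the script fails. The sketch below only compares version strings; the function names are hypothetical, and the sample strings in the usage note are assumptions about what the `--version` commands print.

```python
import re
from typing import Optional

# Matches a four-part version such as 77.0.3865.40
_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)\.(\d+)")


def major_version(version_output: str) -> Optional[int]:
    """Extract the major version number from a `--version` output line."""
    match = _VERSION.search(version_output)
    return int(match.group(1)) if match else None


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """Return True when both outputs report the same major version."""
    chrome_major = major_version(chrome_output)
    driver_major = major_version(driver_output)
    return chrome_major is not None and chrome_major == driver_major
```

For example, `versions_match("Google Chrome 77.0.3865.90", "ChromeDriver 77.0.3865.40")` compares the outputs of `google-chrome --version` and `chromedriver --version` once they have been captured as text.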
    

Answered by Max Malysh

Install & run containerized Chrome:

docker pull selenium/standalone-chrome
docker run --rm -d -p 4444:4444 --shm-size=2g selenium/standalone-chrome

Connect using webdriver.Remote:

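from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

driver = webdriver.Remote('http://localhost:4444/wd/hub', DesiredCapabilities.CHROME)
driver.set_window_size(1280, 1024)
driver.get('https://www.google.com')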
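The container can take a few seconds to start, so connecting immediately after docker run may fail. The sketch below polls the server until it reports ready. It assumes the server exposes a status endpoint at /wd/hub/status that returns JSON with a `value.ready` field; check this against the image version you run. The function names are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request


def is_ready(status_body: str) -> bool:
    """Return True if a status response body reports value.ready == true."""
    try:
        payload = json.loads(status_body)
    except ValueError:
        return False
    value = payload.get("value") if isinstance(payload, dict) else None
    return isinstance(value, dict) and value.get("ready") is True


def wait_for_grid(
    status_url: str = "http://localhost:4444/wd/hub/status",  # assumed endpoint
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll the status URL until the server is ready or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(status_url, timeout=interval + 1) as response:
                if is_ready(response.read().decode("utf-8")):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not accepting connections yet; retry
        time.sleep(interval)
    return False
```

Calling `wait_for_grid()` before `webdriver.Remote(...)` lets the script proceed only once the server answers, and a False return signals that the container never came up in time.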

Answered by Nikunj Kakadiya

Once you have Selenium and the web driver installed, the following worked for me with headless Chrome on a Linux cluster:

from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("--disable-extensions")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
options.add_experimental_option("prefs",{"download.default_directory":"/databricks/driver"})
driver = webdriver.Chrome(options=options)