Java: Take a full-page screenshot in Chrome with Selenium

Note: This page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/45199076/

Time: 2020-11-03 08:35:18  Source: igfitidea

Tags: java, selenium, selenium-webdriver, selenium-chromedriver

Asked by Majid Laissi

I know this was not possible before, but with the following update:

https://developers.google.com/web/updates/2017/04/devtools-release-notes#screenshots

it seems to be possible using Chrome DevTools.

Is it now possible using Selenium in Java?

Accepted answer by fjalvingh

Doing this with Selenium WebDriver in Java takes a bit of work. As hinted by Florent B., we need to replace some of the classes used by the default ChromeDriver. First, we need a new DriverCommandExecutor that registers the new Chrome commands:

import com.google.common.collect.ImmutableMap;
import org.openqa.selenium.remote.CommandInfo;
import org.openqa.selenium.remote.http.HttpMethod;
import org.openqa.selenium.remote.service.DriverCommandExecutor;
import org.openqa.selenium.remote.service.DriverService;

public class MyChromeDriverCommandExecutor extends DriverCommandExecutor {
    private static final ImmutableMap<String, CommandInfo> CHROME_COMMAND_NAME_TO_URL;

    public MyChromeDriverCommandExecutor(DriverService service) {
        super(service, CHROME_COMMAND_NAME_TO_URL);
    }

    static {
        CHROME_COMMAND_NAME_TO_URL = ImmutableMap.of("launchApp", new CommandInfo("/session/:sessionId/chromium/launch_app", HttpMethod.POST)
        , "sendCommandWithResult", new CommandInfo("/session/:sessionId/chromium/send_command_and_get_result", HttpMethod.POST)
        );
    }
}

Next, we need a new ChromeDriver class that uses this executor. A new class is necessary because the original has no constructor that lets us replace the command executor. The new class becomes:

import com.google.common.collect.ImmutableMap;
import org.openqa.selenium.Capabilities;
import org.openqa.selenium.WebDriverException;
import org.openqa.selenium.chrome.ChromeDriverService;
import org.openqa.selenium.html5.LocalStorage;
import org.openqa.selenium.html5.Location;
import org.openqa.selenium.html5.LocationContext;
import org.openqa.selenium.html5.SessionStorage;
import org.openqa.selenium.html5.WebStorage;
import org.openqa.selenium.interactions.HasTouchScreen;
import org.openqa.selenium.interactions.TouchScreen;
import org.openqa.selenium.mobile.NetworkConnection;
import org.openqa.selenium.remote.FileDetector;
import org.openqa.selenium.remote.RemoteTouchScreen;
import org.openqa.selenium.remote.RemoteWebDriver;
import org.openqa.selenium.remote.html5.RemoteLocationContext;
import org.openqa.selenium.remote.html5.RemoteWebStorage;
import org.openqa.selenium.remote.mobile.RemoteNetworkConnection;

public class MyChromeDriver  extends RemoteWebDriver implements LocationContext, WebStorage, HasTouchScreen, NetworkConnection {
    private RemoteLocationContext locationContext;
    private RemoteWebStorage webStorage;
    private TouchScreen touchScreen;
    private RemoteNetworkConnection networkConnection;

    //public MyChromeDriver() {
    //  this(ChromeDriverService.createDefaultService(), new ChromeOptions());
    //}
    //
    //public MyChromeDriver(ChromeDriverService service) {
    //  this(service, new ChromeOptions());
    //}

    public MyChromeDriver(Capabilities capabilities) {
        this(ChromeDriverService.createDefaultService(), capabilities);
    }

    //public MyChromeDriver(ChromeOptions options) {
    //  this(ChromeDriverService.createDefaultService(), options);
    //}

    public MyChromeDriver(ChromeDriverService service, Capabilities capabilities) {
        super(new MyChromeDriverCommandExecutor(service), capabilities);
        this.locationContext = new RemoteLocationContext(this.getExecuteMethod());
        this.webStorage = new RemoteWebStorage(this.getExecuteMethod());
        this.touchScreen = new RemoteTouchScreen(this.getExecuteMethod());
        this.networkConnection = new RemoteNetworkConnection(this.getExecuteMethod());
    }

    @Override
    public void setFileDetector(FileDetector detector) {
        throw new WebDriverException("Setting the file detector only works on remote webdriver instances obtained via RemoteWebDriver");
    }

    @Override
    public LocalStorage getLocalStorage() {
        return this.webStorage.getLocalStorage();
    }

    @Override
    public SessionStorage getSessionStorage() {
        return this.webStorage.getSessionStorage();
    }

    @Override
    public Location location() {
        return this.locationContext.location();
    }

    @Override
    public void setLocation(Location location) {
        this.locationContext.setLocation(location);
    }

    @Override
    public TouchScreen getTouch() {
        return this.touchScreen;
    }

    @Override
    public ConnectionType getNetworkConnection() {
        return this.networkConnection.getNetworkConnection();
    }

    @Override
    public ConnectionType setNetworkConnection(ConnectionType type) {
        return this.networkConnection.setNetworkConnection(type);
    }

    public void launchApp(String id) {
        this.execute("launchApp", ImmutableMap.of("id", id));
    }
}

This is mostly a copy of the original class, with some constructors disabled because some of the code they need is package-private. If you need these constructors, you must place the classes in the package org.openqa.selenium.chrome.

With these changes you can call the required DevTools commands, as shown by Florent B., but now in Java through the Selenium API:

import com.google.common.collect.ImmutableMap;
import org.openqa.selenium.remote.Command;
import org.openqa.selenium.remote.Response;

import javax.annotation.Nonnull;
import javax.annotation.Nullable;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class ChromeExtender {
    @Nonnull
    private MyChromeDriver m_wd;

    public ChromeExtender(@Nonnull MyChromeDriver wd) {
        m_wd = wd;
    }

    public void takeScreenshot(@Nonnull File output) throws Exception {
        Object visibleSize = evaluate("({x:0,y:0,width:window.innerWidth,height:window.innerHeight})");
        Long visibleW = jsonValue(visibleSize, "result.value.width", Long.class);
        Long visibleH = jsonValue(visibleSize, "result.value.height", Long.class);

        Object contentSize = send("Page.getLayoutMetrics", new HashMap<>());
        Long cw = jsonValue(contentSize, "contentSize.width", Long.class);
        Long ch = jsonValue(contentSize, "contentSize.height", Long.class);

        /*
         * In chrome 61, delivered one day after I wrote this comment, the method forceViewport was removed.
         * I commented it out here with the if(false), and hopefully wrote a working alternative in the else 8-/
         */
        if(false) {
            send("Emulation.setVisibleSize", ImmutableMap.of("width", cw, "height", ch));
            send("Emulation.forceViewport", ImmutableMap.of("x", Long.valueOf(0), "y", Long.valueOf(0), "scale", Long.valueOf(1)));
        } else {
            send("Emulation.setDeviceMetricsOverride",
                ImmutableMap.of("width", cw, "height", ch, "deviceScaleFactor", Long.valueOf(1), "mobile", Boolean.FALSE, "fitWindow", Boolean.FALSE)
            );
            send("Emulation.setVisibleSize", ImmutableMap.of("width", cw, "height", ch));
        }

        Object value = send("Page.captureScreenshot", ImmutableMap.of("format", "png", "fromSurface", Boolean.TRUE));

        // Since chrome 61 this call has disappeared too; it does not seem to be necessary anymore with the new code.
        // send("Emulation.resetViewport", ImmutableMap.of()); 
        send("Emulation.setVisibleSize", ImmutableMap.of("x", Long.valueOf(0), "y", Long.valueOf(0), "width", visibleW, "height", visibleH));

        String image = jsonValue(value, "data", String.class);
        byte[] bytes = Base64.getDecoder().decode(image);

        try(FileOutputStream fos = new FileOutputStream(output)) {
            fos.write(bytes);
        }
    }

    @Nonnull
    private Object evaluate(@Nonnull String script) throws IOException {
        Map<String, Object> param = new HashMap<>();
        param.put("returnByValue", Boolean.TRUE);
        param.put("expression", script);

        return send("Runtime.evaluate", param);
    }

    @Nonnull
    private Object send(@Nonnull String cmd, @Nonnull Map<String, Object> params) throws IOException {
        Map<String, Object> exe = ImmutableMap.of("cmd", cmd, "params", params);
        Command xc = new Command(m_wd.getSessionId(), "sendCommandWithResult", exe);
        Response response = m_wd.getCommandExecutor().execute(xc);

        Object value = response.getValue();
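        // MyChromeDriverException is not defined in this answer; it is assumed to be a custom exception class declared elsewhere.
        if(response.getStatus() == null || response.getStatus().intValue() != 0) {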
            //System.out.println("resp: " + response);
            throw new MyChromeDriverException("Command '" + cmd + "' failed: " + value);
        }
        if(null == value)
            throw new MyChromeDriverException("Null response value to command '" + cmd + "'");
        //System.out.println("resp: " + value);
        return value;
    }

    @Nullable
    static private <T> T jsonValue(@Nonnull Object map, @Nonnull String path, @Nonnull Class<T> type) {
        String[] segs = path.split("\\.");  // split() takes a regex, so the dot must be escaped
        Object current = map;
        for(String name: segs) {
            Map<String, Object> cm = (Map<String, Object>) current;
            Object o = cm.get(name);
            if(null == o)
                return null;
            current = o;
        }
        return (T) current;
    }
}
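
The jsonValue helper walks a nested Map along a dotted path and relies on an unchecked cast, so a numeric field that arrives as an Integer or Double rather than a Long would fail at the call site. As a minimal, self-contained sketch of the same lookup with explicit type checks (the class JsonPath, its method names, and the sample values are hypothetical and not part of the answer; which Number subtype the driver really returns is an assumption to verify):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the original answer.
final class JsonPath {
    private JsonPath() {}

    /** Walks nested maps along a dotted path such as "result.value.width"; returns null if a segment is missing. */
    static Object lookup(Object root, String path) {
        Object current = root;
        for (String name : path.split("\\.")) {   // split() takes a regex, so the dot is escaped
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(name);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    /** Reads a numeric leaf as a Long, whatever Number subtype the JSON decoder produced. */
    static Long lookupLong(Object root, String path) {
        Object value = lookup(root, path);
        return (value instanceof Number) ? Long.valueOf(((Number) value).longValue()) : null;
    }

    /** Builds made-up sample data shaped like a Runtime.evaluate response. */
    static Map<String, Object> sample() {
        Map<String, Object> value = new HashMap<>();
        value.put("width", 1280);    // stored as Integer
        value.put("height", 720.0);  // stored as Double
        Map<String, Object> result = new HashMap<>();
        result.put("value", value);
        Map<String, Object> root = new HashMap<>();
        root.put("result", result);
        return root;
    }
}
```

With this variant, JsonPath.lookupLong(response, "result.value.width") could stand in for jsonValue(response, "result.value.width", Long.class) wherever a width or height is read.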

The ChromeExtender class lets you send the DevTools commands as specified, and its takeScreenshot method writes the result to a file in PNG format. You can also create a BufferedImage directly by calling ImageIO.read() on the decoded bytes.
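
A minimal sketch of that ImageIO.read() route, using only the JDK. The class ScreenshotImages and both method names are hypothetical, and blankBase64Png only produces stand-in data so the example runs without a browser:

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;

// Hypothetical helper, not part of the original answer.
final class ScreenshotImages {
    private ScreenshotImages() {}

    /** Decodes a Base64-encoded PNG (such as the "data" field of Page.captureScreenshot) into a BufferedImage. */
    static BufferedImage fromBase64Png(String base64Png) throws IOException {
        byte[] bytes = Base64.getDecoder().decode(base64Png);
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(bytes));
        if (image == null) {
            // ImageIO.read returns null when no registered reader understands the data
            throw new IOException("Data is not a readable image");
        }
        return image;
    }

    /** Stand-in for a real screenshot: encodes a blank width x height image as a Base64 PNG. */
    static String blankBase64Png(int width, int height) throws IOException {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(image, "png", out);
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }
}
```

In takeScreenshot, the string obtained from jsonValue(value, "data", String.class) could be passed to fromBase64Png instead of being written through a FileOutputStream.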

Answer by Florent B.

Yes, it has been possible to take a full-page screenshot with Selenium since Chrome v59. The Chrome driver has two new endpoints that call the DevTools API directly:

/session/:sessionId/chromium/send_command_and_get_result
/session/:sessionId/chromium/send_command

The Selenium API doesn't implement these commands, so you have to send them directly through the underlying executor. It's not straightforward, but it does make it possible to produce exactly the same result as DevTools.

Here's a Python example that works on a local or remote instance:

from selenium import webdriver
import json, base64

capabilities = {
  'browserName': 'chrome',
  'chromeOptions':  {
    'useAutomationExtension': False,
    'args': ['--disable-infobars']
  }
}

driver = webdriver.Chrome(desired_capabilities=capabilities)
driver.get("https://stackoverflow.com/questions")

png = chrome_takeFullScreenshot(driver)

with open(r"C:\downloads\screenshot.png", 'wb') as f:
  f.write(png)

And the code to take a full-page screenshot:

def chrome_takeFullScreenshot(driver) :

  def send(cmd, params):
    resource = "/session/%s/chromium/send_command_and_get_result" % driver.session_id
    url = driver.command_executor._url + resource
    body = json.dumps({'cmd':cmd, 'params': params})
    response = driver.command_executor._request('POST', url, body)
    return response.get('value')

  def evaluate(script):
    response = send('Runtime.evaluate', {'returnByValue': True, 'expression': script})
    return response['result']['value']

  metrics = evaluate( \
    "({" + \
      "width: Math.max(window.innerWidth, document.body.scrollWidth, document.documentElement.scrollWidth)|0," + \
      "height: Math.max(innerHeight, document.body.scrollHeight, document.documentElement.scrollHeight)|0," + \
      "deviceScaleFactor: window.devicePixelRatio || 1," + \
      "mobile: typeof window.orientation !== 'undefined'" + \
    "})")
  send('Emulation.setDeviceMetricsOverride', metrics)
  screenshot = send('Page.captureScreenshot', {'format': 'png', 'fromSurface': True})
  send('Emulation.clearDeviceMetricsOverride', {})

  return base64.b64decode(screenshot['data'])

With Java:

public static void main(String[] args) throws Exception {

    ChromeOptions options = new ChromeOptions();
    options.setExperimentalOption("useAutomationExtension", false);
    options.addArguments("disable-infobars");

    ChromeDriverEx driver = new ChromeDriverEx(options);

    driver.get("https://stackoverflow.com/questions");
    File file = driver.getFullScreenshotAs(OutputType.FILE);
}

import java.lang.reflect.Method;
import java.util.Map;
import com.google.common.collect.ImmutableMap;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeDriverService;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.CommandInfo;
import org.openqa.selenium.remote.HttpCommandExecutor;
import org.openqa.selenium.remote.http.HttpMethod;


public class ChromeDriverEx extends ChromeDriver {

    public ChromeDriverEx() throws Exception {
        this(new ChromeOptions());
    }

    public ChromeDriverEx(ChromeOptions options) throws Exception {
        this(ChromeDriverService.createDefaultService(), options);
    }

    public ChromeDriverEx(ChromeDriverService service, ChromeOptions options) throws Exception {
        super(service, options);
        CommandInfo cmd = new CommandInfo("/session/:sessionId/chromium/send_command_and_get_result", HttpMethod.POST);
        Method defineCommand = HttpCommandExecutor.class.getDeclaredMethod("defineCommand", String.class, CommandInfo.class);
        defineCommand.setAccessible(true);
        defineCommand.invoke(super.getCommandExecutor(), "sendCommand", cmd);
    }

    public <X> X getFullScreenshotAs(OutputType<X> outputType) throws Exception {
        Object metrics = sendEvaluate(
            "({" +
            "width: Math.max(window.innerWidth,document.body.scrollWidth,document.documentElement.scrollWidth)|0," +
            "height: Math.max(window.innerHeight,document.body.scrollHeight,document.documentElement.scrollHeight)|0," +
            "deviceScaleFactor: window.devicePixelRatio || 1," +
            "mobile: typeof window.orientation !== 'undefined'" +
            "})");
        sendCommand("Emulation.setDeviceMetricsOverride", metrics);
        Object result = sendCommand("Page.captureScreenshot", ImmutableMap.of("format", "png", "fromSurface", true));
        sendCommand("Emulation.clearDeviceMetricsOverride", ImmutableMap.of());
        String base64EncodedPng = (String)((Map<String, ?>)result).get("data");
        return outputType.convertFromBase64Png(base64EncodedPng);
    }

    protected Object sendCommand(String cmd, Object params) {
        return execute("sendCommand", ImmutableMap.of("cmd", cmd, "params", params)).getValue();
    }

    protected Object sendEvaluate(String script) {
        Object response = sendCommand("Runtime.evaluate", ImmutableMap.of("returnByValue", true, "expression", script));
        Object result = ((Map<String, ?>)response).get("result");
        return ((Map<String, ?>)result).get("value");
    }
}