Java Selenium WebDriver: Modifying the navigator.webdriver flag to prevent Selenium detection

Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/53039551/

Time: 2020-08-11 00:42:08  Source: igfitidea

Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection

java selenium selenium-webdriver webdriver webdriver-w3c-spec

Asked by Ajanth

I'm trying to automate a very basic task on a website using Selenium and Chrome, but the website somehow detects when Chrome is driven by Selenium and blocks every request. I suspect that the website relies on an exposed DOM variable, like the one described in https://stackoverflow.com/a/41904453/648236, to detect a Selenium-driven browser.

My question is: is there a way I can make the navigator.webdriver flag false? I am willing to go so far as to recompile the Selenium source after making modifications, but I cannot seem to find the NavigatorAutomationInformation source anywhere in the repository https://github.com/SeleniumHQ/selenium

Any help is much appreciated.

P.S.: I also tried the following, based on https://w3c.github.io/webdriver/#interface:

Object.defineProperty(navigator, 'webdriver', {
    get: () => false,
});

But it only updates the property after the initial page load. I think the site reads the variable before my script is executed.

Accepted answer by DebanjanB

First, the update [1]

execute_cdp_cmd(): With the availability of the execute_cdp_cmd(cmd, cmd_args) command, you can now execute google-chrome-devtools commands using Selenium. Using this feature, you can modify navigator.webdriver to prevent Selenium from being detected.



Preventing Detection [2]

To prevent a Selenium-driven WebDriver from being detected, one approach is to apply any or all of the steps below:

  • Rotate the user-agent through the execute_cdp_cmd() command as follows:

    driver.execute_cdp_cmd("Network.setExtraHTTPHeaders", {"headers": {"User-Agent": "browserClientA"}})
    
  • Change the property value of navigator.webdriver to undefined:

    driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
      "source": """
        Object.defineProperty(navigator, 'webdriver', {
          get: () => undefined
        })
      """
    })
    
  • Exclude the collection of enable-automation switches:

    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    
  • Turn off useAutomationExtension:

    options.add_experimental_option('useAutomationExtension', False)
    
    


Sample Code [3]

Combining all the steps mentioned above, an effective code block will be:

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# Drop the enable-automation switch and the automation extension.
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
# Redefine navigator.webdriver on every new document, before the page's own scripts run.
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
  "source": """
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    })
  """
})
# Enable the Network domain, then override the User-Agent request header.
driver.execute_cdp_cmd("Network.enable", {})
driver.execute_cdp_cmd("Network.setExtraHTTPHeaders", {"headers": {"User-Agent": "browser1"}})
driver.get("https://www.google.com/")
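
To check whether the override took effect, the property can be read back with execute_script. This is a minimal sketch: read_webdriver_flag and looks_unflagged are hypothetical helper names, and driver is assumed to be the instance created above.

```python
def read_webdriver_flag(driver):
    """Return navigator.webdriver as seen by the currently loaded page.

    Selenium converts a JavaScript `undefined` result to Python `None`.
    """
    return driver.execute_script("return navigator.webdriver")


def looks_unflagged(driver):
    """True when the page sees `undefined` (None) rather than True or False."""
    return read_webdriver_flag(driver) is None
```

Calling looks_unflagged(driver) after driver.get(...) gives a quick sanity check; it says nothing about other detection signals a site may use.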


History


As per the W3C Editor's Draft, the current draft states:

The webdriver-active flag is set to true when the user agent is under remote control. It is initially set to false.

Further,


Navigator includes NavigatorAutomationInformation;

It is to be noted that:


The NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.

The NavigatorAutomationInformation interface is defined as:

interface mixin NavigatorAutomationInformation {
    readonly attribute boolean webdriver;
};

which returns true if the webdriver-active flag is set, false otherwise.
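
To make the relationship concrete, here is a toy Python model of the getter described above. The UserAgent and Navigator classes are illustrative stand-ins only; they are not part of Selenium or of any browser API.

```python
class UserAgent:
    """Toy stand-in for a browser that holds the webdriver-active flag."""

    def __init__(self):
        # The draft says the flag is initially false.
        self.webdriver_active = False

    def start_remote_control(self):
        # The flag is set when the user agent comes under remote control.
        self.webdriver_active = True


class Navigator:
    """Toy Navigator exposing the read-only `webdriver` attribute."""

    def __init__(self, user_agent):
        self._user_agent = user_agent

    @property
    def webdriver(self):
        # True if the webdriver-active flag is set, False otherwise.
        return self._user_agent.webdriver_active
```

Because the property has no setter, assigning to it fails, which mirrors the readonly attribute in the interface definition.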

Finally, navigator.webdriver defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, so that alternate code paths can be triggered during automation.

Caution: Altering/tweaking the above-mentioned parameters may block the navigation and get the WebDriver instance detected.



Update (6-Nov-2019)


As of the current implementation, an ideal way to access a web page without getting detected would be to use the ChromeOptions() class to add a couple of arguments to:

  • Exclude the collection of enable-automation switches
  • Turn off useAutomationExtension

through an instance of ChromeOptions as follows:

  • Java Example:

    // Backslashes in a Java string literal must be escaped.
    System.setProperty("webdriver.chrome.driver", "C:\\Utility\\BrowserDrivers\\chromedriver.exe");
    ChromeOptions options = new ChromeOptions();
    options.setExperimentalOption("excludeSwitches", Collections.singletonList("enable-automation"));
    options.setExperimentalOption("useAutomationExtension", false);
    WebDriver driver = new ChromeDriver(options);
    driver.get("https://www.google.com/");
    
  • Python Example

    from selenium import webdriver
    
    options = webdriver.ChromeOptions()
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(options=options, executable_path=r'C:\path\to\chromedriver.exe')
    driver.get("https://www.google.com/")
    
    


Legends

1: Applies to Selenium's Python clients only.

2: Applies to Selenium's Python clients only.

3: Applies to Selenium's Python clients only.

Answer by someone else

Before (in browser console window):


> navigator.webdriver
true

Change (in Selenium):

// C#
var options = new ChromeOptions();
options.AddExcludedArguments(new List<string>() { "enable-automation" });

// Python
options.add_experimental_option("excludeSwitches", ['enable-automation'])

After (in browser console window):


> navigator.webdriver
undefined

This will not work for ChromeDriver version 79.0.3945.16 and above. See the ChromeDriver release notes.
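
As a rough sketch of that cutoff, the check below compares a ChromeDriver version string against 79.0.3945.16. The function names are hypothetical, and the cutoff is taken from this answer rather than verified independently.

```python
def parse_version(version):
    """Turn a dotted version string such as '79.0.3945.16' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Cutoff reported in this answer: the option stops working from this version on.
EXCLUDE_SWITCHES_CUTOFF = parse_version("79.0.3945.16")


def exclude_switches_expected_to_work(chromedriver_version):
    """True if the version is older than the cutoff reported above."""
    return parse_version(chromedriver_version) < EXCLUDE_SWITCHES_CUTOFF
```

Tuples compare element by element, so the comparison works component-wise instead of treating the version as text.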

Answer by pguardiario

Nowadays you can accomplish this with a CDP command:

driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
  "source": """
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    })
  """
})

driver.get(some_url)

By the way, you want to return undefined; false is a dead giveaway.

Answer by DUDSS

I would like to add a Java alternative to the CDP command method mentioned by pguardiario:

Map<String, Object> params = new HashMap<String, Object>();
params.put("source", "Object.defineProperty(navigator, 'webdriver', { get: () => undefined })");
driver.executeCdpCommand("Page.addScriptToEvaluateOnNewDocument", params);

In order for this to work, you need to use ChromiumDriver from the org.openqa.selenium.chromium package. From what I can tell, that package is not included in Selenium 3.141.59, so I used the Selenium 4 alpha.

Also, the excludeSwitches/useAutomationExtension experimental options no longer seem to work for me with ChromeDriver 79 and Chrome 79.

Answer by ScamCast

ChromeDriver:


Finally discovered the simple solution for this with a simple flag! :)


--disable-blink-features=AutomationControlled

navigator.webdriver=true will no longer show up with that flag set.

For a list of things you can disable, see the list linked from the original answer.
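
The flag takes its value after the equals sign, and Chromium's Blink feature switches accept a comma-separated list of feature names. Below is a small sketch of a helper that builds the argument string. disable_blink_features_flag is a hypothetical name, and AutomationControlled is the only feature name taken from this answer.

```python
def disable_blink_features_flag(features):
    """Build a --disable-blink-features argument from a list of feature names."""
    names = [name.strip() for name in features if name.strip()]
    if not names:
        raise ValueError("at least one feature name is required")
    # Multiple features are joined with commas in a single switch.
    return "--disable-blink-features=" + ",".join(names)
```

The result can be passed to options.add_argument(...) on a ChromeOptions instance, for example options.add_argument(disable_blink_features_flag(["AutomationControlled"])).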

Answer by Elias Vargas

Finally, this solved the problem for ChromeDriver and Chrome versions greater than v79.

ChromeOptions options = new ChromeOptions();
options.addArguments("--disable-blink-features");
options.addArguments("--disable-blink-features=AutomationControlled");
ChromeDriver driver = new ChromeDriver(options);
Map<String, Object> params = new HashMap<String, Object>();
params.put("source", "Object.defineProperty(navigator, 'webdriver', { get: () => undefined })");
driver.executeCdpCommand("Page.addScriptToEvaluateOnNewDocument", params);

Answer by Rick

Excluding the collection of enable-automation switches, as mentioned in the 6-Nov-2019 update of the top-voted answer, no longer works as of April 2020. Instead, I was getting the following error:

ERROR:broker_win.cc(55)] Error reading broker pipe: The pipe has been ended. (0x6D)

Here's what's working as of 6 April 2020 with Chrome 80.

Before (in the Chrome console window):


> navigator.webdriver
true

Python example:


from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--disable-blink-features")
options.add_argument("--disable-blink-features=AutomationControlled")

After (in the Chrome console window):


> navigator.webdriver
undefined

Answer by feng ce

If you use a Remote WebDriver, the code below will set navigator.webdriver to undefined.

Works for ChromeDriver 81.0.4044.122.

Python example:


    from selenium import webdriver

    options = webdriver.ChromeOptions()
    # options.add_argument("--headless")
    options.add_argument('--disable-gpu')
    options.add_argument('--no-sandbox')
    # The command executor URL is given with an explicit scheme.
    driver = webdriver.Remote(
       'http://localhost:9515', desired_capabilities=options.to_capabilities())
    script = '''
    Object.defineProperty(navigator, 'webdriver', {
        get: () => undefined
    })
    '''
    # Redefine the property on the document that is currently loaded.
    driver.execute_script(script)
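
Note that execute_script acts on the document that is currently loaded, so an override applied this way would presumably have to be re-applied after every navigation. As the question points out, scripts that run during page load will still see the original value. Here is a minimal sketch under that assumption: get_and_patch is a hypothetical helper, and driver is any object with Selenium's get and execute_script methods.

```python
HIDE_WEBDRIVER_JS = """
Object.defineProperty(navigator, 'webdriver', {
    get: () => undefined
})
"""


def get_and_patch(driver, url):
    """Navigate to `url`, then redefine navigator.webdriver on the loaded page."""
    driver.get(url)
    # Runs only after the page has loaded, so earlier page scripts are unaffected.
    driver.execute_script(HIDE_WEBDRIVER_JS)
```

Using get_and_patch(driver, some_url) in place of driver.get(some_url) keeps the override in place for code that runs after load, which is weaker than the CDP-based approach in the other answers.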

Answer by Baki Billah

Do not use the CDP command to change the webdriver value, as it leads to an inconsistency that can later be used to detect WebDriver. Use the code below instead; this will remove any traces of webdriver.

options.add_argument("--disable-blink-features")
options.add_argument("--disable-blink-features=AutomationControlled")