Original question: http://stackoverflow.com/questions/31354352/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Selenium: How to inject/execute JavaScript into a page before any of the page's other scripts are loaded/executed?
Asked by Alex
I'm using the Selenium Python WebDriver to browse some pages. I want to inject JavaScript code into a page before any of the page's other JavaScript is loaded and executed. In other words, I need my JS code to be the first JavaScript executed on that page. Is there a way to do that with Selenium?
I googled it for a couple of hours, but I couldn't find any proper answer!
Accepted answer by init_js
If you cannot modify the page content, you may use a proxy, or use a content script in an extension installed in your browser. Doing it within Selenium, you would write some code that injects the script as a child of an existing element, but it cannot run until the page has loaded (that is, until your driver's get() call returns).
String name = (String) ((JavascriptExecutor) driver).executeScript(
"(function () { ... })();" ...
The documentation does not specify when the code starts executing. You would want it to run before the DOM starts loading, so that guarantee can probably only be met via the proxy or extension content script route.
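Since the question uses the Python bindings, the post-load injection described in this answer could be sketched in Python roughly as follows. This is only an illustration: build_injection_js and inject_after_load are hypothetical helper names, and the URLs passed to them are placeholders. The injected script runs only after get() returns, so it cannot be the first script on the page.

```python
import json


def build_injection_js(script_url: str) -> str:
    """Return JavaScript that appends a <script src=...> element to the page."""
    # json.dumps yields a quoted string literal that is safe to embed in the snippet.
    return (
        "var s = document.createElement('script');"
        "s.src = " + json.dumps(script_url) + ";"
        "document.documentElement.appendChild(s);"
    )


def inject_after_load(driver, page_url: str, script_url: str) -> None:
    """Load the page, then inject the extra script (too late to run first)."""
    driver.get(page_url)  # blocks until the page load strategy is satisfied
    driver.execute_script(build_injection_js(script_url))
```

Here driver is assumed to be any Selenium WebDriver instance, for example one created with webdriver.Firefox().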
If you can instrument your page with a minimal harness, you may detect the presence of a special url query parameter and load additional content, but you need to do so using an inline script. Pseudocode:
<html>
<head>
<script type="text/javascript">
(function () {
    if (location && location.href && location.href.indexOf("SELENIUM_TEST") >= 0) {
        var injectScript = document.createElement("script");
        injectScript.setAttribute("type", "text/javascript");
        // Another option is to perform a synchronous XHR and inject via innerText.
        // (A script added via src loads asynchronously, so later scripts may run before it.)
        injectScript.setAttribute("src", URL_OF_EXTRA_SCRIPT);
        document.documentElement.appendChild(injectScript);
        // Optional, but cleaner to remove: the fetch has already been triggered at this point.
        document.documentElement.removeChild(injectScript);
    }
})();
</script>
...
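On the Selenium side, the test then only has to open the page with the marker present in the URL. A minimal Python sketch, assuming the harness checks for the substring SELENIUM_TEST as in the pseudocode; with_test_flag is a hypothetical helper, and the value "1" is arbitrary because only the parameter name is inspected:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_test_flag(url: str, flag: str = "SELENIUM_TEST") -> str:
    """Return url with flag=1 added to its query string (if not already present)."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if flag not in dict(query):
        query.append((flag, "1"))
    # Rebuild the URL, keeping scheme, host, path and fragment unchanged.
    return urlunsplit(parts._replace(query=urlencode(query)))
```

A test would then call something like driver.get(with_test_flag("http://example.com/page")).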
Answer by Jonathan
Answer by Matt M.
Since version 1.0.9, selenium-wire has been able to modify responses to requests. Below is an example that uses this functionality to inject a script into a page before it reaches the web browser.
import os
from seleniumwire import webdriver
from gzip import compress, decompress
from urllib.parse import urlparse

from lxml import html
from lxml.etree import ParserError
from lxml.html import builder

script_elem_to_inject = builder.SCRIPT('alert("injected")')


def inject(req, req_body, res, res_body):
    # various checks to make sure we're only injecting the script on appropriate responses
    # we check that the content type is HTML, that the status code is 200, and that the encoding is gzip
    if res.headers.get_content_subtype() != 'html' or res.status != 200 or res.getheader('Content-Encoding') != 'gzip':
        return None
    try:
        parsed_html = html.fromstring(decompress(res_body))
    except ParserError:
        return None
    try:
        parsed_html.head.insert(0, script_elem_to_inject)
    except IndexError:  # no head element
        return None
    return compress(html.tostring(parsed_html))


drv = webdriver.Firefox(seleniumwire_options={'custom_response_handler': inject})
drv.header_overrides = {'Accept-Encoding': 'gzip'}  # ensure we only get gzip encoded responses
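The core step of that handler, placing a script element first inside head, can also be sketched with the standard library alone, which may help when lxml is not available. This is a rough illustration rather than a drop-in replacement: inject_first_in_head is a hypothetical helper, it assumes the body has already been decompressed and that UTF-8 is an acceptable encoding for the script, and a regular expression is a much cruder tool than a real HTML parser.

```python
import re
from typing import Optional

# Matches an opening <head> tag (with optional attributes), but not <header>.
_HEAD_OPEN = re.compile(rb"<head\b[^>]*>", re.IGNORECASE)


def inject_first_in_head(html_bytes: bytes, js_source: str) -> Optional[bytes]:
    """Insert an inline <script> right after the opening <head> tag.

    Returns None when no <head> tag is found, mirroring the handler above.
    The caller must make sure js_source does not contain "</script>".
    """
    match = _HEAD_OPEN.search(html_bytes)
    if match is None:
        return None
    tag = b"<script>" + js_source.encode("utf-8") + b"</script>"
    end = match.end()
    return html_bytes[:end] + tag + html_bytes[end:]
```

Because the script is the first child of head, the browser would parse and run it before any script that appears later in the document.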
More generally, another way to control a browser remotely and inject a script before the page's content loads is to use a library built on a different protocol entirely, e.g. the DevTools Protocol. A Python implementation is available here: https://github.com/pyppeteer/pyppeteer2 (Disclaimer: I'm one of the main authors)
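For readers who want to stay with Selenium, the same DevTools Protocol idea can be sketched through the execute_cdp_cmd method that Selenium's Python bindings expose on Chromium-based drivers (Chrome, Edge). Note that this swaps in Selenium's own CDP passthrough instead of pyppeteer, and it does not apply to Firefox; add_preload_script and demo are hypothetical names, and the example URL is a placeholder.

```python
def add_preload_script(driver, js_source: str):
    """Ask the browser to run js_source at the start of every new document.

    Page.addScriptToEvaluateOnNewDocument is a DevTools Protocol command;
    the registered script is evaluated before the page's own scripts.
    """
    return driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument", {"source": js_source}
    )


def demo() -> None:
    """Example usage; requires Selenium and a local Chrome installation."""
    from selenium import webdriver  # third-party, imported lazily

    driver = webdriver.Chrome()
    try:
        # Register the script first, then navigate.
        add_preload_script(driver, "window.__injected = true;")
        driver.get("http://example.com")
        print(driver.execute_script("return window.__injected"))
    finally:
        driver.quit()
```

The script must be registered before the call to get(), since it only affects documents created afterwards.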
Answer by Jacob
I know it's been a few years, but I've found a way to do this without modifying the webpage's content and without using a proxy! I'm using the Node.js version, but presumably the API is consistent across other languages as well. What you want to do is as follows:
const {Builder, By, Key, until, Capabilities} = require('selenium-webdriver');

(async () => {
    const capabilities = new Capabilities();
    capabilities.setPageLoadStrategy('eager'); // Options are 'eager', 'none', 'normal'
    const driver = await new Builder().forBrowser('firefox').setFirefoxOptions(capabilities).build();
    await driver.get('http://example.com');
    // With 'eager', get() returns before the page has fully loaded
    await driver.executeScript(`
        console.log('hello')
    `);
})();
That 'eager' option works for me. You may need to use the 'none' option. Documentation: https://seleniumhq.github.io/selenium/docs/api/javascript/module/selenium-webdriver/lib/capabilities_exports_PageLoadStrategy.html
EDIT: Note that the 'eager' option has not been implemented in Chrome yet...
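The Python bindings used in the question expose the same setting through the page_load_strategy attribute of the options object. A minimal sketch under that assumption; apply_page_load_strategy and demo are hypothetical names, and the URL is a placeholder:

```python
VALID_STRATEGIES = ("normal", "eager", "none")


def apply_page_load_strategy(options, strategy: str = "eager"):
    """Set the WebDriver page load strategy on a Selenium options object."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown page load strategy: {strategy!r}")
    options.page_load_strategy = strategy
    return options


def demo() -> None:
    """Example usage; requires Selenium and a local Firefox installation."""
    from selenium import webdriver  # third-party, imported lazily

    options = apply_page_load_strategy(webdriver.FirefoxOptions(), "eager")
    driver = webdriver.Firefox(options=options)
    try:
        driver.get("http://example.com")  # returns earlier than with 'normal'
        driver.execute_script("console.log('hello')")
    finally:
        driver.quit()
```

Keep in mind that with 'eager', get() still returns only once the document has been parsed, so scripts embedded in the page may already have run; this shortens the wait but does not by itself guarantee that the injected code executes first.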