Notice: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow address: http://stackoverflow.com/questions/29671060/

Date: 2020-10-28 10:53:16  Source: igfitidea

Alternatives to Selenium Webdriver

Tags: javascript, selenium, webdriver

Asked by Robert Smit

I use the Selenium WebDriver for C# and for Python to obtain data elements from websites, but the web scraping is terribly slow: scraping 35000 data tables took me about 1.5 days. With the Selenium WebDriver I can execute JavaScript to get a page element. Is there a library that doesn't require something like a WebDriver to execute JavaScript on a webpage, retrieve elements, and click on elements as well? Or is there a faster alternative to Selenium?

Answered by Helen Dikareva

I suggest you use TestCafe.

TestCafe is a free, open-source framework for web functional testing (e2e testing). TestCafe is based on Node.js and doesn't use WebDriver at all.

TestCafe-powered tests are executed on the server side. To obtain DOM elements, TestCafe provides a powerful, flexible system of Selectors. TestCafe can execute JavaScript on the tested webpage using the ClientFunction feature (see our Documentation).

TestCafe tests are really fast; see for yourself. The high speed does not come at the cost of stability, thanks to a built-in smart wait system.

Installation of TestCafe is very easy:

1) Check that you have Node.js on your PC (or install it).

2) To install TestCafe, open cmd and type in:

npm install -g testcafe

Writing tests is not rocket science. Here is a quick start:

1) Copy-paste the following code into your text editor and save it as "test.js":

import { Selector } from 'testcafe';

fixture `Getting Started`
    .page `http://devexpress.github.io/testcafe/example`;

test('My first test', async t => {
    await t
        .typeText('#developer-name', 'John Smith')
        .click('#submit-button')
        .expect(Selector('#article-header').innerText).eql('Thank you, John Smith!');
});

2) Run the test in your browser (e.g. Chrome) by typing the following command in cmd:

testcafe chrome test.js

3) Get the descriptive result in the console output.

TestCafe allows you to test against various browsers: local, remote (on devices, be it a browser for Raspberry Pi or Safari for iOS), cloud (e.g. Sauce Labs), or headless (e.g. Nightmare). This means that you can easily use TestCafe with your Continuous Integration infrastructure.

Answered by LittlePanda

I suggest Selenium + PhantomJSDriver (GhostDriver), which is used for GUI-less browser automation. With this you can easily navigate through pages, select elements (you can select the flights), submit forms, and also perform some scraping. JavaScript is also supported.

You can go through the Selenium documentation here. You will have to download the phantomjs.exe file.

A good tutorial for PhantomJSDriver is given here.

Config of PhantomJSDriver (from the tutorial):

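import org.openqa.selenium.WebDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriverService;
import org.openqa.selenium.remote.DesiredCapabilities;

DesiredCapabilities caps = new DesiredCapabilities();
caps.setJavascriptEnabled(true); // not really needed: JS enabled by default
caps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY, "C://phantomjs.exe");
caps.setCapability("takesScreenshot", true);
WebDriver driver = new PhantomJSDriver(caps);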

Another option (this does not require WebDriver): PhantomJS

PhantomJS is a headless WebKit scriptable with a JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG.

This is GUI-less and also has the ability to take screenshots.

Example (from here):

var page = require('webpage').create();
page.open('http://example.com', function(status) {
  console.log("Status: " + status);
  if(status === "success") {
    page.render('example.png');
  }
  phantom.exit();
});

PS: I would suggest JSoup for web scraping, but it does not support JavaScript. PhantomJSDriver has something called Ghost.py for Python.

Answered by Sarah Sukin

What about LeanFT? It's a new HP product that works with C# and Java, and users say they switched to LeanFT "because Selenium couldn't handle all of [their] applications."

Answered by not-bob

If you use the HtmlUnit WebDriver, there is no overhead of running a browser, so the code can run much faster. You could speed that up even more by abandoning a framework/toolset altogether, querying pages directly, and parsing them for what you need. However, this makes maintenance and updating a pain.
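
To give a rough idea of what "query pages directly and parse them" could look like, here is a minimal sketch in Python using only the standard library. It is not taken from the answer: the names TableParser, parse_tables, fetch_tables and SAMPLE are hypothetical, and it assumes the table data is already present in the HTML the server returns. No JavaScript is executed, so it will not help with pages that build their tables in the browser, and it does not handle nested tables.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by row and table."""

    def __init__(self):
        super().__init__()
        self.tables = []   # list of tables; each table is a list of rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        # Store the current cell, if any, in the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td> or </th>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr" and self._row is not None:
            self._close_cell()
            self.tables[-1].append(self._row)
            self._row = None


def parse_tables(html):
    """Return all tables found in an HTML string."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.tables


def fetch_tables(url, timeout=10):
    """Download one page and return its tables (no JavaScript is executed)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return parse_tables(response.read().decode(charset, errors="replace"))


# Hypothetical sample input, used here instead of a real website.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
</table>
"""

if __name__ == "__main__":
    print(parse_tables(SAMPLE))
```

Since no browser is started, each page costs roughly one HTTP request plus parsing. The trade-off is the one mentioned above: the parsing logic is tied to the page markup and has to be maintained by hand when the site changes.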
