How to use XPath in Chrome headless + Puppeteer evaluate()?

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow. Original: http://stackoverflow.com/questions/48448586/

Time: 2020-10-29 08:02:31  Source: igfitidea

Tags: javascript, google-chrome, xpath, puppeteer

Asked by MevatlaveKraspek

How can I use $x() to evaluate an XPath expression inside a page.evaluate()?

Since page is not in the same context, I tried $x() directly (as I would in Chrome DevTools), but no cigar.

The script times out.

Answered by Everettss

$x() is not a standard JavaScript method for selecting elements by XPath; it is only a helper in Chrome DevTools. The documentation states:

Note: This API is only available from within the console itself. You cannot access the Command Line API from scripts on the page.

And code run through page.evaluate() is treated here as a "script on the page".

You have two options:

  1. Use document.evaluate()

Here is an example of selecting an element (the featured article) inside page.evaluate():

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://en.wikipedia.org', { waitUntil: 'networkidle2' });

    const text = await page.evaluate(() => {
        // $x() is not standard JavaScript;
        // it is only a convenience helper in Chrome DevTools,
        // so use document.evaluate() instead
        const featureArticle = document
            .evaluate(
                '//*[@id="mp-tfa"]',
                document,
                null,
                XPathResult.FIRST_ORDERED_NODE_TYPE,
                null
            )
            .singleNodeValue;

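        // Guard against a missing node so the callback returns null instead of throwing
        return featureArticle ? featureArticle.textContent : null;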
    });

    console.log(text);
    await browser.close();
})();

  2. Select the element with Puppeteer's page.$x() and pass it to page.evaluate()

This example achieves the same result as the first example:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://en.wikipedia.org', { waitUntil: 'networkidle2' });

    // await page.$x() returns array of ElementHandle
    // we are only interested in the first element
    const featureArticle = (await page.$x('//*[@id="mp-tfa"]'))[0];
    // the same as:
    // const featureArticle = await page.$('#mp-tfa');

    const text = await page.evaluate(el => {
        // do what you want with featureArticle in page.evaluate
        return el.textContent;
    }, featureArticle);

    console.log(text);
    await browser.close();
})();

There is a related question on how to inject the $x() helper function into your scripts.
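
As a rough illustration of what such a helper could look like, here is a minimal sketch built on the standard document.evaluate(). The name xpathAll is hypothetical (it is not part of Puppeteer or the DOM), and the sketch only approximates the DevTools $x() by returning matching nodes as a plain array.

```javascript
// Hypothetical helper (not part of Puppeteer or the DOM): a minimal stand-in
// for the DevTools-only $x(), built on the standard document.evaluate().
function xpathAll(expression, contextNode = document) {
    const snapshot = document.evaluate(
        expression,
        contextNode,
        null, // no namespace resolver
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null
    );

    // Copy the snapshot into a plain array, in document order
    const nodes = [];
    for (let i = 0; i < snapshot.snapshotLength; i++) {
        nodes.push(snapshot.snapshotItem(i));
    }
    return nodes;
}
```

One possible way to make it available in the page is await page.addScriptTag({ content: xpathAll.toString() }) after navigation, then calling xpathAll() inside page.evaluate(). Return plain data from the callback (for example each node's textContent), since DOM nodes themselves cannot be serialized back to Node.js.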

Answered by Grant Miller

If you insist on using page.$x(), you can simply pass the result to page.evaluate():

const example = await page.evaluate(element => {
  return element.textContent;
}, (await page.$x('//*[@id="result"]'))[0]);
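
Note that the one-liner above fails inside the page when the XPath matches nothing, because the first array element is then undefined. A minimal sketch of a guarded variant follows. The name textByXPath is hypothetical, and page is assumed to be a Puppeteer Page from a version that still provides page.$x(); recent releases appear to have dropped it in favor of page.$$() with an xpath/ prefix, so check the version you have installed.

```javascript
// Hypothetical helper: returns the text of the first node matching `xpath`,
// or null when nothing matches. `page` is assumed to be a Puppeteer Page
// from a version that still provides page.$x().
async function textByXPath(page, xpath) {
    // page.$x() resolves to an array of ElementHandle; take the first one
    const [element] = await page.$x(xpath);
    if (!element) {
        return null; // no match: avoid passing undefined into page.evaluate()
    }
    return page.evaluate(el => el.textContent, element);
}
```

Usage would then be a single call such as const example = await textByXPath(page, '//*[@id="result"]'); with a null check on the result.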