Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/17926532/

Time: 2020-10-27 10:07:44  Source: igfitidea

CasperJS - How to open up all links in an array of links

javascript html phantomjs casperjs

Asked by Michael Yaworski

I'm trying to get CasperJS to open every link in an array of links. After each link opens, it should display the title of that page. Yet when I run the script, nothing is displayed.

I can use a for loop to display the links, and that works perfectly.

This is the code for what I just explained:

var x;

casper.start(URL, function() {

    x = links.split(" "); // now x is an array of links

    for (var i = 0; i < x.length; i++) // for every link...
    {
        casper.thenOpen(partialURL + x[i], function() { // open that link
            console.log(this.getTitle() + '\n'); // display the title of page
        });
    }

    this.exit();
});

casper.run();

This is another method I tried:

var x;

casper.start(URL, function() {
    x = links.split(" "); // now x is an array of links
    this.exit();
});

for (var i = 0; i < x.length; i++) // for every link...
{
    casper.thenOpen(partialURL + x[i], function() { // open that link
        console.log(this.getTitle() + '\n'); // display the title of page
    });
}

casper.run();

It says that 'x' is undefined, even though I declared x as a global variable. Any modifications you could suggest would be great. Thanks.

Accepted answer by Michael Yaworski

var x; var i = -1;

casper.start(URL, function() {
    x = links.split(" "); // now x is an array of links
});

casper.then(function() {
    this.each(x, function() { 
        i++; // change the link being opened (has to be here specifically)
        this.thenOpen((partialURL + x[i]), function() {
            this.echo(this.getTitle()); // display the title of page
        });
    });
});

casper.run();

Answer by dvg

var i = 0;
var nTimes = x.length;

casper.repeat(nTimes, function() {
    //... do your stuff
    i++;
});

This worked for me.

Answer by abdel

casper.start('about:blank');

var urls = ['http://google.fr', 'http://yahoo.fr', 'http://amazon.fr'];

casper.each(urls, function(casper, url) {
  casper.thenOpen(url, function() {
        this.echo("I'm in your " + url + ".");
    });
});

Answer by Thank you

In my case, I had to scrape a site that had an unknown number of pages. Each page (except the last) had a <a class="next-page" href="/page/N">Next page</a> link (where N is the page number). The scraper had no way to know it was finished other than noticing that the "Next page" link was no longer present.

Of course you'll have to make adjustments depending on what type of pagination links might exist on your page.

Here's what I did. YMMV.

// imports
var fs = require('fs');

// scraper state
var state = {page: 1, data: []};

// casper
var casper = require("casper").create();

// scraper function
function scrape() {
  this.echo('Scraping page ' + state.page + '...', 'INFO');

  state.data = state.data.concat(this.evaluate(function() {
    // get some stuff from the page
    return someData;
  }));

  var nextUrl = this.evaluate(function() {
    var nextLink = document.querySelector("a.next-page");
    return nextLink && nextLink.href;
  });

  if (nextUrl) {
    state.page = state.page + 1;
    casper.thenOpen(nextUrl, scrape); // <- recursion
  }
}

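// run
// NOTE: the snippet as posted never calls casper.start(); startUrl is a
// hypothetical placeholder for the first page to scrape
var startUrl = 'http://example.com/page/1';
casper.start(startUrl, scrape);
casper.run(function() {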
  fs.write('./data.json', JSON.stringify(state.data, null, '\t'), 'w');
  this.echo('Done!', 'INFO');
});

Hope this helps someone. If you have other questions, I'll be happy to try to help.

Answer by VonAxt

casper.start();
casper.each(Object.keys(array), function(casper, array_elem) {
    this.thenOpen(partialURL + array[array_elem], function() {
        ...
    });
});

As for the "undefined" error: try not to use this too much. I run into this error with CasperJS too often, so I prefer to write casper instead of this.
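
One plausible reason for that advice, shown here in plain JavaScript rather than CasperJS: an ordinary nested callback does not inherit `this` from the step function around it, while a named variable stays reachable. This is only a sketch; `runner`, `step` and `label` are hypothetical names standing in for the `casper` object and are not part of any library.

```javascript
// Hypothetical stand-in for an object like `casper` (not the real API).
var runner = {
    label: 'runner',
    // Invokes the callback with `this` bound to the runner object,
    // the way a step-style API typically does.
    step: function (fn) { return fn.call(this); }
};

var seen = runner.step(function () {
    // Inside the step itself, `this` is the runner.
    var inStep = (this === runner);

    // Inside a plain helper callback, `this` is no longer the runner...
    var viaThis = ['a'].map(function () { return this === runner; })[0];

    // ...but the named variable still works.
    var viaName = ['a'].map(function () { return runner.label; })[0];

    return { inStep: inStep, viaThis: viaThis, viaName: viaName };
});

console.log(JSON.stringify(seen));
// prints {"inStep":true,"viaThis":false,"viaName":"runner"}
```

Referring to the object by name sidesteps the question of what `this` is bound to in each callback, at the cost of tying the code to one global instance.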

Answer by hexid

Try something like this.

var x;

casper.start(URL, function() {
    x = links.split(" "); // now x is an array of links
});

casper.then(function() {
    this.eachThen(x, function(response) {
        this.thenOpen((partialURL + response.data), function() {
            this.echo(this.getTitle()); // display the title of page
        });
    });
});

casper.run();

x was undefined because the for loop was being executed before the casper.start callback had run. In the above code, the eachThen() block is nested inside a casper.then block in order to delay its execution.
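
The ordering problem can be reproduced without CasperJS. The sketch below is a toy step queue, not the library itself: `makeQueue`, `then` and `run` are hypothetical names, and the only behaviour modelled is that queued functions do not execute until `run()` is called, with steps added during a run placed right after the current one.

```javascript
// Toy model of a deferred step queue (NOT CasperJS; all names are hypothetical).
function makeQueue() {
    var steps = [];
    var current = -1;   // index of the running step, -1 before run()
    var inserted = 0;   // steps added by the step that is currently running
    return {
        then: function (fn) {
            if (current === -1) {
                steps.push(fn); // before run(): append to the end
            } else {
                // during run(): schedule right after the current step
                steps.splice(current + 1 + inserted, 0, fn);
                inserted++;
            }
            return this;
        },
        run: function () {
            for (current = 0; current < steps.length; current++) {
                inserted = 0;
                steps[current].call(this);
            }
        }
    };
}

var log = [];
var x; // only assigned once the first queued step actually runs

var queue = makeQueue();
queue.then(function () {
    x = 'a b c'.split(' ');
});

// Executes immediately, while the step above is still waiting in the queue.
log.push('top level sees: ' + typeof x);

queue.then(function () {
    // Executes after the first step, so x is an array by now.
    x.forEach(function (link) {
        queue.then(function () {
            log.push('open ' + link);
        });
    });
});

queue.run();
console.log(log.join('\n'));
// prints:
// top level sees: undefined
// open a
// open b
// open c
```

Any code that reads a value produced by an earlier step has to be queued as a step too; code at the top level of the script always runs first.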

Answer by Alex Albalá

I have solved the same issue with this code:

casper.then(function () {
    var i = -1;
    this.eachThen(locations, function () {
        i++;
        //Do stuff here like for example:
        this.thenOpen(YOUR_URL, function () {
            this.waitForSelector("MYSELECTOR",
            function () {
                // runs once the selector is found
            },
            function () {
                // runs if the wait times out
            });
        });
    });
});