Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/31761648/

Time: 2020-09-02 19:11:24  Source: igfitidea

Async parallel HTTP request

node.js asynchronous

Asked by ChrisRich

I'm having a control flow problem with an application loading a large array of URLs. I'm using Caolan Async and the NPM request module.

My problem is that the HTTP request appears to start as soon as the function is added to the queue. Ideally I want to build my queue and only start making the HTTP requests when the queue starts. Otherwise the callbacks start firing before the queue starts, causing the queue to finish prematurely.

var request = require('request') // https://www.npmjs.com/package/request
    , async = require('async'); // https://www.npmjs.com/package/async

var myLoaderQueue = []; // passed to async.parallel
var myUrls = ['http://...', 'http://...', 'http://...'] // 1000+ urls here

for(var i = 0; i < myUrls.length; i++){
    myLoaderQueue.push(function(callback){

        // Async http request
        request(myUrls[i], function(error, response, html) {

            // Some processing is happening here before the callback is invoked
            callback(error, html);
        });
    });
}

// The loader queue has been made, now start to process the queue
async.parallel(myLoaderQueue, function(err, results){
    // Done
});

Is there a better way of attacking this?

Answered by robertklep

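Using for loops combined with asynchronous calls is problematic (with ES5) and may yield unexpected results (in your case, the wrong URL being retrieved). A var-declared loop counter is shared by every function created in the loop, so each queued function reads whatever value i holds when it finally runs, not the value it had when the function was queued.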
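
The effect can be reproduced without any HTTP traffic. The following is a minimal sketch, not the asker's code: the `urls`, `tasks`, and `started` names are made up for illustration, and each task simply returns the URL it would have requested. It also shows that pushing a function onto an array does not call it.

```javascript
// Hypothetical stand-in for the question's loop; no HTTP is involved.
var urls = ['a', 'b', 'c'];
var tasks = [];
var started = 0;

for (var i = 0; i < urls.length; i++) {
  tasks.push(function () {
    started++;
    return urls[i]; // reads the shared `i` when called, not when queued
  });
}

// Nothing has run yet: pushing a function onto the array does not call it.
var startedBeforeRun = started;

// The loop has finished, so every task sees i === urls.length.
var seen = tasks.map(function (task) { return task(); });

console.log(startedBeforeRun); // 0
console.log(seen);             // [ undefined, undefined, undefined ]
```

In the question's code the same thing happens with `myUrls[i]`: every request is made with an undefined URL rather than with the intended one.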

Instead, consider using async.map():

async.map(myUrls, function(url, callback) {
  request(url, function(error, response, html) {
    // Some processing is happening here before the callback is invoked
    callback(error, html);
  });
}, function(err, results) {
  ...
});

Given that you have 1000+ URLs to retrieve, async.mapLimit() may also be worth considering.
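
To illustrate what a concurrency limit buys you, here is a rough sketch of the idea in plain JavaScript. It is not how the async library implements mapLimit; `mapWithLimit` and `fakeRequest` are hypothetical names, the fake request just resolves after a short timer, and `limit` is assumed to be at least 1.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` in flight.
function mapWithLimit(items, limit, worker) {
  return new Promise(function (resolve, reject) {
    var results = new Array(items.length);
    var next = 0;       // index of the next item to start
    var active = 0;     // number of workers currently running
    var finished = 0;   // number of workers that have completed
    var failed = false;

    if (items.length === 0) { resolve(results); return; }

    // Start new workers until the limit is reached or the items run out.
    function launch() {
      while (!failed && active < limit && next < items.length) {
        start(next++);
      }
    }

    function start(index) {
      active++;
      worker(items[index]).then(function (value) {
        active--;
        finished++;
        results[index] = value; // keep results in input order
        if (finished === items.length) resolve(results);
        else launch();
      }, function (err) {
        failed = true;          // stop starting new work after the first error
        reject(err);
      });
    }

    launch();
  });
}

// Fake "request" that resolves after a short delay and records peak concurrency.
var inFlight = 0;
var peak = 0;
function fakeRequest(url) {
  inFlight++;
  peak = Math.max(peak, inFlight);
  return new Promise(function (resolve) {
    setTimeout(function () {
      inFlight--;
      resolve('body of ' + url);
    }, 10);
  });
}

var demo = mapWithLimit(['u1', 'u2', 'u3', 'u4', 'u5'], 2, fakeRequest);
demo.then(function (bodies) {
  console.log(bodies.length, 'responses, peak concurrency', peak);
});
```

With a limit of 2, no more than two fake requests are ever pending at once, which is the property that keeps 1000+ real requests from being opened simultaneously.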

Answered by krl

If you're willing to start using Bluebird and Babel to utilize promises and ES7 async/await (later standardized in ES2017), you can do the following:

let Promise = require('bluebird');
let request = Promise.promisify(require('request'));

let myUrls = ['http://...', 'http://...', 'http://...'] // 1000+ urls here

async function load() {
  try {
    // map myUrls array into array of request promises
    // wait until all request promises in the array resolve
    let results = await Promise.all(myUrls.map(request));
    // early drafts of the async/await proposal had an `await*` shorthand for this;
    // it was later dropped, so Promise.all is the form to rely on
    // let results = await* myUrls.map(request);
    // print array of results or use forEach 
    // to process / collect them in any other way
    console.log(results)
  } catch (e) {
    console.log(e);
  }
}

load(); // the function above is only a definition; call it to start the requests
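
One thing to keep in mind with this approach: Promise.all rejects as soon as any one of its promises rejects, so a single failed URL sends the whole batch to the catch block. A possible way around that, sketched below with native promises, is to turn each failure into a value before handing the promises to Promise.all. The `fakeRequest` and `loadAll` names and the 'slow' / 'bad' / 'fast' inputs are made up for illustration and stand in for the promisified request and real URLs.

```javascript
// Hypothetical stand-in for a promisified request: resolves or rejects after a delay.
function fakeRequest(url) {
  return new Promise(function (resolve, reject) {
    setTimeout(function () {
      if (url === 'bad') reject(new Error('failed: ' + url));
      else resolve('body of ' + url);
    }, url === 'slow' ? 20 : 5);
  });
}

async function loadAll(urls) {
  // Wrap each request so a failure becomes a value instead of rejecting the batch.
  return Promise.all(urls.map(function (url) {
    return fakeRequest(url).then(
      function (body) { return { url: url, body: body }; },
      function (err) { return { url: url, error: err.message }; }
    );
  }));
}

const outcome = loadAll(['slow', 'bad', 'fast']);
outcome.then(function (results) {
  // Results follow the order of the input array, not the order of completion.
  console.log(results);
});
```

Whether to fail fast or collect per-URL errors depends on the application; the sketch only shows that both are possible with Promise.all.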

Answered by trex005

I'm pretty confident you are experiencing the results of a different error. By the time your queued functions are evaluated, i has already been incremented to its final value, so every function reads myUrls[i] with that same i, which might make it look like you missed the first URLs. Try a little closure when you are queuing the functions.

var request = require('request') // https://www.npmjs.com/package/request
    , async = require('async'); // https://www.npmjs.com/package/async

var myLoaderQueue = []; // passed to async.parallel
var myUrls = ['http://...', 'http://...', 'http://...'] // 1000+ urls here

for(var i = 0; i < myUrls.length; i++){
    (function(URLIndex){
       myLoaderQueue.push(function(callback){

           // Async http request
           request(myUrls[URLIndex], function(error, response, html) {

               // Some processing is happening here before the callback is invoked
               callback(error, html);
           });
       });
    })(i);
}

// The loader queue has been made, now start to process the queue
async.parallel(myLoaderQueue, function(err, results){
    // Done
});
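
The same per-iteration binding can also be had without an immediately invoked function by building the queue with Array.prototype.map, since the mapping function is called once per element. The sketch below is an illustration only: `fakeRequest` is a hypothetical stand-in for request(), the example.com-style URLs are placeholders, and the tasks are run by hand where the code above would pass them to async.parallel.

```javascript
var myUrls = ['http://a.example', 'http://b.example', 'http://c.example']; // placeholder URLs

// Hypothetical stand-in for request(): calls back asynchronously with a fake body.
function fakeRequest(url, done) {
  setTimeout(function () { done(null, 'body of ' + url); }, 5);
}

// map() calls the function once per element, so each task closes over its own `url`.
var myLoaderQueue = myUrls.map(function (url) {
  return function (callback) {
    fakeRequest(url, callback);
  };
});

// Run the tasks by hand for the demo and collect results in input order.
var collected = new Promise(function (resolve) {
  var bodies = new Array(myLoaderQueue.length);
  var pending = myLoaderQueue.length;
  myLoaderQueue.forEach(function (task, index) {
    task(function (err, body) {
      bodies[index] = body;
      pending--;
      if (pending === 0) resolve(bodies);
    });
  });
});

collected.then(function (bodies) { console.log(bodies); });
```

Declaring the loop counter with let instead of var has the same effect in ES2015 and later, because let creates a fresh binding for each iteration.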