Getting binary content in Node.js using http.request

Question

Asked by edi9999

I would like to retrieve binary data from an https request.

I found a similar question that uses the request module, "Getting binary content in Node.js using request", which says that setting encoding to null should work, but it doesn't.

options = {
    hostname: urloptions.hostname,
    path: urloptions.path,
    method: 'GET',
    rejectUnauthorized: false,
    encoding: null
};

req = https.request(options, function(res) {
    var data;
    data = "";
    res.on('data', function(chunk) {
        return data += chunk;
    });
    res.on('end', function() {
        return loadFile(data);
    });
    res.on('error', function(err) {
        console.log("Error during HTTP request");
        console.log(err.message);
    });
})

Edit: setting encoding to 'binary' doesn't work either.

Answer 1

Answered by Guaycuru

The accepted answer (i.e., setting the encoding to binary) did not work for me; even the user who asked the question mentioned that it did not work.

Here's what worked for me, taken from: http://chad.pantherdev.com/node-js-binary-http-streams/

var http = require('http');
var url = require('url');

http.get(url.parse('http://myserver.com:9999/package'), function(res) {
    var data = [];

    res.on('data', function(chunk) {
        data.push(chunk);
    }).on('end', function() {
        //at this point data is an array of Buffers
        //so Buffer.concat() can make us a new Buffer
        //of all of them together
        var buffer = Buffer.concat(data);
        console.log(buffer.toString('base64'));
    });
});

Edit: Updated the answer following a suggestion by Semicolon.

Answer 2

Answered by moka

You need to set the encoding on the response, not the request:

req = https.request(options, function(res) {
    res.setEncoding('binary');

    var data = [ ];

    res.on('data', function(chunk) {
        data.push(chunk);
    });
    res.on('end', function() {
        var binary = Buffer.concat(data);
        // binary is your data
    });
    res.on('error', function(err) {
        console.log("Error during HTTP request");
        console.log(err.message);
    });
});

Here is a useful answer: Writing image to local server

Answer 3

Answered by Pärt Johanson

Running on NodeJS 6.10 (and 8.10, tested in Feb 2019) in the AWS Lambda environment, none of the solutions above worked for me.

What did work for me was the following:

https.get(opt, (res) => {
    res.setEncoding('binary');
    let chunks = [];

    res.on('data', (chunk) => {
        chunks.push(Buffer.from(chunk, 'binary'));
    });

    res.on('end', () => {
        let binary = Buffer.concat(chunks);
        // binary is now a Buffer that can be used as Uint8Array or as
        // any other TypedArray for data processing in NodeJS or 
        // passed on via the Buffer to something else.
    });
});

Take note of the res.setEncoding('binary'); and Buffer.from(chunk, 'binary') lines. The first sets the response encoding, and the second creates a Buffer object from the string delivered in that encoding.
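
The reason this pair of calls preserves the data can be illustrated without any network traffic. The following is a minimal sketch, assuming arbitrary sample bytes (they are not from the original answer): the 'binary' encoding maps every byte to one character with the same code point, so converting the string back with the same encoding should restore the exact bytes.

```javascript
// Arbitrary sample bytes, chosen to include values above 0x7f.
const original = Buffer.from([0x00, 0x7f, 0x80, 0xff, 0x89, 0x50, 0x4e, 0x47]);

// Roughly what a 'data' handler receives after res.setEncoding('binary'):
// one character per byte.
const asString = original.toString('binary');

// What Buffer.from(chunk, 'binary') does in the handler: one byte per character.
const restored = Buffer.from(asString, 'binary');

console.log(asString.length === original.length); // true
console.log(restored.equals(original));           // true
```

The round trip is lossless, but it still costs an extra string conversion per chunk.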

Answer 4

Answered by caffeinatedbits

Pärt Johanson, I wish I could comment just to thank you for saving me from the loop I've been stuck in all day: ripping my hair out, then reading the (incredibly unhelpful) Node docs on this, over and over. After finding your answer, I dug into the docs, and I can't even find the res.setEncoding method documented anywhere! It only appears in two examples, where they call res.setEncoding('utf8'). Where did you find this, or how did you figure it out!?

Since I don't have enough reputation to comment, I'll at least contribute something useful with my answer: Pärt Johanson's answer worked 100% for me. I just tweaked it a bit for my needs, because I'm using it to download and eval a script hosted on my server (and compiled with nwjc) using nw.Window.get().evalNWBin() on NWJS 0.36.4 / Node 11.11.0:

let opt = {...};
let req = require('https').request(opt, (res) => {
  // server error returned
  if (200 !== res.statusCode) {
    res.setEncoding('utf8');
    let data = '';
    res.on('data', (strData) => {
      data += strData;
    });
    res.on('end', () => {
      if (!res.complete) {
        console.log('Server error, incomplete response: ' + data);
      } else {
        console.log('Server error, response: ' + data);
      }
    });
  }
  // expected response
  else {
    res.setEncoding('binary');
    let data = [];
    res.on('data', (binData) => {
      data.push(Buffer.from(binData, 'binary'));
    });
    res.on('end', () => {
      data = Buffer.concat(data);
      if (!res.complete) {
        console.log('Request completed, incomplete response, ' + data.length + ' bytes received');
      } else {
        console.log('Request completed, ' + data.length + ' bytes received');
        nw.Window.get().evalNWBin(null, data);
      }
    });
  }
});
req.end(); // send the request; https.request() does not dispatch it until end() is called

Edit: P.S. I posted this in case anyone wants to know how to handle a non-binary response. My actual code goes a little deeper and checks the response Content-Type header to parse JSON (intended failure, e.g. 400, 401, 403) or HTML (unexpected failure, e.g. 404 or 500).

Answer 5

Answered by Naijia Liu

Don't call the setEncoding() method, because by default no encoding is assigned and stream data is returned as Buffer objects.
Call Buffer.from() in the on('data') callback to convert the chunk value to a Buffer object.

http.get('my_url', (response) => {
  const chunks = [];
  response.on('data', chunk => chunks.push(Buffer.from(chunk))) // Convert `chunk` to a `Buffer` object.
    .on('end', () => {
      const buffer = Buffer.concat(chunks);
      console.log(buffer.toString('base64'));
    });
});
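
One detail about the Buffer.from(chunk) call above: when the chunk is already a Buffer (the default when setEncoding() is never called), Buffer.from() returns a copy rather than the same object. The sketch below uses made-up sample bytes to show this; it suggests that pushing the chunk directly would likely also work in that case, at the cost of sharing memory with the stream's chunk.

```javascript
// Made-up sample chunk standing in for what the 'data' event delivers.
const chunk = Buffer.from([1, 2, 3]);

// Buffer.from(buffer) copies the bytes into a new Buffer.
const copy = Buffer.from(chunk);
copy[0] = 9;

console.log(Buffer.isBuffer(chunk)); // true
console.log(chunk[0]);               // 1 (the original is unaffected)
console.log(copy[0]);                // 9
```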

Answer 6

Answered by noseratio

Like others here, I needed to process binary data chunks from a Node.js HTTP response (aka http.IncomingMessage).

None of the existing answers really worked for my Electron 6 project (bundled with Node.js 12.4.0 at the time of posting), besides Pärt Johanson's answer and its variants.

Still, even with that solution, the chunks always arrived at the response.on('data', ondata) handler as string objects (rather than the expected and desired Buffer objects). That incurred an extra conversion with Buffer.from(chunk, 'binary'). I was getting strings regardless of whether I explicitly specified binary encoding with response.setEncoding('binary') or response.setEncoding(null).

The only way I managed to get the original Buffer chunks was to pipe the response to an instance of stream.Writable for which I provide a custom write method:

const https = require('https');
const { Writable } = require('stream');

async function getBinaryDataAsync(url) {
  // start HTTP request, get binary response
  const { request, response } = await new Promise((resolve, reject) => {
    const request = https.request(url, { 
      method: 'GET', 
        headers: { 
          'Accept': 'application/pdf', 
          'Accept-Encoding': 'identity'
        }        
      }
    );

    request.on('response', response => 
      resolve({request, response}));
    request.on('error', reject);
    request.end();
  });

  // read the binary response by piping it to stream.Writable
  const buffers = await new Promise((resolve, reject) => {

    response.on('aborted', reject);
    response.on('error', reject);

    const chunks = [];

    const stream = new Writable({
      write: (chunk, encoding, notifyComplete) => {
        try {
          chunks.push(chunk);
          notifyComplete();      
        }
        catch(error) {
          notifyComplete(error);      
        }
      }
    });

    stream.on('error', reject);
    stream.on('finish', () => resolve(chunks));
    response.pipe(stream);
  });

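  const buffer = Buffer.concat(buffers);
  // buffer.buffer may be a larger, shared ArrayBuffer, so copy out only this Buffer's bytes
  return buffer.buffer.slice(buffer.byteOffset, buffer.byteOffset + buffer.byteLength); // as ArrayBuffer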
}

async function main() {
  const arrayBuff = await getBinaryDataAsync('https://download.microsoft.com/download/8/A/4/8A48E46A-C355-4E5C-8417-E6ACD8A207D4/VisualStudioCode-TipsAndTricks-Vol.1.pdf');
  console.log(arrayBuff.byteLength);
};

main().catch(error => console.error(error));
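
A note on converting a Buffer to an ArrayBuffer, as the function above does at the end: a Buffer is a view that may sit at an offset inside a larger ArrayBuffer, so its .buffer property is not guaranteed to contain only that Buffer's bytes. The following sketch uses a hypothetical helper named toArrayBuffer (not part of the original answer) and a made-up sample string to copy out exactly the bytes the Buffer covers.

```javascript
// Hypothetical helper: copy exactly the bytes a Buffer covers
// into a standalone ArrayBuffer.
function toArrayBuffer(buf) {
  return buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);
}

// Small Buffers are often carved out of a shared, larger allocation.
const small = Buffer.from('%PDF-1.7');
const exact = toArrayBuffer(small);

console.log(small.buffer.byteLength >= small.byteLength); // true
console.log(exact.byteLength === small.byteLength);       // true
console.log(Buffer.from(exact).equals(small));            // true
```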

Update: as it turns out, this behavior only manifests with our Web API server. So response.on('data') actually works well for the sample URL I use in the above code snippet, and the stream is not needed for it. It's odd that this is server-specific; I'm investigating it further.

Answer 7

Answered by ShortFuse

Everyone here is on the right track, but to put the issue to bed: you cannot call .setEncoding() EVER.

If you call .setEncoding(), it will create a StringDecoder and set it as the default decoder. If you try to pass null or undefined, it will still create a StringDecoder with its default decoder of UTF-8. Even if you call .setEncoding('binary'), it's the same as calling .setEncoding('latin1'). Yes, seriously.
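
These claims can be checked on a plain Readable stream, without an HTTP request. The sketch below is an illustration under a few assumptions: the two streams and the sample bytes are made up, and the readableEncoding property requires a reasonably recent Node.js version. It should report latin1 for 'binary', utf8 for null, and show that a UTF-8 round trip does not restore arbitrary bytes.

```javascript
const { Readable } = require('stream');

// Two throwaway streams; read() is a no-op because no data is pushed.
const viaBinary = new Readable({ read() {} });
viaBinary.setEncoding('binary');

const viaNull = new Readable({ read() {} });
viaNull.setEncoding(null);

console.log(viaBinary.readableEncoding);
console.log(viaNull.readableEncoding);

// Made-up sample bytes (the start of a typical JPEG file).
const jpegHeader = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);

// Invalid UTF-8 sequences are replaced while decoding,
// so re-encoding the string yields different bytes.
const decoded = jpegHeader.toString('utf8');
const reencoded = Buffer.from(decoded, 'utf8');

console.log(reencoded.equals(jpegHeader)); // false
```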

I wish I could say you can just set ._readableState.encoding and ._readableState.decoder back to null, but when you call .setEncoding(), the buffer gets wiped and replaced with a binary encoding of the decoded string of what was there before. That means your data has already been changed.

If you want to "undo" the decoding, you have to re-encode the data stream back into binary like so:

  req.on('data', (chunk) => {
      let buffer;
      if (typeof chunk === 'string') {
        buffer = Buffer.from(chunk, req.readableEncoding);
      } else {
        buffer = chunk;
      }
      // Handle chunk
  });

Of course, if you never call .setEncoding(), then you don't have to worry about the chunk being returned as a string.

After you have your chunk as a Buffer, you can work with it as you choose. In the interest of thoroughness, here's how to use a preset buffer size, while also checking Content-Length:

const BUFFER_SIZE = 4096;

/**
 * @param {IncomingMessage} req
 * @return {Promise<Buffer>}
 */
function readEntireRequest(req) {
  return new Promise((resolve, reject) => {
    const expectedSize = parseInt(req.headers['content-length'], 10) || null;
    let data = Buffer.alloc(Math.min(BUFFER_SIZE, expectedSize || BUFFER_SIZE));
    let bytesWritten = 0;
    req.on('data', (chunk) => {
      if ((chunk.length + bytesWritten) > data.length) {
        // Buffer is too small. Double it.
        let newLength = data.length * 2;
        while (newLength < chunk.length + data.length) {
          newLength *= 2;
        }
        const newBuffer = Buffer.alloc(newLength);
        data.copy(newBuffer);
        data = newBuffer;
      }
      bytesWritten += chunk.copy(data, bytesWritten);
      if (bytesWritten === expectedSize) {
        // If we trust Content-Length, we could return immediately here.
      }
    });
    req.on('end', () => {
      if (data.length > bytesWritten) {
        // Return a slice of the original buffer
        data = data.subarray(0, bytesWritten);
      }
      resolve(data);
    });
    req.on('error', (err) => {
      reject(err);
    });
  });
}

The point of using a buffer size here is to avoid reserving a large amount of memory up front and instead allocate RAM only as needed. The Promise functionality is just for convenience.
