Original URL: http://stackoverflow.com/questions/14269233/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share them, but you must attribute them to the original authors (not me): Stack Overflow
Node.js: How to read a stream into a buffer?
Asked by Gal Ben-Haim
I wrote a pretty simple function that downloads an image from a given URL, resizes it, and uploads it to S3 (using 'gm' and 'knox'). I have no idea if I'm reading the stream into a buffer correctly. (Everything works, but is it the correct way?)
Also, I want to understand something about the event loop: how do I know that one invocation of the function won't leak anything, or change the 'buf' variable of another already-running invocation (or is this scenario impossible because the callbacks are anonymous functions)?
var http = require('http');
var https = require('https');
var s3 = require('./s3');
var gm = require('gm');

module.exports.processImageUrl = function(imageUrl, filename, callback) {
    var client = http;
    if (imageUrl.substr(0, 5) == 'https') { client = https; }

    client.get(imageUrl, function(res) {
        if (res.statusCode != 200) {
            return callback(new Error('HTTP Response code ' + res.statusCode));
        }

        gm(res)
            .geometry(1024, 768, '>')
            .stream('jpg', function(err, stdout, stderr) {
                if (!err) {
                    var buf = new Buffer(0);
                    stdout.on('data', function(d) {
                        buf = Buffer.concat([buf, d]);
                    });
                    stdout.on('end', function() {
                        var headers = {
                            'Content-Length': buf.length,
                            'Content-Type': 'Image/jpeg',
                            'x-amz-acl': 'public-read'
                        };
                        s3.putBuffer(buf, '/img/d/' + filename + '.jpg', headers, function(err, res) {
                            if (err) {
                                return callback(err);
                            } else {
                                return callback(null, res.client._httpMessage.url);
                            }
                        });
                    });
                } else {
                    callback(err);
                }
            });
    }).on('error', function(err) {
        callback(err);
    });
};
Answered by loganfsmyth
Overall I don't see anything that would break in your code.
Two suggestions:
The way you are combining Buffer objects is suboptimal, because it has to copy all the pre-existing data on every 'data' event. It would be better to put the chunks in an array and concat them all at the end.
var bufs = [];
stdout.on('data', function(d) { bufs.push(d); });
stdout.on('end', function() {
    var buf = Buffer.concat(bufs);
});
For performance, I would look into whether the S3 library you are using supports streams. Ideally you wouldn't need to create one large buffer at all, and could instead just pass the stdout stream directly to the S3 library.
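For illustration, here is a minimal sketch of that idea against the question's code, assuming the local s3 wrapper exposes a knox-style putStream(stream, path, headers, cb). One caveat: S3's plain PUT wants a Content-Length up front, which is exactly why the update below still collects the chunks just long enough to compute it.

// Hypothetical sketch: stream gm's output straight to S3 instead of
// buffering it first. Assumes s3.putStream has knox's signature.
gm(res)
    .geometry(1024, 768, '>')
    .stream('jpg', function(err, stdout, stderr) {
        if (err) return callback(err);
        var headers = {
            'Content-Type': 'image/jpeg',
            'x-amz-acl': 'public-read'
            // 'Content-Length' is unknown at this point -- see the update below
        };
        s3.putStream(stdout, '/img/d/' + filename + '.jpg', headers, callback);
    });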
As for the second part of your question, that isn't possible. When a function is called, it is allocated its own private context, and everything defined inside it is only accessible from other code defined inside that function.
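A tiny illustration of that point (hypothetical names): each call gets its own scope, so concurrent invocations cannot clobber each other's variables.

// Each invocation allocates a fresh scope; the inner function
// closes over its own copy of `count`.
function makeCounter() {
    var count = 0;
    return function() { return ++count; };
}

var a = makeCounter();
var b = makeCounter();
a(); a();           // a's count is now 2
console.log(b());   // prints 1 -- b's count is independent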
Update
Dumping the file to the filesystem would probably mean less memory usage per request, but file IO can be pretty slow so it might not be worth it. I'd say that you shouldn't optimize too much until you can profile and stress-test this function. If the garbage collector is doing its job you may be overoptimizing.
With all that said, there are better ways anyway, so don't use files. Since all you want is the length, you can calculate it without needing to append all of the buffers together, so you don't need to allocate a new Buffer at all.
var pause_stream = require('pause-stream');

// Your other code.

var bufs = [];
stdout.on('data', function(d) { bufs.push(d); });
stdout.on('end', function() {
    var contentLength = bufs.reduce(function(sum, buf) {
        return sum + buf.length;
    }, 0);

    // Create a stream that will emit your chunks when resumed.
    var stream = pause_stream();
    stream.pause();
    while (bufs.length) stream.write(bufs.shift());
    stream.end();

    var headers = {
        'Content-Length': contentLength,
        // ...
    };

    s3.putStream(stream, ....);
});
Answered by Tiberiu-Ionuț Stan
You can easily do this using node-fetch if you are pulling from http(s) URIs.
From the readme:
fetch('https://assets-cdn.github.com/images/modules/logos_page/Octocat.png')
    .then(res => res.buffer())
    .then(buffer => console.log(buffer));
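Note that on newer node-fetch versions (v3+) res.buffer() is deprecated, and Node's built-in fetch (Node 18+) never had it; as far as I know the equivalent is arrayBuffer(). A hedged sketch:

// Sketch for node-fetch v3+ or Node 18+'s built-in fetch, where
// arrayBuffer() replaces the old buffer() helper.
(async () => {
    const res = await fetch('https://assets-cdn.github.com/images/modules/logos_page/Octocat.png');
    const buffer = Buffer.from(await res.arrayBuffer());
    console.log(buffer);
})();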
Answered by Maddocks
I suggest loganfsmyth's method, using an array to hold the data.
var bufs = [];
stdout.on('data', function(d) { bufs.push(d); });
stdout.on('end', function() {
    var buf = Buffer.concat(bufs);
});
In my current working example, I am working with GridFS and npm's Jimp.
var data = []; // chunks collected from the download stream

var bucket = new GridFSBucket(getDBReference(), { bucketName: 'images' });
var dwnldStream = bucket.openDownloadStream(info[0]._id); // original size
dwnldStream.on('data', function(chunk) {
    data.push(chunk);
});
dwnldStream.on('end', function() {
    var buff = Buffer.concat(data);
    console.log("buffer: ", buff);
    jimp.read(buff)
        .then(image => {
            console.log("read the image!");
            IMAGE_SIZES.forEach((size) => {
                resize(image, size);
            });
        });
});
I did some other research with a string method, but that did not work, perhaps because I was reading from an image file; the array method did work, though.
const DISCLAIMER = "DONT DO THIS";
var bufs = "";
stdout.on('data', function(d) {
    bufs += d; // implicitly coerces each Buffer chunk to a string
});
stdout.on('end', function() {
    var buf = Buffer.from(bufs);
    // do work with the buffer here
});
When I used the string method, I got this error from npm Jimp:
buffer: <Buffer 00 00 00 00 00>
{ Error: Could not find MIME for Buffer <null>
Basically, I think the type coercion from binary to string didn't work so well.
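That matches how string coercion works: appending a Buffer to a string decodes it as UTF-8, and any byte sequence that isn't valid UTF-8 is replaced, so binary data such as an image gets mangled. A quick demonstration:

// Invalid UTF-8 bytes become U+FFFD replacement characters when a
// Buffer is coerced to a string, so the round trip is lossy.
var chunk = Buffer.from([0xff, 0xd8, 0xff]); // start of a JPEG
var str = '' + chunk;                        // implicit .toString('utf8')
console.log(Buffer.from(str));               // <Buffer ef bf bd ef bf bd ef bf bd>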
Answered by Andrey Sidorov
I suggest having an array of buffers and concatenating into the resulting buffer only once, at the end. It's easy to do manually, or you could use node-buffers.
Answered by Angelos Veglektsis
I just want to post my solution. The previous answers were pretty helpful for my research. I use length-stream to get the size of the stream, but the problem here is that its callback is fired near the end of the stream, so I also use stream-cache to cache the stream and pipe it to the res object once I know the content length. In case of an error, the stream's 'error' event is forwarded to the callback:
var StreamCache = require('stream-cache');
var lengthStream = require('length-stream');

var _streamFile = function(res, stream, cb) {
    var cache = new StreamCache();

    var lstream = lengthStream(function(length) {
        res.header("Content-Length", length);
        cache.pipe(res);
    });

    stream.on('error', function(err) {
        return cb(err);
    });

    stream.on('end', function() {
        return cb(null, true);
    });

    return stream.pipe(lstream).pipe(cache);
};
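A hypothetical usage, assuming an Express-style handler (app, getFileStream, and the route are stand-ins, not part of the original answer):

// Stream a stored file to the HTTP response; Content-Length is set
// once length-stream has measured the whole stream.
app.get('/file/:id', function(req, res) {
    var stream = getFileStream(req.params.id); // any readable stream
    _streamFile(res, stream, function(err) {
        if (err) res.status(500).end();
    });
});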
Answered by u11414307
In TS, [].push(bufferPart) is not compatible;
so:
getBufferFromStream(stream: Part | null): Promise<Buffer> {
    if (!stream) {
        throw 'FILE_STREAM_EMPTY';
    }
    return new Promise(
        (r, j) => {
            let buffer = Buffer.from([]);
            stream.on('data', buf => {
                buffer = Buffer.concat([buffer, buf]);
            });
            stream.on('end', () => r(buffer));
            stream.on('error', j);
        }
    );
}
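Hypothetical usage, where filePart stands for whatever Part object your upload framework hands you:

// Awaiting the helper yields the whole stream as one Buffer.
const buffer = await getBufferFromStream(filePart);
console.log(buffer.length);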

