Read a file one line at a time in node.js?

Note: this is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow, original source: http://stackoverflow.com/questions/6156501/



Tags: javascript, node.js, file-io, lazy-evaluation

Asked by Alex C

I am trying to read a large file one line at a time. I found a question on Quora that dealt with the subject, but I'm missing some connections to make the whole thing fit together.


 var Lazy=require("lazy");
 new Lazy(process.stdin)
     .lines
     .forEach(
          function(line) { 
              console.log(line.toString()); 
          }
 );
 process.stdin.resume();

The bit that I'd like to figure out is how I might read one line at a time from a file instead of STDIN as in this sample.


I tried:


 fs.open('./VeryBigFile.csv', 'r', '0666', Process);

 function Process(err, fd) {
    if (err) throw err;
    // DO lazy read 
 }

but it's not working. I know that in a pinch I could fall back to using something like PHP, but I would like to figure this out.


I don't think the other answer would work as the file is much larger than the server I'm running it on has memory for.


Answered by Dan Dascalescu

Since Node.js v0.12, and as of Node.js v4.0.0, there is a stable readline core module. Here's the easiest way to read lines from a file, without any external modules:


const fs = require('fs');
const readline = require('readline');

async function processLineByLine() {
  const fileStream = fs.createReadStream('input.txt');

  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity
  });
  // Note: we use the crlfDelay option to recognize all instances of CR LF
  // ('\r\n') in input.txt as a single line break.

  for await (const line of rl) {
    // Each line in input.txt will be successively available here as `line`.
    console.log(`Line from file: ${line}`);
  }
}

processLineByLine();

Or alternatively:


var lineReader = require('readline').createInterface({
  input: require('fs').createReadStream('file.in')
});

lineReader.on('line', function (line) {
  console.log('Line from file:', line);
});

The last line is read correctly (as of Node v0.12 or later), even if there is no final \n.


UPDATE: this example has been added to Node's official API documentation.


Answered by kofrasa

For such a simple operation there shouldn't be any dependency on third-party modules. Go easy.


var fs = require('fs'),
    readline = require('readline');

var rd = readline.createInterface({
    input: fs.createReadStream('/path/to/file'),
    output: process.stdout,
    terminal: false // note: 'console' is not a readline option; 'terminal: false' is what prevents the input being echoed
});

rd.on('line', function(line) {
    console.log(line);
});

Answered by Raynos

You don't have to open the file; instead, you have to create a ReadStream:


fs.createReadStream

Then pass that stream to Lazy.

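Putting that together, a minimal sketch (reusing the Lazy API exactly as in the question's stdin example, with the question's file path; this is an illustration, not the answerer's literal code):

var Lazy = require("lazy");
var fs = require("fs");

// pass a file ReadStream to Lazy instead of process.stdin
new Lazy(fs.createReadStream('./VeryBigFile.csv'))
    .lines
    .forEach(function(line) {
        console.log(line.toString());
    });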

Answered by polaretto

There is a very nice module for reading a file line by line; it's called line-reader.


With it, you simply write:


var lineReader = require('line-reader');

lineReader.eachLine('file.txt', function(line, last) {
  console.log(line);
  // do whatever you want with line...
  if(last){
    // or check if it's the last one
  }
});

You can even iterate over the file with a "java-style" interface, if you need more control:


lineReader.open('file.txt', function(reader) {
  if (reader.hasNextLine()) {
    reader.nextLine(function(line) {
      console.log(line);
      // to keep reading, call reader.nextLine() again here while reader.hasNextLine()
    });
  }
});

Answered by John Williams

// Note: readFileSync loads the entire file into memory at once,
// so this only suits files that comfortably fit in RAM.
require('fs').readFileSync('file.txt', 'utf-8').split(/\r?\n/).forEach(function(line) {
  console.log(line);
})

Answered by Lead Developer

Update in 2019


An awesome example is already posted in the official Node.js documentation, here.


This requires Node.js >= 11.4 to be installed on your machine.


const fs = require('fs');
const readline = require('readline');

async function processLineByLine() {
  const fileStream = fs.createReadStream('input.txt');

  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity
  });
  // Note: we use the crlfDelay option to recognize all instances of CR LF
  // ('\r\n') in input.txt as a single line break.

  for await (const line of rl) {
    // Each line in input.txt will be successively available here as `line`.
    console.log(`Line from file: ${line}`);
  }
}

processLineByLine();

Answered by nf071590

Old topic, but this works:


var readline = require('readline');
var fs = require('fs');

var rl = readline.createInterface({
    input: fs.createReadStream('/path/file.txt'),
    output: process.stdout,
    terminal: false
})
rl.on('line', function(line) {
    console.log(line) // or parse line
})

Simple. No need for an external module.


Answered by Ernelli

You can always roll your own line reader. I haven't benchmarked this snippet yet, but it correctly splits the incoming stream of chunks into lines without the trailing '\n'.


var last = "";

process.stdin.on('data', function(chunk) {
    var lines, i;

    lines = (last+chunk).split("\n");
    for(i = 0; i < lines.length - 1; i++) {
        console.log("line: " + lines[i]);
    }
    last = lines[i];
});

process.stdin.on('end', function() {
    console.log("line: " + last);
});

process.stdin.resume();

I did come up with this when working on a quick log parsing script that needed to accumulate data during the log parsing, and I felt that it would be nice to try doing this using js and node instead of using perl or bash.


Anyway, I do feel that small nodejs scripts should be self-contained and not rely on third-party modules, so after reading all the answers to this question, each using various modules to handle line parsing, a 13 SLOC native nodejs solution might be of interest.

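The same pattern works for a file instead of stdin by swapping process.stdin for a read stream — a sketch, assuming a hypothetical app.log:

var fs = require("fs");
var last = "";
var stream = fs.createReadStream("app.log", { encoding: "utf8" });

stream.on('data', function(chunk) {
    var lines, i;

    // carry any partial trailing line over to the next chunk
    lines = (last + chunk).split("\n");
    for (i = 0; i < lines.length - 1; i++) {
        console.log("line: " + lines[i]);
    }
    last = lines[i];
});

stream.on('end', function() {
    console.log("line: " + last);
});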

Answered by Touv

With the carrier module:


var carrier = require('carrier');

process.stdin.resume();
carrier.carry(process.stdin, function(line) {
    console.log('got one line: ' + line);
});

Answered by j03m

I ended up with a massive, massive memory leak using Lazy to read line by line when trying to then process those lines and write them to another stream, due to the way drain/pause/resume works in node (see: http://elegantcode.com/2011/04/06/taking-baby-steps-with-node-js-pumping-data-between-streams/ (I love this guy, btw)). I haven't looked closely enough at Lazy to understand exactly why, but I couldn't pause my read stream to allow for a drain without Lazy exiting.


I wrote the code to process massive csv files into xml docs, you can see the code here: https://github.com/j03m/node-csv2xml


If you run the previous revisions with the Lazy-based line reading, they leak. The latest revision doesn't leak at all, and you can probably use it as the basis for a reader/processor, though I have some custom stuff in there.


Edit: I guess I should also note that my code with Lazy worked fine until I found myself writing xml fragments large enough that drain/pause/resume became a necessity. For smaller chunks it was fine.

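For illustration, here is a minimal sketch of the drain/pause/resume pattern this answer is wrestling with, written against the core readline module rather than Lazy (the file names and the trivial toXml() transform are hypothetical):

var fs = require('fs');
var readline = require('readline');

var out = fs.createWriteStream('output.xml');
var rl = readline.createInterface({
    input: fs.createReadStream('input.csv'),
    terminal: false
});

// hypothetical transform: wrap each csv line in an xml element
function toXml(line) {
    return '<row>' + line + '</row>\n';
}

rl.on('line', function(line) {
    // write() returns false once the write buffer is full;
    // pause reading until the writable stream drains.
    if (!out.write(toXml(line))) {
        rl.pause();
        out.once('drain', function() {
            rl.resume();
        });
    }
});

rl.on('close', function() {
    out.end();
});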