node.js fs.readdir recursive directory search

Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. If you reuse or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original: http://stackoverflow.com/questions/5827612/

Date: 2020-09-02 14:07:46  Source: igfitidea

Tags: node.js, readdir

Asked by crawf

Any ideas on an async directory search using fs.readdir? I realise that we could introduce recursion and call the read directory function with the next directory to read, but am a little worried about it not being async...

Any ideas? I've looked at node-walk, which is great, but doesn't give me just the files in an array, like readdir does.

Looking for output like...

['file1.txt', 'file2.txt', 'dir/file3.txt']

Answered by chjj

There are basically two ways of accomplishing this. In an async environment you'll notice that there are two kinds of loops: serial and parallel. A serial loop waits for one iteration to complete before it moves on to the next, which guarantees that every iteration finishes in order. In a parallel loop, all the iterations are started at the same time, and one may complete before another; this makes it much faster than a serial loop. So in this case it's probably better to use a parallel loop, because it doesn't matter what order the walk completes in as long as it completes and returns the results (unless you want them in order).

A parallel loop would look like this:

var fs = require('fs');
var path = require('path');
var walk = function(dir, done) {
  var results = [];
  fs.readdir(dir, function(err, list) {
    if (err) return done(err);
    var pending = list.length;
    if (!pending) return done(null, results);
    list.forEach(function(file) {
      file = path.resolve(dir, file);
      fs.stat(file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          walk(file, function(err, res) {
            results = results.concat(res);
            if (!--pending) done(null, results);
          });
        } else {
          results.push(file);
          if (!--pending) done(null, results);
        }
      });
    });
  });
};

A serial loop would look like this:

var fs = require('fs');
var path = require('path');
var walk = function(dir, done) {
  var results = [];
  fs.readdir(dir, function(err, list) {
    if (err) return done(err);
    var i = 0;
    (function next() {
      var file = list[i++];
      if (!file) return done(null, results);
      file = path.resolve(dir, file);
      fs.stat(file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          walk(file, function(err, res) {
            results = results.concat(res);
            next();
          });
        } else {
          results.push(file);
          next();
        }
      });
    })();
  });
};

And to test it out on your home directory (WARNING: the results list will be huge if you have a lot of stuff in your home directory):

walk(process.env.HOME, function(err, results) {
  if (err) throw err;
  console.log(results);
});

EDIT: Improved examples.

Answered by qwtel

This one uses the maximum amount of new, buzzwordy features available in node 8, including Promises, util/promisify, destructuring, async-await, map+reduce and more, making your co-workers scratch their heads as they try to figure out what is going on.

Node 8+

No external dependencies.

const { promisify } = require('util');
const { resolve } = require('path');
const fs = require('fs');
const readdir = promisify(fs.readdir);
const stat = promisify(fs.stat);

async function getFiles(dir) {
  const subdirs = await readdir(dir);
  const files = await Promise.all(subdirs.map(async (subdir) => {
    const res = resolve(dir, subdir);
    return (await stat(res)).isDirectory() ? getFiles(res) : res;
  }));
  return files.reduce((a, f) => a.concat(f), []);
}

Usage

getFiles(__dirname)
  .then(files => console.log(files))
  .catch(e => console.error(e));

Node 10.10+

Updated for node 10+ with even more whizbang:

const { resolve } = require('path');
const { readdir } = require('fs').promises;

async function getFiles(dir) {
  const dirents = await readdir(dir, { withFileTypes: true });
  const files = await Promise.all(dirents.map((dirent) => {
    const res = resolve(dir, dirent.name);
    return dirent.isDirectory() ? getFiles(res) : res;
  }));
  return Array.prototype.concat(...files);
}

Note that starting with node 11.15.0 you can use files.flat() instead of Array.prototype.concat(...files) to flatten the files array.

Node 11+

If you want to blow everybody's head up completely, you can use the following version based on async iterators. In addition to being really cool, it also lets consumers pull out results one at a time, making it better suited for really large directories.

const { resolve } = require('path');
const { readdir } = require('fs').promises;

async function* getFiles(dir) {
  const dirents = await readdir(dir, { withFileTypes: true });
  for (const dirent of dirents) {
    const res = resolve(dir, dirent.name);
    if (dirent.isDirectory()) {
      yield* getFiles(res);
    } else {
      yield res;
    }
  }
}

Usage has changed because the return type is now an async iterator instead of a promise:

(async () => {
  for await (const f of getFiles('.')) {
    console.log(f);
  }
})()

In case somebody is interested, I've written more about async iterators here: https://qwtel.com/posts/software/async-generators-in-the-wild/

Answered by Victor Powell

Just in case anyone finds it useful, I also put together a synchronous version.

var fs = require('fs');

var walk = function(dir) {
    var results = [];
    var list = fs.readdirSync(dir);
    list.forEach(function(file) {
        file = dir + '/' + file;
        var stat = fs.statSync(file);
        if (stat && stat.isDirectory()) {
            /* Recurse into a subdirectory */
            results = results.concat(walk(file));
        } else {
            /* Is a file */
            results.push(file);
        }
    });
    return results;
}

Tip: to use fewer resources when filtering, filter within this function itself, e.g. replace results.push(file); with the code below. Adjust as required:

    var file_type = file.split(".").pop();
    var file_name = file.split(/(\\|\/)/g).pop();
    if (file_type == "json") results.push(file);

Answered by Johann Philipp Strathausen

A. Have a look at the file module. It has a function called walk:

file.walk(start, callback)

Navigates a file tree, calling callback for each directory, passing in (null, dirPath, dirs, files).

This may be for you! And yes, it is async. However, I think you would have to aggregate the full paths yourself if you needed them.

B. An alternative, and one of my favourites: use the unix find for that. Why do something again that has already been programmed? Maybe not exactly what you need, but still worth checking out:

var execFile = require('child_process').execFile;
execFile('find', [ 'somepath/' ], function(err, stdout, stderr) {
  var file_list = stdout.split('\n');
  /* now you've got a list with full path file names */
});
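
One caveat: find's output ends with a trailing newline, so the split leaves an empty string at the end of file_list. A sketch of cleaning it up (the toFileList name is mine):

```javascript
// sketch: turn find's stdout into a list of paths, dropping the
// empty entry produced by the trailing newline
function toFileList(stdout) {
  return stdout.split('\n').filter(function (line) {
    return line.length > 0;
  });
}
```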

Find has a nice built-in caching mechanism that makes subsequent searches very fast, as long as only a few folders have changed.

Answered by Thorsten Lorenz

Another nice npm package is glob.

npm install glob

It is very powerful and should cover all your recursing needs.

Edit:

I actually wasn't perfectly happy with glob, so I created readdirp.

I'm very confident that its API makes finding files and directories recursively and applying specific filters very easy.

Read through its documentation to get a better idea of what it does and install via:

npm install readdirp

Answered by Diogo Cardoso

I recommend using node-glob to accomplish that task.

var glob = require( 'glob' );  

glob( 'dirname/**/*.js', function( err, files ) {
  console.log( files );
});

Answered by Domenic

If you want to use an npm package, wrench is pretty good.

var wrench = require("wrench");

var files = wrench.readdirSyncRecursive("directory");

wrench.readdirRecursive("directory", function (error, files) {
    // live your dreams
});

EDIT (2018):
Anyone reading through in recent times: the author deprecated this package in 2015:

wrench.js is deprecated, and hasn't been updated in quite some time. I heavily recommend using fs-extra to do any extra filesystem operations.

Answered by kalisjoshua

I loved the answer from chjj above and would not have been able to create my version of the parallel loop without that start.

var fs = require("fs");

var tree = function(dir, done) {
  var results = {
        "path": dir
        ,"children": []
      };
  fs.readdir(dir, function(err, list) {
    if (err) { return done(err); }
    var pending = list.length;
    if (!pending) { return done(null, results); }
    list.forEach(function(file) {
      fs.stat(dir + '/' + file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          tree(dir + '/' + file, function(err, res) {
            results.children.push(res);
            if (!--pending){ done(null, results); }
          });
        } else {
          results.children.push({"path": dir + "/" + file});
          if (!--pending) { done(null, results); }
        }
      });
    });
  });
};

module.exports = tree;

I created a Gist as well. Comments welcome. I am still starting out in the NodeJS realm, so that is one way I hope to learn more.

Answered by Christiaan Westerbeek

Use node-dir to produce exactly the output you like

var dir = require('node-dir');

dir.files(__dirname, function(err, files) {
  if (err) throw err;
  console.log(files);
  //we have an array of files now, so now we can iterate that array
  files.forEach(function(path) {
    action(null, path);
  })
});

Answered by Loourr

With Recursion

var fs = require('fs')
var path = process.cwd()
var files = []

var getFiles = function(path, files){
    fs.readdirSync(path).forEach(function(file){
        var subpath = path + '/' + file;
        if(fs.lstatSync(subpath).isDirectory()){
            getFiles(subpath, files);
        } else {
            // subpath already holds path + '/' + file
            files.push(subpath);
        }
    });
}

Calling

getFiles(path, files)
console.log(files) // will log all files in directory