Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/11307257/

Time: 2020-09-09 22:22:12  Source: igfitidea

Is there a bash command which counts files?

bash

Asked by hudi

Is there a bash command which counts the number of files that match a pattern?

For example, I want to get the count of all files in a directory which match this pattern: log*

Answered by Daniel

This simple one-liner should work in any shell, not just bash:

ls -1q log* | wc -l

ls -1q will give you one line per file, even if they contain whitespace or special characters such as newlines.

The output is piped to wc -l, which counts the number of lines.

Answered by Mat

You can do this safely (i.e. it won't be tripped up by files with spaces or \n in their names) with bash:

$ shopt -s nullglob
$ logfiles=(*.log)
$ echo ${#logfiles[@]}

You need to enable nullglob so that you don't get the literal *.log in the $logfiles array if no files match. (See How to "undo" a 'set -x'? for examples of how to safely reset it.)
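
As a hedged sketch of the nullglob approach above: if the function body is a subshell, the option change cannot leak into the calling shell, so nothing needs to be reset afterwards. The name count_logs and its optional directory argument are assumptions made for illustration and are not part of the original answer; the code needs bash.

```shell
# count_logs [DIR]: print how many *.log entries DIR (default: .) contains.
# The ( ... ) body runs in a subshell, so nullglob and cd stay local to it.
count_logs() (
  shopt -s nullglob              # an unmatched glob expands to nothing
  cd "${1:-.}" || exit 1
  logfiles=(*.log)               # one array element per matching name
  echo "${#logfiles[@]}"
)
```

Calling count_logs with no argument counts in the current directory; passing a directory counts there instead. Because the glob is expanded by the shell itself, names containing spaces or newlines each count once.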

Answered by mogsie

Lots of answers here, but some don't take into account:

  • file names with spaces, newlines, or control characters in them
  • file names that start with hyphens (imagine a file called -l)
  • hidden files, that start with a dot (if the glob was *.log instead of log*)
  • directories that match the glob (e.g. a directory called logs that matches log*)
  • empty directories (i.e. the result is 0)
  • extremely large directories (listing them all could exhaust memory)

Here's a solution that handles all of them:

ls 2>/dev/null -Ubad1 -- log* | wc -l

Explanation:

  • -U causes ls to not sort the entries, meaning it doesn't need to load the entire directory listing in memory
  • -b prints C-style escapes for nongraphic characters, crucially causing newlines to be printed as \n.
  • -a prints out all files, even hidden files (not strictly needed when the glob log* implies no hidden files)
  • -d prints out directories without attempting to list the contents of the directory, which is what ls normally would do
  • -1 makes sure that it's on one column (ls does this automatically when writing to a pipe, so it's not strictly necessary)
  • 2>/dev/null redirects stderr so that if there are 0 log files, the error message is ignored. (Note that shopt -s nullglob would cause ls to list the entire working directory instead.)
  • wc -l consumes the directory listing as it's being generated, so the output of ls is never in memory at any point in time.
  • -- File names are separated from the command using -- so as not to be understood as arguments to ls (in case log* is removed)

The shell will expand log* to the full list of files, which may exhaust memory if there are a lot of files, so running it through grep is better:

ls -Uba1 | grep ^log | wc -l

This last one handles extremely large directories of files without using a lot of memory (albeit it does use a subshell). The -d is no longer necessary, because it's only listing the contents of the current directory.

Answered by Will Vousden

For a recursive search:

find . -type f -name '*.log' -printf x | wc -c

wc -c will count the number of characters in the output of find, while -printf x tells find to print a single x for each result.

For a non-recursive search, do this:

find . -maxdepth 1 -type f -name '*.log' -printf x | wc -c

Answered by Dan Yard

The accepted answer for this question is wrong, but I have low rep so can't add a comment to it.

The correct answer to this question is given by Mat:

shopt -s nullglob
logfiles=(*.log)
echo ${#logfiles[@]}

The problem with the accepted answer is that wc -l counts the number of newline characters, and counts them even if they print to the terminal as '?' in the output of 'ls -l'. This means that the accepted answer FAILS when a filename contains a newline character. I have tested the suggested command:

ls -l log* | wc -l

and it erroneously reports a value of 2 even if there is only 1 file matching the pattern whose name happens to contain a newline character. For example:

touch log$'\n'def
ls log* -l | wc -l
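
The reproduction above can be tried without touching real files. The following is a minimal sketch that works in a scratch directory and prints both the line count from ls and the number of names the shell's own glob expansion produces; the function name demo_newline_name is hypothetical, and mktemp -d is assumed to be available.

```shell
# demo_newline_name: compare "count lines" with "count glob matches"
# for a single file whose name contains a newline.
demo_newline_name() (
  tmp=$(mktemp -d) || exit 1
  cd "$tmp" || exit 1
  touch "$(printf 'log\ndef')"          # one file, newline inside its name
  printf 'lines from ls: %s\n' "$(ls log* | wc -l)"
  set -- log*                           # let the shell expand the glob
  printf 'glob matches: %s\n' "$#"      # number of names, not of lines
  cd / && rm -rf "$tmp"
)
```

The "glob matches" line reports 1 because the shell hands over one argument per file, whatever characters the name contains. The "lines from ls" figure depends on how the installed ls writes the newline to a pipe, which is exactly the weakness described in this answer.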

Answered by mogsie

If you have a lot of files and you don't want to use the elegant shopt -s nullglob and bash array solution, you can use find and so on as long as you don't print out the file name (which might contain newlines).

find -maxdepth 1 -name "log*" -not -name ".*" -printf '%i\n' | wc -l

This will find all files that match log* and that don't start with a dot. The -not -name ".*" test is redundant here, but it's important to note that the default for ls is to not show dot-files, whereas the default for find is to include them.

This is a correct answer, and handles any type of file name you can throw at it, because the file name is never passed around between commands.

But the shopt nullglob answer is the best answer!

Answered by zee

Here is my one-liner for this.

 file_count=$( shopt -s nullglob ; set -- "$directory_to_search_inside"/* ; echo $#)
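
The same positional-parameter idea can be written without nullglob, which may help in shells other than bash. This is a hedged sketch rather than part of the original answer: the function name count_entries is hypothetical, and like the one-liner it counts every non-hidden entry, directories included.

```shell
# count_entries DIR: print how many non-hidden entries DIR contains.
# Uses only POSIX sh features; the ( ... ) body keeps cd and set -- local.
count_entries() (
  cd "$1" || exit 1
  set -- *                       # expand the glob into $1, $2, ...
  # Without nullglob an unmatched glob stays as the literal "*", so make
  # sure the first result really exists before trusting $#.
  if [ -e "$1" ] || [ -L "$1" ]; then
    echo "$#"
  else
    echo 0
  fi
)
```

A call such as file_count=$(count_entries "$directory_to_search_inside") then plays the role of the one-liner. The extra existence check is the price of not having nullglob: it distinguishes an empty directory from a directory holding one entry.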

Answered by Moh .S

You can use the -R option to find the files along with those inside subdirectories, recursively:

ls -R | wc -l   # to find all the files

ls -R | grep log | wc -l   # to find the files whose names contain the word log

You can use patterns with grep.

Answered by bballdave025

I've given this answer a lot of thought, especially given the don't-parse-ls stuff. At first, I tried

<WARNING! DID NOT WORK>
du --inodes --files0-from=<(find . -maxdepth 1 -type f -print0) | awk '{sum+=int($1)}END{print sum}'
</WARNING! DID NOT WORK>

which worked if there was only a filename like

touch $'w\nlf.aa'

but failed if I made a filename like this

touch $'firstline\n3 and some other\n1\n2\texciting\n86stuff.jpg'

I finally came up with what I'm putting below. Note I was trying to get a count of all files in the directory (not including any subdirectories). I think it is correct, as are the answers by @Mat and @Dan_Yard, and it meets at least most of the requirements set out by @mogsie (I'm not sure about memory). I think the answer by @mogsie is correct, but I always try to stay away from parsing ls unless it's an extremely specific situation.

awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'

More readably:

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'

This does a find specifically for files, delimiting the output with a null character (to avoid problems with spaces and linefeeds), then counts the null characters. A record containing N null characters splits into N+1 fields, so NF-1 is the number of null characters in that record; since find prints one null character after each file name, the sum over all records is the number of files.
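
The null-counting idea can also be sketched without awk, by deleting everything except the null characters and counting what is left. This is an illustrative alternative, not the author's command: the function name count_files_nul and its two parameters are hypothetical, and it assumes a find that supports -print0 (GNU and BSD find do).

```shell
# count_files_nul DIR PATTERN: count regular files under DIR (recursively)
# whose names match PATTERN.
count_files_nul() {
  # -print0 ends every name with a NUL, tr -dc '\0' deletes every byte
  # that is not a NUL, and wc -c counts the NULs that remain.
  # $(( ... )) strips the padding some wc implementations add.
  echo $(( $(find "$1" -type f -name "$2" -print0 | tr -dc '\0' | wc -c) ))
}
```

For example, count_files_nul . 'log*' counts matching files below the current directory. The file names themselves never reach wc, so newlines inside them cannot inflate the result.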

To answer the OP's question, there are two cases to consider

1) Non-recursive search:

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -name "log*" -print0) | \
    awk '{sum+=$1}END{print sum}'

2) Recursive search. Note that what's inside the -name parameter might need to be changed for slightly different behavior (hidden files, etc.).

awk -F"\0" '{print NF-1}' < \
  <(find . -type f -name "log*" -print0) | \
    awk '{sum+=$1}END{print sum}'

If anyone would like to comment on how these answers compare to those I've mentioned in this answer, please do.

Note, I got to this thought process while getting this answer.

Answered by Shuang Liang

Here's what I always do:

ls log* | awk 'END{print NR}'
