Bash: how to loop through file names returned by find?

Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/9612090/

Time: 2020-09-09 21:45:37  Source: igfitidea

How to loop through file names returned by find?

Tags: bash, find

Asked by Haiyuan Zhang

x=$(find . -name "*.txt")
echo $x

If I run the code above in a Bash shell, what I get is a single string containing several file names separated by blanks, not a list.

Of course, I could split that string on blanks to get a list, but I'm sure there is a better way to do it.

So what is the best way to loop through the results of a find command?

Answered by Kevin

TL;DR: If you're just here for the most correct answer, you probably want my personal preference, find . -name '*.txt' -exec process {} \; (see the bottom of this post). If you have time, read through the rest to see several different ways and the problems with most of them.

The full answer:

The best way depends on what you want to do, but here are a few options. As long as no file or folder in the subtree has whitespace in its name, you can just loop over the files:

for i in $x; do # Not recommended, will break on whitespace
    process "$i"
done

Marginally better, cut out the temporary variable x:

for i in $(find . -name \*.txt); do # Not recommended, will break on whitespace
    process "$i"
done

It is much better to glob when you can. This is whitespace-safe, for files in the current directory:

for i in *.txt; do # Whitespace-safe but not recursive.
    process "$i"
done

By enabling the globstar option, you can glob all matching files in this directory and all subdirectories:

# Make sure globstar is enabled
shopt -s globstar
for i in **/*.txt; do # Whitespace-safe and recursive
    process "$i"
done

In some cases, e.g. if the file names are already in a file, you may need to use read:

# IFS= makes sure it doesn't trim leading and trailing whitespace
# -r prevents interpretation of \ escapes.
while IFS= read -r line; do # Whitespace-safe EXCEPT newlines
    process "$line"
done < filename

read can be used safely in combination with find by setting the delimiter appropriately:

find . -name '*.txt' -print0 | 
    while IFS= read -r -d '' line; do 
        process "$line"
    done

For more complex searches, you will probably want to use find, either with its -exec option or with -print0 | xargs -0:

# execute `process` once for each file
find . -name \*.txt -exec process {} \;

# execute `process` once with all the files as arguments*:
find . -name \*.txt -exec process {} +

# using xargs*
find . -name \*.txt -print0 | xargs -0 process

# using xargs with arguments after each filename (implies one run per filename)
find . -name \*.txt -print0 | xargs -0 -I{} process {} argument

find can also cd into each file's directory before running a command by using -execdir instead of -exec, and can be made interactive (prompt before running the command for each file) using -ok instead of -exec (or -okdir instead of -execdir).

*: Technically, both find and xargs (by default) will run the command with as many arguments as they can fit on the command line, as many times as it takes to get through all the files. In practice, unless you have a very large number of files it won't matter, and if you exceed the length but need them all on the same command line, you will have to find a different way.
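The difference between one run per file and batched runs can be observed directly. The following is only a sketch under stated assumptions: the scratch directory and the three file names are made up for illustration, and sh -c 'echo "$#"' stands in for the hypothetical process command so that each run reports how many arguments it received.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory with three made-up file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/1.txt" "$demo_dir/2.txt" "$demo_dir/3.txt"

# -exec ... \; starts one process per file, so each run sees a single argument.
per_file=$(find "$demo_dir" -name '*.txt' -exec sh -c 'echo "$#"' sh {} \;)

# -exec ... + batches the names, so one run sees all three arguments.
batched=$(find "$demo_dir" -name '*.txt' -exec sh -c 'echo "$#"' sh {} +)

# xargs -n 2 caps each run at two arguments, forcing a second run.
capped=$(find "$demo_dir" -name '*.txt' -print0 | xargs -0 -n 2 sh -c 'echo "$#"' sh)

printf 'per file:\n%s\nbatched:\n%s\ncapped at 2:\n%s\n' "$per_file" "$batched" "$capped"
rm -rf "$demo_dir"
```

With only three files the batched form needs a single run; the -n 2 cap imitates what happens when the argument list is too long to fit on one command line.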

Answered by David W.

Whatever you do, don't use a for loop:

# Don't do this
for file in $(find . -name "*.txt")
do
    …code using "$file"
done

Three reasons:

  • For the for loop to even start, the find must run to completion.
  • If a file name has any whitespace (space, tab or newline) in it, it will be split and treated as two or more separate names.
  • Although now unlikely, you can overrun your command line buffer. Imagine if your command line buffer holds 32KB, and your for loop returns 40KB of text. That last 8KB will be dropped right off your for loop and you'll never know it.


Always use a while read construct:

find . -name "*.txt" -print0 | while read -d $'\0' file
do
    …code using "$file"
done

The loop executes while the find command is still running. Plus, this command works even if a file name contains whitespace. And you won't overflow your command line buffer.

The -print0 option uses NUL as the file separator instead of a newline, and -d $'\0' makes read use NUL as the delimiter while reading.
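To see the whitespace problem and the fix side by side, here is a small sketch. It assumes a scratch directory created with mktemp and two made-up file names, one of which contains a space; it also uses IFS= read -r -d '' (which in Bash behaves like -d $'\0') and process substitution so that the counter survives the loop.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory; the file names are made up for illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/two words.txt"

# Unquoted command substitution: the shell splits find's output on whitespace.
split_count=0
for f in $(find "$demo_dir" -name '*.txt'); do
    split_count=$((split_count + 1))
done

# NUL-delimited read: one iteration per file, regardless of whitespace.
safe_count=0
while IFS= read -r -d '' f; do
    safe_count=$((safe_count + 1))
done < <(find "$demo_dir" -name '*.txt' -print0)

echo "for loop iterations: $split_count, while read iterations: $safe_count"
rm -rf "$demo_dir"
```

The for loop iterates once more than there are files because "two words.txt" is split into two words, while the NUL-delimited loop iterates exactly once per file.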

Answered by 0xC0000022L

find . -name "*.txt"|while read fname; do
  echo "$fname"
done

Note: this method and the (second) method shown by bmargulies are safe to use with white space in the file/folder names.

To also cover the somewhat exotic case of newlines in file or folder names, you will have to resort to the -exec predicate of find, like this:

find . -name '*.txt' -exec echo "{}" \;

The {} is the placeholder for the found item, and the \; is used to terminate the -exec predicate.

And for the sake of completeness let me add another variant - you gotta love the *nix ways for their versatility:

find . -name '*.txt' -print0|xargs -0 -n 1 echo

This separates the printed items with a \0 character, which to my knowledge is not allowed in file or folder names on any file system, and therefore should cover all bases. xargs then picks them up one by one.

Answered by Michael Brux

Filenames can include spaces and even control characters. Spaces are (by default) delimiters for shell expansion in bash, so x=$(find . -name "*.txt") from the question is not recommended at all. If find returns a filename with spaces, e.g. "the file.txt", you will get 2 separate strings to process if you iterate over x in a loop. You can improve this by changing the delimiter (the bash IFS variable), e.g. to \r\n, but filenames can include control characters, so this is not a (completely) safe method.

From my point of view, there are 2 recommended (and safe) patterns for processing files:

1. Use for loop & filename expansion:

for file in ./*.txt; do
    [[ ! -e $file ]] && continue  # continue, if file does not exist
    # single filename is in $file
    echo "$file"
    # your code here
done

2. Use find-read-while & process substitution

while IFS= read -r -d '' file; do
    # single filename is in $file
    echo "$file"
    # your code here
done < <(find . -name "*.txt" -print0)

Remarks

on Pattern 1:

  1. bash returns the search pattern ("*.txt") itself if no matching file is found, so the extra line "continue, if file does not exist" is needed. See Bash Manual, Filename Expansion.
  2. The shell option nullglob can be used to avoid this extra line.
  3. "If the failglob shell option is set, and no matches are found, an error message is printed and the command is not executed." (from the Bash Manual above)
  4. Shell option globstar: "If set, the pattern '**' used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a '/', only directories and subdirectories match." See Bash Manual, Shopt Builtin.
  5. Other options for filename expansion: extglob, nocaseglob, dotglob, and the shell variable GLOBIGNORE.
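Remarks 1 and 2 can be checked with a short experiment. This is only a sketch: it assumes an empty scratch directory created with mktemp, so that the *.txt pattern matches nothing, and the array names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical empty scratch directory, so that *.txt matches nothing.
demo_dir=$(mktemp -d)

# Default behaviour: the unmatched pattern is passed through literally.
default_words=()
for f in "$demo_dir"/*.txt; do
    default_words+=("$f")
done

# With nullglob the unmatched pattern expands to nothing,
# so the loop body never runs and no existence check is needed.
shopt -s nullglob
nullglob_words=()
for f in "$demo_dir"/*.txt; do
    nullglob_words+=("$f")
done
shopt -u nullglob

echo "default: ${#default_words[@]} word(s): ${default_words[*]}"
echo "nullglob: ${#nullglob_words[@]} word(s)"
rm -rf "$demo_dir"
```

Without nullglob the loop runs once with the literal pattern as its value, which is exactly the case the [[ ! -e $file ]] check guards against.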

on Pattern 2:

  1. Filenames can contain blanks, tabs, spaces, newlines, ... To process filenames in a safe way, find with -print0 is used: each filename is printed with all control characters and terminated with NUL. See also Gnu Findutils Manpage, Unsafe File Name Handling, Safe File Name Handling, Unusual Characters in File Names. See David A. Wheeler below for a detailed discussion of this topic.

  2. There are some possible patterns to process find results in a while loop. Others (Kevin, David W.) have shown how to do this using pipes:

    files_found=1
    find . -name "*.txt" -print0 |
       while IFS= read -r -d '' file; do
           # single filename in $file
           echo "$file"
           files_found=0   # not working example
           # your code here
       done
    [[ $files_found -eq 0 ]] && echo "files found" || echo "no files found"

    When you try this piece of code, you will see that it does not work: files_found keeps its initial value, and the code always echoes "no files found". The reason is that each command of a pipeline is executed in a separate subshell, so the variable changed inside the loop (separate subshell) does not change the variable in the main shell script. This is why I recommend using process substitution as the "better", more useful, more general pattern.
    See "I set variables in a loop that's in a pipeline. Why do they disappear..." (from Greg's Bash FAQ) for a detailed discussion of this topic.
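The subshell effect described in remark 2 can be reproduced in isolation. The sketch below assumes a scratch directory with two made-up file names and Bash's default settings (in particular, the lastpipe option is not enabled); the counter names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory with two made-up file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b c.txt"

# Pipeline: the while loop runs in a subshell, so the increments are lost.
piped_count=0
find "$demo_dir" -name '*.txt' -print0 |
    while IFS= read -r -d '' file; do
        piped_count=$((piped_count + 1))
    done

# Process substitution: the loop runs in the current shell, so the count survives.
subst_count=0
while IFS= read -r -d '' file; do
    subst_count=$((subst_count + 1))
done < <(find "$demo_dir" -name '*.txt' -print0)

echo "pipeline: $piped_count, process substitution: $subst_count"
rm -rf "$demo_dir"
```

Both loops visit the same two files, but only the process-substitution variant leaves the updated counter visible after the loop ends.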

Answered by user569825

(Updated to include @Socowi's excellent speed improvement)

With any $SHELL that supports it (dash/zsh/bash...):

find . -name "*.txt" -exec $SHELL -c '
    for i in "$@" ; do
        echo "$i"
    done
' {} +

Done.


Original answer (shorter, but slower):

find . -name "*.txt" -exec $SHELL -c '
    echo "$0"
' {} \;

Answered by bmargulies

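# Doesn't handle whitespace
for x in `find . -name "*.txt" -print`; do
  process_one $x
done

or

# Handles whitespace and newlines
find . -name "*.txt" -print0 | xargs -0 -n 1 process_one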

Answered by Rakholiya Jenish

You can store your find output in an array if you wish to use the output later:

array=($(find . -name "*.txt"))

Now, to print each element on a new line, you can either use a for loop iterating over all the elements of the array, or you can use a printf statement.

for i in "${array[@]}"; do echo "$i"; done

or

printf '%s\n' "${array[@]}"


You can also use:

for file in "`find . -name "*.txt"`"; do echo "$file"; done

This prints each file name on a new line.

To only print the find output in list form, you can use either of the following:

find . -name "*.txt" -print 2>/dev/null

or

find . -name "*.txt" -print | grep -v 'Permission denied'

This suppresses error messages and outputs only the file names, one per line.

If you wish to do something with the file names, storing them in an array is good; otherwise there is no need to consume that space, and you can print the output of find directly.

Answered by Seppo Enarvi

If you can assume the file names don't contain newlines, you can read the output of find into a Bash array using the following command:

readarray -t x < <(find . -name '*.txt')

Note:

  • -t causes readarray to strip newlines.
  • It won't work if readarray is in a pipe, hence the process substitution.
  • readarray is available since Bash 4.

Bash 4.4 and up also supports the -d parameter for specifying the delimiter. Using the null character, instead of newline, to delimit the file names also works in the rare case that the file names contain newlines:

readarray -d '' x < <(find . -name '*.txt' -print0)

readarray can also be invoked as mapfile with the same options.
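The difference between the two delimiters shows up as soon as a file name contains a newline. The following is a sketch, assuming Bash 4.4 or newer, a scratch directory created with mktemp, and two made-up file names (one with an embedded newline); the array names lines and files are hypothetical.

```shell
#!/usr/bin/env bash
# Assumes Bash 4.4+ (needed for the -d option of readarray/mapfile).
demo_dir=$(mktemp -d)
touch "$demo_dir/one.txt" "$demo_dir/with"$'\n'"newline.txt"

# Newline-delimited: the embedded newline splits one name into two elements.
readarray -t lines < <(find "$demo_dir" -name '*.txt')

# NUL-delimited (here via the mapfile spelling): one element per file.
mapfile -d '' files < <(find "$demo_dir" -name '*.txt' -print0)

echo "newline-delimited elements: ${#lines[@]}"
echo "NUL-delimited elements: ${#files[@]}"
rm -rf "$demo_dir"
```

Only the NUL-delimited array has as many elements as there are files, which is why -print0 together with -d '' is the safer default when the file names are not under your control.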

Reference: https://mywiki.wooledge.org/BashFAQ/005#Loading_lines_from_a_file_or_stream

Answered by Paco

I like to use find by first assigning its output to a variable and switching IFS to newline, as follows:

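FilesFound=$(find . -name "*.txt")

IFSbkp="$IFS"
IFS=$'\n'
counter=1;
for file in $FilesFound; do
    echo "${counter}: ${file}"
    let counter++;
done
IFS="$IFSbkp"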

This is useful in case you would like to repeat more actions on the same set of data and find is very slow on your server (high I/O utilization).

Answered by Jahid

You can put the filenames returned by find into an array like this:

array=()
while IFS=  read -r -d ''; do
    array+=("$REPLY")
done < <(find . -name '*.txt' -print0)

Now you can just loop through the array to access individual items and do whatever you want with them.

Note: It's whitespace-safe.