Original question: http://stackoverflow.com/questions/9612090/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to loop through file names returned by find?
Asked by Haiyuan Zhang
x=$(find . -name "*.txt")
echo $x
If I run the above piece of code in a Bash shell, what I get is a string containing several file names separated by blanks, not a list.
Of course, I can further split them on blanks to get a list, but I'm sure there is a better way to do it.
So what is the best way to loop through the results of a find command?
Answer by Kevin
TL;DR: If you're just here for the most correct answer, you probably want my personal preference, find . -name '*.txt' -exec process {} \; (see the bottom of this post). If you have time, read through the rest to see several different ways and the problems with most of them.
The full answer:
The best way depends on what you want to do, but here are a few options. As long as no file or folder in the subtree has whitespace in its name, you can just loop over the files:
for i in $x; do # Not recommended, will break on whitespace
    process "$i"
done
Marginally better, cut out the temporary variable x:
for i in $(find . -name \*.txt); do # Not recommended, will break on whitespace
    process "$i"
done
It is much better to glob when you can. This is whitespace-safe, for files in the current directory:
for i in *.txt; do # Whitespace-safe but not recursive.
    process "$i"
done
By enabling the globstar option, you can glob all matching files in this directory and all subdirectories:
# Make sure globstar is enabled
shopt -s globstar
for i in **/*.txt; do # Whitespace-safe and recursive
    process "$i"
done
In some cases, e.g. if the file names are already in a file, you may need to use read:
# IFS= makes sure it doesn't trim leading and trailing whitespace
# -r prevents interpretation of \ escapes.
while IFS= read -r line; do # Whitespace-safe EXCEPT newlines
    process "$line"
done < filename
read can be used safely in combination with find by setting the delimiter appropriately:
find . -name '*.txt' -print0 |
    while IFS= read -r -d '' line; do
        process "$line"
    done
For more complex searches, you will probably want to use find, either with its -exec option or with -print0 | xargs -0:
# execute `process` once for each file
find . -name \*.txt -exec process {} \;
# execute `process` once with all the files as arguments*:
find . -name \*.txt -exec process {} +
# using xargs*
find . -name \*.txt -print0 | xargs -0 process
# using xargs with arguments after each filename (implies one run per filename)
find . -name \*.txt -print0 | xargs -0 -I{} process {} argument
find can also cd into each file's directory before running a command by using -execdir instead of -exec, and can be made interactive (prompt before running the command for each file) using -ok instead of -exec (or -okdir instead of -execdir).
*: Technically, both find and xargs (by default) will run the command with as many arguments as they can fit on the command line, as many times as it takes to get through all the files. In practice, unless you have a very large number of files it won't matter, and if you exceed the length but need them all on the same command line, you will have to find a different way.
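As a rough illustration of the footnote above, the following sketch counts how often the command runs with \; versus +. The temporary directory, the file names, and the inline sh -c 'echo run' stand-in for process are assumptions made only for this demonstration.

```shell
#!/bin/sh
# Sketch only: the directory layout and file names are made up for the demo.
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b c.txt" "$tmp/d.txt"

# One invocation per file: the stand-in command prints "run" once per file.
# tr strips the padding that some wc implementations add.
per_file=$(find "$tmp" -name '*.txt' -exec sh -c 'echo run' _ {} \; | wc -l | tr -d ' ')

# Batched: find hands as many names as fit to a single invocation.
batched=$(find "$tmp" -name '*.txt' -exec sh -c 'echo run' _ {} + | wc -l | tr -d ' ')

echo "per file: $per_file, batched: $batched"
rm -rf "$tmp"
```

With three short names this should report three runs for \; and a single run for +. The batched count only grows once the argument list no longer fits on one command line.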
Answer by David W.
Whatever you do, don't use a for loop:
# Don't do this
for file in $(find . -name "*.txt")
do
    …code using "$file"
done
Three reasons:
- For the for loop to even start, the find must run to completion.
- If a file name has any whitespace (including space, tab or newline) in it, it will be treated as two separate names.
- Although now unlikely, you can overrun your command line buffer. Imagine if your command line buffer holds 32KB, and your for loop returns 40KB of text. That last 8KB will be dropped right off your for loop and you'll never know it.
Always use a while read construct:
find . -name "*.txt" -print0 | while read -d $'\0' file
do
    …code using "$file"
done
The loop will execute while the find command is executing. Plus, this command will work even if a file name is returned with whitespace in it. And, you won't overflow your command line buffer.
The -print0 will use the NUL character as a file separator instead of a newline, and the -d $'\0' will use NUL as the separator while reading.
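To make the whitespace problem concrete, here is a small sketch that counts loop iterations for both constructs. The temporary directory and file names are invented for the demonstration. Unlike the pipeline above, the second loop reads from a process substitution, purely so the counter is still visible after the loop ends. It also adds IFS= and -r to read, as another answer on this page does.

```shell
#!/usr/bin/env bash
# Sketch only: assumes mktemp returns a path without whitespace.
tmp=$(mktemp -d)
touch "$tmp/the file.txt" "$tmp/plain.txt"

# Unquoted command substitution: the output is split on whitespace.
for_count=0
for f in $(find "$tmp" -name '*.txt'); do
    for_count=$((for_count + 1))
done

# NUL-delimited read: one iteration per file name.
read_count=0
while IFS= read -r -d '' f; do
    read_count=$((read_count + 1))
done < <(find "$tmp" -name '*.txt' -print0)

echo "for loop: $for_count iterations, while read: $read_count iterations"
rm -rf "$tmp"
```

Under that assumption the for loop sees three words for two files, while the while read loop sees exactly two names.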
Answer by 0xC0000022L
find . -name "*.txt"|while read fname; do
  echo "$fname"
done
Note: this method and the (second) method shown by bmargulies are safe to use with white space in the file/folder names.
In order to also have the - somewhat exotic - case of newlines in the file/folder names covered, you will have to resort to the -exec predicate of find like this:
find . -name '*.txt' -exec echo "{}" \;
The {} is the placeholder for the found item and the \; is used to terminate the -exec predicate.
And for the sake of completeness let me add another variant - you gotta love the *nix ways for their versatility:
find . -name '*.txt' -print0|xargs -0 -n 1 echo
This would separate the printed items with a \0 character that isn't allowed in file or folder names in any of the file systems, to my knowledge, and therefore should cover all bases. xargs picks them up one by one then ...
Answer by Michael Brux
Filenames can include spaces and even control characters. Spaces are (default) delimiters for shell expansion in bash and as a result of that x=$(find . -name "*.txt") from the question is not recommended at all. If find gets a filename with spaces e.g. "the file.txt" you will get 2 separate strings for processing, if you process x in a loop. You can improve this by changing the delimiter (bash IFS variable) e.g. to \r\n, but filenames can include control characters - so this is not a (completely) safe method.
From my point of view, there are 2 recommended (and safe) patterns for processing files:
1. Use for loop & filename expansion:
for file in ./*.txt; do
    [[ ! -e $file ]] && continue # continue, if file does not exist
    # single filename is in $file
    echo "$file"
    # your code here
done
2. Use find-read-while & process substitution:
while IFS= read -r -d '' file; do
    # single filename is in $file
    echo "$file"
    # your code here
done < <(find . -name "*.txt" -print0)
Remarks
on Pattern 1:
- bash returns the search pattern ("*.txt") if no matching file is found - so the extra line "continue, if file does not exist" is needed. see Bash Manual, Filename Expansion
- shell option nullglob can be used to avoid this extra line.
- "If the failglob shell option is set, and no matches are found, an error message is printed and the command is not executed." (from Bash Manual above)
- shell option globstar: "If set, the pattern '**' used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a '/', only directories and subdirectories match." see Bash Manual, Shopt Builtin
- other options for filename expansion: extglob, nocaseglob, dotglob & shell variable GLOBIGNORE
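The nullglob remark can be sketched as follows. An empty temporary directory stands in for the "no matching file" case. The directory and the counter names are assumptions for the demonstration only.

```shell
#!/usr/bin/env bash
# Sketch only: an empty temporary directory stands in for "no matches".
tmp=$(mktemp -d)

# Default behaviour: the unmatched pattern is passed through literally,
# so the loop body runs once with the pattern itself.
shopt -u nullglob
default_count=0
for file in "$tmp"/*.txt; do
    default_count=$((default_count + 1))
done

# With nullglob the unmatched pattern expands to nothing,
# so the loop body never runs and no existence check is needed.
shopt -s nullglob
nullglob_count=0
for file in "$tmp"/*.txt; do
    nullglob_count=$((nullglob_count + 1))
done
shopt -u nullglob

echo "default: $default_count, nullglob: $nullglob_count"
rm -rf "$tmp"
```

Switching nullglob back off afterwards keeps the change from leaking into later commands, where an empty expansion can be surprising.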
on Pattern 2:
- filenames can contain blanks, tabs, newlines, ... to process filenames in a safe way, find with -print0 is used: the filename is printed with all control characters & terminated with NUL. see also Gnu Findutils Manpage, Unsafe File Name Handling, safe File Name Handling, unusual characters in filenames. See David A. Wheeler below for detailed discussion of this topic.
- There are some possible patterns to process find results in a while loop. Others (kevin, David W.) have shown how to do this using pipes:
files_found=1
find . -name "*.txt" -print0 |
    while IFS= read -r -d '' file; do
        # single filename in $file
        echo "$file"
        files_found=0 # not working example
        # your code here
    done
[[ $files_found -eq 0 ]] && echo "files found" || echo "no files found"
When you try this piece of code, you will see that it does not work: files_found always stays 1 & the code will always echo "no files found". The reason is: each command of a pipeline is executed in a separate subshell, so the variable changed inside the loop (separate subshell) does not change the variable in the main shell script. This is why I recommend using process substitution as the "better", more useful, more general pattern.
See I set variables in a loop that's in a pipeline. Why do they disappear... (from Greg's Bash FAQ) for a detailed discussion on this topic.
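A minimal side-by-side sketch of the subshell issue, assuming bash with its default settings (in particular, the lastpipe option is off). The temporary directory, the file name, and the piped_found variable are introduced here only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the temporary directory and file name are made up for the demo.
tmp=$(mktemp -d)
touch "$tmp/some file.txt"

# Pipeline version: the loop runs in a subshell, so the assignment is lost.
piped_found=1
find "$tmp" -name '*.txt' -print0 |
    while IFS= read -r -d '' file; do
        piped_found=0
    done

# Process substitution: the loop runs in the current shell.
files_found=1
while IFS= read -r -d '' file; do
    files_found=0
done < <(find "$tmp" -name '*.txt' -print0)

echo "pipeline: $piped_found, process substitution: $files_found"
rm -rf "$tmp"
```

Only the second loop changes the flag in the calling shell, which is the behaviour the remark above relies on.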
Additional References & Sources:
Answer by user569825
(Updated to include @Socowi's excellent speed improvement)
With any $SHELL that supports it (dash/zsh/bash...):
# The _ fills $0 so that "$@" receives every file name
find . -name "*.txt" -exec $SHELL -c '
    for i in "$@" ; do
        echo "$i"
    done
' _ {} +
Done.
Original answer (shorter, but slower):
find . -name "*.txt" -exec $SHELL -c 'echo "$0"' {} \;
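When a shell is started with -c, the first argument after the script is assigned to $0 rather than to the positional parameters. That is why a one-file-per-run form can read the name from $0, while a batched form that loops over "$@" needs something in the $0 slot. The sketch below uses plain sh in place of $SHELL and made-up file names, and simply counts the positional parameters in each case.

```shell
#!/bin/sh
# Sketch only: sh stands in for $SHELL, and the file names are made up.
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/c.txt"

# Without a placeholder the first file name becomes $0 and is missing from "$@".
without_placeholder=$(find "$tmp" -name '*.txt' -exec sh -c 'echo "$#"' {} +)

# With a placeholder (here "_") in the $0 slot, "$@" holds every file name.
with_placeholder=$(find "$tmp" -name '*.txt' -exec sh -c 'echo "$#"' _ {} +)

echo "without: $without_placeholder, with: $with_placeholder"
rm -rf "$tmp"
```

For three files the first count should be 2 and the second 3, so a "$@" loop without the placeholder silently skips one file per batch.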
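Answer by bmargulies
# Doesn't handle whitespace
for x in `find . -name "*.txt" -print`; do
    process_one $x
done
or
# Handles whitespace and newlines
find . -name "*.txt" -print0 | xargs -0 -n 1 process_one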
Answer by Rakholiya Jenish
You can store your find output in an array if you wish to use the output later:
array=($(find . -name "*.txt"))
Now to print each element on a new line, you can either use a for loop iterating over all the elements of the array, or you can use a printf statement.
for i in "${array[@]}"; do echo "$i"; done
or
printf '%s\n' "${array[@]}"
You can also use:
for file in "`find . -name "*.txt"`"; do echo "$file"; done
This will print each filename on a new line.
To only print the find output in list form, you can use either of the following:
find . -name "*.txt" -print 2>/dev/null
or
find . -name "*.txt" -print 2>&1 | grep -v 'Permission denied'
This will remove error messages and only give the filenames as output, each on a new line.
If you wish to do something with the filenames, storing them in an array is good; otherwise there is no need to consume that space and you can directly print the output from find.
Answer by Seppo Enarvi
If you can assume the file names don't contain newlines, you can read the output of find into a Bash array using the following command:
readarray -t x < <(find . -name '*.txt')
Note:
- -t causes readarray to strip newlines.
- It won't work if readarray is in a pipe, hence the process substitution.
- readarray is available since Bash 4.
Bash 4.4 and up also supports the -d parameter for specifying the delimiter. Using the null character, instead of newline, to delimit the file names also works in the rare case that the file names contain newlines:
readarray -d '' x < <(find . -name '*.txt' -print0)
readarray can also be invoked as mapfile with the same options.
Reference: https://mywiki.wooledge.org/BashFAQ/005#Loading_lines_from_a_file_or_stream
Answer by Paco
I like to use find whose output is first assigned to a variable, with IFS switched to newline, as follows:
FilesFound=$(find . -name "*.txt")
IFSbkp="$IFS"
IFS=$'\n'
counter=1;
for file in $FilesFound; do
    echo "${counter}: ${file}"
    let counter++;
done
IFS="$IFSbkp"
This is useful in case you would like to repeat more actions on the same set of DATA and find is very slow on your server (high I/O utilization).
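A possible variant of the same idea, sketched here as an alternative rather than as the author's method: reading the stored variable line by line through a here-string. It avoids changing IFS for the rest of the script, and avoids glob expansion of the unquoted variable. It still assumes that no file name contains a newline. The temporary directory and file names are made up.

```shell
#!/usr/bin/env bash
# Sketch only: assumes no file name contains a newline; names are made up.
tmp=$(mktemp -d)
touch "$tmp/one.txt" "$tmp/two words.txt"

# Run find once and keep the result for repeated passes.
FilesFound=$(find "$tmp" -name '*.txt')

counter=0
while IFS= read -r file; do
    [ -n "$file" ] || continue   # skip the empty line produced when nothing matched
    counter=$((counter + 1))
    echo "${counter}: ${file}"
done <<< "$FilesFound"

rm -rf "$tmp"
```

Here IFS= applies only to the read command itself, so there is nothing to back up and restore.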
Answer by Jahid
You can put the filenames returned by find into an array like this:
array=()
while IFS= read -r -d ''; do
    array+=("$REPLY")
done < <(find . -name '*.txt' -print0)
Now you can just loop through the array to access individual items and do whatever you want with them.
Note: It's whitespace-safe.
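Looping over such an array stays whitespace-safe only if the expansion is quoted. A minimal sketch, with a made-up temporary directory and file names, and with seen as a hypothetical counter:

```shell
#!/usr/bin/env bash
# Sketch only: the temporary directory and file names are made up for the demo.
tmp=$(mktemp -d)
touch "$tmp/first file.txt" "$tmp/second.txt"

# Fill the array from NUL-delimited find output.
array=()
while IFS= read -r -d ''; do
    array+=("$REPLY")
done < <(find "$tmp" -name '*.txt' -print0)

# Quoting "${array[@]}" keeps each element intact, spaces included.
seen=0
for file in "${array[@]}"; do
    seen=$((seen + 1))
    printf 'item %d: %s\n' "$seen" "$file"
done

echo "array holds ${#array[@]} names"
rm -rf "$tmp"
```

Writing ${array[@]} without the quotes would split "first file.txt" into two words again, undoing the care taken when filling the array.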