Bash and filenames with spaces

Question

Asked by Jim Garrison

The following is a simple Bash command line:

grep -li 'regex' "filename with spaces" "filename"

No problems. Also the following works just fine:

grep -li 'regex' $(<listOfFiles.txt)

where listOfFiles.txt contains a list of filenames to be grepped, one filename per line.

The problem occurs when listOfFiles.txt contains filenames with embedded spaces. In all cases I've tried (see below), Bash splits the filenames at the spaces, so a line in listOfFiles.txt containing a name like ./this is a file.xml ends up running grep on each piece (./this, is, a and file.xml).

I thought I was a relatively advanced Bash user, but I cannot find a simple magic incantation to get this to work. Here are the things I've tried.

grep -li 'regex' `cat listOfFiles.txt`

Fails as described above (I didn't really expect this to work), so I thought I'd put quotes around each filename:

grep -li 'regex' `sed -e 's/.*/"&"/' listOfFiles.txt`

Bash interprets the quotes as part of the filename and gives "No such file or directory" for each file (and still splits the filenames at the blanks).
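The failure above can be reproduced without grep. The following is a minimal sketch, assuming a scratch list file created with mktemp (the file and variable names are made up for illustration). It shows that quote characters produced by an expansion are never re-parsed as shell syntax: the result is still split at the spaces, and the quotes stay in the words as literal text.

```shell
#!/usr/bin/env bash
# Hypothetical list holding one name with embedded spaces.
list=$(mktemp)
printf '%s\n' './this is a file.xml' > "$list"

# Same transformation as the sed attempt: wrap the line in double quotes.
# The unquoted command substitution is then split at the spaces, and the
# quote characters remain ordinary characters inside the resulting words.
words=$(printf '<%s>' $(sed -e 's/.*/"&"/' "$list"))
echo "$words"

rm -f "$list"
```

Each pair of angle brackets marks one argument as the command would receive it, which is why grep reports a missing file for every piece.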

for i in $(<listOfFiles.txt); do grep -li 'regex' "$i"; done

This fails in the same way as the original attempt (that is, it behaves as if the quotes are ignored) and is very slow, since it has to launch one grep process per file instead of processing all files in one invocation.

The following works, but requires some careful double-escaping if the regular expression contains shell metacharacters:

eval grep -li 'regex' `sed -e 's/.*/"&"/' listOfFiles.txt`

Is this the only way to construct the command line so it will correctly handle filenames with spaces?

Answer 1

Answered by Stephan202

Try this:

(IFS=$'\n'; grep -li 'regex' $(<listOfFiles.txt))

IFS is the Internal Field Separator. Setting it to $'\n' tells Bash to split the expansion on newline characters only, so each line becomes one filename. Its default value is $' \t\n' and can be printed using cat -etv <<<"$IFS".

Enclosing the commands in parentheses starts a subshell, so that only commands within the parentheses are affected by the custom IFS value.
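A small self-contained check of this behaviour might look like the sketch below. The scratch directory, file names and the search word needle are assumptions made for the illustration. The set -f call is an addition that is not part of the answer: it turns off pathname expansion, which would otherwise still apply to names containing * or ?.

```shell
#!/usr/bin/env bash
# Scratch directory and file names are made up for this illustration.
dir=$(mktemp -d)
printf 'needle\n'  > "$dir/this is a file.xml"
printf 'nothing\n' > "$dir/plain.xml"
printf '%s\n' "$dir/this is a file.xml" "$dir/plain.xml" > "$dir/listOfFiles.txt"

# The command substitution is itself a subshell: inside it, split only on
# newlines and disable globbing, then run grep once over all listed files.
matches=$(IFS=$'\n'; set -f; grep -li 'needle' $(<"$dir/listOfFiles.txt"))
echo "$matches"

# The parent shell's IFS is unchanged; print it in a readable form.
printf '%q\n' "$IFS"

rm -rf "$dir"
```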

Answer 2

Answered by Michael Potter

cat listOfFiles.txt | tr '\n' '\0' | xargs -0 grep -li 'regex'

The -0 option tells xargs to use a null character rather than whitespace as the filename terminator. The tr command converts the incoming newlines to null characters.
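As a rough illustration of the pipeline, the sketch below builds three scratch files (the names and the search word needle are invented for the example) and runs them through tr and xargs -0. The list is read with input redirection instead of cat, which is only a stylistic variation.

```shell
#!/usr/bin/env bash
# Scratch files; two contain the search word, one does not.
dir=$(mktemp -d)
printf 'needle\n' > "$dir/this is a file.xml"
printf 'needle\n' > "$dir/plain.xml"
printf 'other\n'  > "$dir/no match.xml"
printf '%s\n' "$dir/this is a file.xml" "$dir/plain.xml" "$dir/no match.xml" \
    > "$dir/listOfFiles.txt"

# Newlines become NUL bytes, so xargs -0 passes each whole line to grep
# as a single argument, spaces included.
matches=$(tr '\n' '\0' < "$dir/listOfFiles.txt" | xargs -0 grep -li 'needle' | sort)
echo "$matches"

rm -rf "$dir"
```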

This meets the OP's requirement that grep not be invoked multiple times. It has been my experience that for a large number of files avoiding the multiple invocations of grep improves performance considerably.

This scheme also avoids a bug in the OP's original method: his scheme breaks when listOfFiles.txt contains so many files that the resulting command line exceeds the maximum command length. xargs knows about the maximum command size and will invoke grep multiple times to avoid that problem.

A related problem with using xargs and grep is that grep prefixes each output line with the filename only when invoked with multiple files. Because xargs usually invokes grep with multiple files, the output normally carries the filename prefix, but not when listOfFiles.txt contains a single file, or when xargs splits the work into several invocations and the last one receives a single filename. To achieve consistent output, add /dev/null to the grep command:

cat listOfFiles.txt | tr '\n' '\0' | xargs -0 grep -i 'regex' /dev/null

Note that this was not an issue for the OP because he was using the -l option on grep; however, it is likely to be an issue for others.
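The effect of the extra /dev/null argument is easiest to see with a list that holds a single file. The following sketch uses a made-up scratch file and search word; it runs the same pipeline with and without /dev/null and keeps both outputs for comparison.

```shell
#!/usr/bin/env bash
# A list with exactly one entry, so grep receives a single file name.
dir=$(mktemp -d)
printf 'needle one\n' > "$dir/a b.txt"
printf '%s\n' "$dir/a b.txt" > "$dir/listOfFiles.txt"

# One file per invocation: grep prints the matching line without a prefix.
bare=$(tr '\n' '\0' < "$dir/listOfFiles.txt" | xargs -0 grep -i 'needle')

# /dev/null counts as a second file, so grep always prints the prefix.
prefixed=$(tr '\n' '\0' < "$dir/listOfFiles.txt" | xargs -0 grep -i 'needle' /dev/null)

echo "without /dev/null: $bare"
echo "with /dev/null:    $prefixed"

rm -rf "$dir"
```

GNU and BSD grep also accept -H to force the prefix, but that option is not required by POSIX, so /dev/null is the more portable choice.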

Answer 3

Answered by Paused until further notice.

This works:

while read file; do grep -li dtw "$file"; done < listOfFiles.txt

Answer 4

Answered by Chris Thiessen

Though it may overmatch, this is my favorite solution:

grep -i 'regex' $(cat listOfFiles.txt | sed -e "s/ /?/g")

Answer 5

Answered by sdaau

Do note that if you somehow ended up with a list in a file that has Windows line endings (\r\n), NONE of the notes above about the internal field separator $IFS (and quoting the argument) will work; so make sure that the line endings are correctly \n (I use scite to show the line endings, and to easily change them from one to the other).
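If an editor is not at hand, the carriage returns can also be removed on the fly. The sketch below is one possible way to do that with tr -d '\r'; the scratch file names and the search word are assumptions for the illustration. It runs the newline-IFS approach once on a CRLF list as is, and once with the carriage returns stripped.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
printf 'needle\n' > "$dir/a b.txt"
# A list saved with Windows (CRLF) line endings.
printf '%s\r\n' "$dir/a b.txt" > "$dir/listOfFiles.txt"

# Without cleanup the name ends in a carriage return, so grep cannot open it.
broken=$(IFS=$'\n'; set -f; grep -li 'needle' $(<"$dir/listOfFiles.txt") 2>/dev/null) || true

# Deleting the carriage returns first restores the real file name.
fixed=$(IFS=$'\n'; set -f; grep -li 'needle' $(tr -d '\r' < "$dir/listOfFiles.txt"))

echo "CRLF list:     '${broken}'"
echo "stripped list: '${fixed}'"

rm -rf "$dir"
```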

Also, cat piped into while read file ... seems to work (apparently without needing to set separators):

cat <(echo -e "AA AA\nBB BB") | while read file; do echo "$file"; done

... although for me it was more relevant for a "grep" through a directory with spaces in filenames:

grep -rlI 'search' "My Dir"/ | while read file; do echo "$file"; grep 'search\|else' "$file"; done

Answer 6

Answered by kitekat75

With Bash 4, you can also use the builtin mapfile function to set an array containing each line and iterate on this array:

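$ tree
.
├── a
│   ├── a 1
│   └── a 2
├── b
│   ├── b 1
│   └── b 2
└── c
    ├── c 1
    └── c 2

3 directories, 6 files
$ mapfile -t files < <(find -type f)
$ for file in "${files[@]}"; do
> echo "file: $file"
> done
file: ./a/a 2
file: ./a/a 1
file: ./b/b 2
file: ./b/b 1
file: ./c/c 2
file: ./c/c 1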
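To tie mapfile back to the original question, the array can be handed to grep directly, so that grep still runs only once. This is a sketch under stated assumptions rather than part of the answer: the scratch files, the search word needle and the listOfFiles.txt location are invented for the example, and mapfile requires Bash 4 or later.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
printf 'needle\n' > "$dir/a 1"
printf 'other\n'  > "$dir/a 2"
printf '%s\n' "$dir/a 1" "$dir/a 2" > "$dir/listOfFiles.txt"

# Read the list into an array, one element per line (-t drops the newline).
mapfile -t files < "$dir/listOfFiles.txt"
count=${#files[@]}

# "${files[@]}" expands to one word per element, so a single grep
# invocation sees every name intact; -- guards names starting with a dash.
matches=$(grep -li 'needle' -- "${files[@]}")

echo "read $count names, matched: $matches"

rm -rf "$dir"
```

Unlike the IFS approach, this does not depend on word splitting at all, so there is no need to change IFS or disable globbing.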
