Is there an "escape converter" for file and directory names available in bash?

Question

Asked by Richard T

The day came when I had to write a Bash script that walks arbitrary directory trees, looks at arbitrary files, and attempts to determine something by comparing them. I thought it would be a simple, couple-of-hours-tops process. Not so!

My hangup is that sometimes some idiot (ahem, excuse me, lovely user) chooses to put spaces in directory and file names. This causes my script to fail.

The perfect solution, aside from threatening the guillotine for those who insist on using spaces in such places (not to mention the guys who put this in operating systems' code!), might be a routine that "escapes" the file and directory names for us, kind of like how cygwin has routines to convert from unix to dos filename formats. Is there anything like this in a standard Unix / Linux distribution?

Note that the simple for file in * construct doesn't work so well when one is trying to compare directory trees, as it ONLY works on "the current directory", and, in this case as in many others, constantly cd-ing to various directory locations brings its own problems. So, in doing my homework, I found this question: Handle special characters in bash for...in loop. The solution proposed there hangs up on spaces in directory names, but that can simply be overcome like this:

dir="dirname with spaces"
ls -1 "$dir" | while read x; do
   echo $x
done

PLEASE NOTE: The above code isn't particularly wonderful because variables set inside the while loop are INACCESSIBLE outside that loop. This is because the loop runs in an implied subshell when the ls command's output is piped into it. This is a key motivating factor for my query!

...OK, the code above helps in many situations, but "escaping" the characters would be pretty powerful too. For example, dir above might contain:

dir\ with\ spaces

Does this already exist and I've just been overlooking it?

If not, does anyone have an easy proposal to write one - maybe with sed or lex? (I'm far from competent with either.)

Answer 1

Accepted answer by Paused until further notice.

Make a really nasty filename for testing:

mkdir escapetest
cd escapetest && touch "m'i;x&e\"d u(p\nmulti)\nlines'\nand\015ca&rr\015re;t"

[ Edit: Chances are that I intended that touch command to be:

touch $'m\'i;x&e\"d u(p\nmulti)\nlines\'\nand\015ca&rr\015re;t'

which puts more ugly characters in the filename. The output will look a little different. ]

Then run this:

find -print0 | while read -d '' -r line; do echo -en "--[${line}]--\t\t"; echo "$line" | sed -e ':t;N;s/\n/\\n/;bt' | sed 's/\([ \o47()"&;\\]\)/\\\1/g;s/\o15/\\r/g'; done

The output should look like this:

--[./m'i;x&e"d u(p
multi)
lines'
re;t]--         ./m\'i\;x\&e\"d\ u\(p\nmulti\)\nlines\'\nand\015ca\&rr\015re\;t

This consists of a condensed version of Pascal Thivent's sed monster, plus handling for carriage returns and newlines and maybe a bit more.

The first pass through sed merges multiple lines into one, delimited by "\n", for filenames that contain newlines. The second pass puts a backslash in front of every character that appears in a list of special characters. The last part replaces carriage returns with "\r".

One thing to note is that, as you know, while will handle spaces and for won't, but by sending the output of find with null termination and setting the delimiter of read to null, you can also handle newlines in filenames. The -r option causes read to accept backslashes without interpreting them.
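As a small illustration of the two read options just described, here is a minimal sketch. The variable names and sample strings are made up for the demonstration and are not part of the original answer.

```bash
#!/usr/bin/env bash
# Without -r, read treats a backslash as an escape character and drops it.
read    plain <<< 'back\slash'
read -r raw   <<< 'back\slash'
echo "$plain"   # the backslash has been removed
echo "$raw"     # the backslash is preserved

# With -d '', read stops at a NUL byte instead of at a newline, so a name
# that contains a newline arrives in one piece.
IFS= read -r -d '' name < <(printf 'two\nlines\0')
nl=$'\n'
echo "${name//$nl/<NL>}"   # show the embedded newline as <NL>
```

Setting IFS to the empty string for the read command also keeps leading and trailing whitespace in the name intact.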

Edit:

Another way to escape the special characters, this time without using sed, uses the quoting and variable-creating features of the Bash printf builtin (this also illustrates using process substitution rather than a pipe):

while read -d '' -r file; do echo "$file"; printf -v name "%q" "$file"; echo "$name"; done< <(find -print0)

The variable $name will be available outside the loop, since using process substitution prevents the creation of a subshell around the loop.
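To make the subshell point concrete, the following sketch runs the same loop twice, once fed by a pipe and once by process substitution, and compares what survives. The temporary directory, the file names, and the counter variables are assumptions made for the demo; it also assumes the lastpipe shell option is off, which is Bash's default.

```bash
#!/usr/bin/env bash
demo=$(mktemp -d)
touch "$demo/a b" "$demo/c;d"

# Pipe: the loop body runs in a subshell, so the increments are lost.
piped=0
find "$demo" -type f -print0 | while IFS= read -r -d '' file; do
    piped=$((piped + 1))
done

# Process substitution: the loop runs in the current shell.
substituted=0
while IFS= read -r -d '' file; do
    substituted=$((substituted + 1))
    printf -v name '%q' "$file"    # shell-quoted copy of the path
done < <(find "$demo" -type f -print0)

echo "pipe: $piped, process substitution: $substituted"
echo "last escaped name: $name"
rm -rf "$demo"
```

After the second loop, both the counter and the last %q-escaped name are still set in the calling shell, which is the property the answer relies on.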

Answer 2

Answer by Fritz G. Mehner

The following snippet handles all filenames (those including blanks, quotes, newlines, ...):

startdir="${1:-.}"                              # first parameter or working directory

#-------------------------------------------------------------------------------
#  IFS is set to the empty string, so read does not trim leading/trailing whitespace
#  read:
#  -r  do not allow backslashes to escape any characters
#  -d  delimiter is \0 (not a valid character in a filename)
#  done < <( find ... ) : redirection from a process substitution
#-------------------------------------------------------------------------------
while IFS=  read -r -d '' file; do
  echo "'$file'"
done < <( find "$startdir" -type f -print0 )
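Because the loop above runs in the current shell, it can also collect the names for later use. The sketch below is one possible way to do that. The function name collect_files and the array name files are hypothetical, and the temporary directory exists only for the demonstration.

```bash
#!/usr/bin/env bash
# collect_files DIR: fill the global array "files" with every regular file
# found under DIR. Function and array names are illustrative only.
collect_files() {
    local file
    files=()
    while IFS= read -r -d '' file; do
        files+=("$file")
    done < <(find "$1" -type f -print0)
}

demo=$(mktemp -d)
touch "$demo/plain" "$demo/with space" "$demo/"$'new\nline'
collect_files "$demo"
echo "found ${#files[@]} files"
rm -rf "$demo"
```

Each array element holds one complete path, so later code can expand "${files[@]}" without any escaping step.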

Answer 3

Answer by Gordon Davisson

There's a pretty serious problem with the escaping approach: what escapes are needed depends on the context the variable's going to be expanded in, and in the usual case there's no escaping that'll work. For instance, if you're going to do something simple like:

touch a "b c" d
files="a b\ c d"
ls $files

...it won't work (ls looks for 4 files: "a", "b\", "c", and "d") because the shell doesn't pay any attention to escapes when it word-splits $files. You could use eval ls $files, but that would fail on things like tabs in the filenames.

The while ... read ... done < <(find ... -print0) approach fgm suggested works solidly (and because of the flexibility of find's search patterns, is very powerful), but it's also a rather messy pile of workarounds for various possible problems; if you don't need find's power, it's not hard to get things done with for and *:

shopt -s nullglob    # In case of empty directories...
for filepath in "$dir"/*; do    # loop over all files in the specified directory
    filename="${filepath##*/}"    # You just wanted the files' names?  No problem.
    echo "$filename"
done

If (as you mention in the question) you're interested in comparing the two directory trees, looping through one of them isn't quite what you want; it'd be better to put their contents into arrays, like this:

shopt -s nullglob
pathlist1=("$dir1"/*)    # Get a list of paths of files in dir1
filelist1=("${pathlist1[@]##*/}")    # Parse off just the filenames
pathlist2=("$dir2"/*)    # Same for dir2
filelist2=("${pathlist2[@]##*/}")
# now compare filelist1 with filelist2...

(Note that AFAIK the "${pathlist2[@]##*/}" construct is not standard, but seems to have been supported in both bash and zsh for a while now.)
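The "now compare filelist1 with filelist2" step is left open in this answer. One possible way to fill it in, sketched here under the assumption that Bash 4 or later is available (for associative arrays), is to index one list and look up the other. The directories, file names, and the in_dir2 and missing variables are invented for the demo.

```bash
#!/usr/bin/env bash
shopt -s nullglob
dir1=$(mktemp -d); dir2=$(mktemp -d)
touch "$dir1/both" "$dir1/only in 1" "$dir2/both" "$dir2/only in 2"

pathlist1=("$dir1"/*); filelist1=("${pathlist1[@]##*/}")
pathlist2=("$dir2"/*); filelist2=("${pathlist2[@]##*/}")

# Index the names found in dir2 so each lookup is a single array access.
declare -A in_dir2=()
for name in "${filelist2[@]}"; do
    in_dir2["$name"]=1
done

# Collect the names that exist in dir1 but not in dir2.
missing=()
for name in "${filelist1[@]}"; do
    [[ -n ${in_dir2[$name]+set} ]] || missing+=("$name")
done

printf 'only in dir1: %q\n' "${missing[@]}"
rm -rf "$dir1" "$dir2"
```

Swapping the roles of the two lists gives the names that exist only in dir2. This only compares one directory level, as the glob in the answer does.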

Answer 4

Answer by Pascal Thivent

I found this, How to escape file names in bash shell scripts, while googling; I'm quoting it below:

After fighting with Bash for quite some time, I found out that the following code provides a nice basis for escaping special characters. Of course it is not complete, but the most important characters are filtered.
If anybody has a better solution, please let me know. It works and it is readable but not pretty.

FILE_ESCAPED=`echo "$FILE" | \
sed s/\ /\\\\\\\ /g | \
sed s/\'/\\\\\\\'/g | \
sed s/\&/\\\\\\\&/g | \
sed s/\;/\\\\\\\;/g | \
sed s/\(/\\\\\(/g | \
sed s/\)/\\\\\)/g `

Maybe you could use it as starting point.

Answer 5

Answer by Ignacio Vazquez-Abrams

#!/bin/bash

while read filename; do
  echo 'I am doing something with "'"$filename"'".'
done < <(find)

Do note that the <( ) notation won't work when bash is invoked as /bin/sh.

Answer 6

Answer by ennuikiller

The find command sometimes works in this situation, for example:

find . -exec ls {} \;
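The reason -exec sidesteps the escaping question is that find hands each path to the command as one argument, without going through shell word splitting. The sketch below checks this by counting the arguments a child sh receives. The temporary directory and file names are assumptions for the demo, and it uses the POSIX "{} +" form, which passes many paths per invocation, rather than the "\;" form shown in the answer.

```bash
#!/usr/bin/env bash
demo=$(mktemp -d)
touch "$demo/a b" "$demo/"$'new\nline'

# The inner sh prints how many arguments it was given; the second "sh"
# fills $0 so that the paths start at $1.
count=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} +)
echo "sh received $count arguments"
rm -rf "$demo"
```

Two files produce two arguments, even though one name contains a space and the other a newline.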
