Notice: this page reproduces a popular StackOverflow question and its answers (originally published as a Chinese/English side-by-side translation), provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/15783701/

Date: 2020-09-09 23:32:22  Source: igfitidea

Which characters need to be escaped when using Bash?

Tags: bash, shell, unix, escaping, special-characters

Asked by fedorqui 'SO stop harming'

Is there any comprehensive list of characters that need to be escaped in Bash? Can it be checked just with sed?

In particular, I was checking whether % needs to be escaped or not. I tried

echo "h%h" | sed 's/%/i/g'

and it worked fine, without escaping %. Does this mean % does not need to be escaped? Was this a good way to check whether escaping is necessary?

And more generally: are the characters that need escaping the same in shell and bash?

Answered by Jo So

There are two easy and safe rules which work not only in sh but also in bash.

1. Put the whole string in single quotes

This works for all chars except single quote itself. To escape the single quote, close the quoting before it, insert the single quote, and re-open the quoting.

'I'\''m a s@fe $tring which ends in newline
'

sed command: sed -e "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
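As a rough illustration of rule 1 in pure bash rather than sed, the sketch below wraps a string in single quotes and rewrites each embedded single quote as '\''. The function name sq_quote is hypothetical (it is not a bash builtin), and the eval round trip is only there to check the result; treat it as a sketch under those assumptions, not a hardened implementation.

```shell
#!/usr/bin/env bash
# sq_quote is a hypothetical helper name, not part of bash.
sq_quote() {
    local s=$1
    local sq="'"             # a single quote
    local repl="'\\''"       # the four characters '\''
    # Replace every ' with '\'' and wrap the result in single quotes.
    printf "'%s'" "${s//$sq/$repl}"
}

original=$'I\'m a s@fe $tring which ends in newline\n'
quoted=$(sq_quote "$original")

# Round trip: let the shell parse the quoted form again.
eval "copy=$quoted"
if [[ $copy == "$original" ]]; then
    echo "round trip OK"
fi
```

Keeping the quote and its replacement in variables avoids having to nest quotes inside the ${s//...} expansion, which is easy to get wrong.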

2. Escape every char with a backslash

This works for all characters except newline. For newline characters, use single or double quotes. Empty strings must still be handled: replace them with "".

\I\'\m\ \a\ \s\@\f\e\ \$\t\r\i\n\g\ \w\h\i\c\h\ \e\n\d\s\ \i\n\ \n\e\w\l\i\n\e"
"

sed command: sed -e 's/./\\&/g; 1{$s/^$/""/}; 1!s/^/"/; $!s/$/"/'.

2b. More readable version of 2

There's an easy safe set of characters, like [a-zA-Z0-9,._+:@%/-], which can be left unescaped to keep it more readable.

I\'m\ a\ s@fe\ \$tring\ which\ ends\ in\ newline"
"

sed command: LC_ALL=C sed -e 's/[^a-zA-Z0-9,._+@%/-]/\\&/g; 1{$s/^$/""/}; 1!s/^/"/; $!s/$/"/'.
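The same idea as rule 2b can be sketched in pure bash. The function name bs_escape is hypothetical, and the sketch assumes plain ASCII input: characters in the safe set are copied as-is, a newline is emitted inside double quotes, and everything else gets a leading backslash.

```shell
#!/usr/bin/env bash
# bs_escape is a hypothetical helper name, not part of bash.
bs_escape() {
    local s=$1 out='' c i
    # An empty string has nothing to escape, so emit "" instead.
    if [[ -z $s ]]; then
        printf '""'
        return
    fi
    for ((i = 0; i < ${#s}; i++)); do
        c=${s:i:1}
        case $c in
            $'\n') out+='"'$'\n''"' ;;            # newline goes inside double quotes
            [a-zA-Z0-9,._+:@%/-]) out+=$c ;;      # safe set: leave unescaped
            *) out+="\\$c" ;;                     # anything else: prefix a backslash
        esac
    done
    printf '%s' "$out"
}

original=$'I\'m a s@fe $tring which ends in newline\n'
escaped=$(bs_escape "$original")

# Round trip: let the shell parse the escaped form again.
eval "copy=$escaped"
if [[ $copy == "$original" ]]; then
    echo "round trip OK"
fi
```

For the first line of the sample string, bs_escape produces I\'m\ a\ s@fe\ \$tring\ which\ ends\ in\ newline followed by the double-quoted newline, matching the hand-written example above.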

Note that in a sed program, one can't know whether the last line of input ends with a newline byte (except when it's empty). That's why both above sed commands assume it does not. You can add a quoted newline manually.

Note that shell variables are only defined for text in the POSIX sense. Processing binary data is not defined. For the implementations that matter, binary works with the exception of NUL bytes (because variables are implemented with C strings, and meant to be used as C strings, namely program arguments), but you should switch to a "binary" locale such as latin1.
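A small experiment can make the NUL limitation visible. This is only a sketch for bash: it sends the three bytes a, NUL, b through a pipe and then through command substitution, and compares the counts. The variable names are arbitrary.

```shell
#!/usr/bin/env bash
# A pipe is a byte stream, so all three bytes (a, NUL, b) get through.
bytes_in_pipe=$(( $(printf 'a\0b' | wc -c) ))

# A shell variable cannot hold NUL. Bash drops the byte during command
# substitution (newer versions also print a warning, silenced here).
{ var=$(printf 'a\0b'); } 2>/dev/null

printf 'pipe: %d bytes, variable: %d characters\n' "$bytes_in_pipe" "${#var}"
```

In bash the variable ends up holding just ab, which is why data containing NUL bytes has to stay in files or pipes rather than in variables.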

(You can easily validate the rules by reading the POSIX spec for sh. For bash, check the reference manual linked by @AustinPhillips)

Answered by F. Hauri

format that can be reused as shell input

There is a special printf format directive (%q) built for this kind of request:

printf [-v var] format [arguments]

 %q     causes printf to output the corresponding argument
        in a format that can be reused as shell input.

Some samples:

read foo
Hello world
printf "%q\n" "$foo"
Hello\ world

printf "%q\n" $'Hello world!\n'
$'Hello world!\n'

This could be used through variables too:

printf -v var "%q" "$foo
"
echo "$var"
$'Hello world\n'
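To check that the %q output really can be reused as shell input, one option is a round trip through eval. The sketch below assumes bash (the $'...' form that %q may emit is not understood by plain POSIX sh), and the variable names are arbitrary.

```shell
#!/usr/bin/env bash
original=$'two  spaces, a $dollar, a \'quote\' and a newline\n'

# Quote the value so that the shell can parse it back unchanged.
printf -v quoted '%q' "$original"

# Feed the quoted text back to the shell as the right-hand side of an assignment.
eval "copy=$quoted"

if [[ $copy == "$original" ]]; then
    echo "round trip OK"
else
    echo "round trip FAILED"
fi
```

The eval is safe here only because its argument was produced by %q; passing unquoted user input to eval would execute whatever it contains.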

Quick check with all 128 ASCII bytes:

Note that all bytes from 128 to 255 have to be escaped.

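for i in {0..127}; do
    printf -v var \\%o "$i"       # build an octal escape such as \101
    printf -v var "$var"          # expand it into the actual byte
    printf -v res "%q" "$var"     # quote the byte for reuse as shell input
    esc=E
    [ "$var" = "$res" ] && esc=-  # '-' means %q left the byte unchanged
    printf "%02X %s %-7s\n" "$i" "$esc" "$res"
done |
    column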

This should render something like:

00 E ''         1A E $'\032'    34 - 4          4E - N          68 - h
01 E $'\001'    1B E $'\E'      35 - 5          4F - O          69 - i
02 E $'\002'    1C E $'\034'    36 - 6          50 - P          6A - j
03 E $'\003'    1D E $'\035'    37 - 7          51 - Q          6B - k
04 E $'\004'    1E E $'\036'    38 - 8          52 - R          6C - l
05 E $'\005'    1F E $'\037'    39 - 9          53 - S          6D - m
06 E $'\006'    20 E \          3A - :          54 - T          6E - n
07 E $'\a'      21 E \!         3B E \;         55 - U          6F - o
08 E $'\b'      22 E \"         3C E \<         56 - V          70 - p
09 E $'\t'      23 E \#         3D - =          57 - W          71 - q
0A E $'\n'      24 E \$         3E E \>         58 - X          72 - r
0B E $'\v'      25 - %          3F E \?         59 - Y          73 - s
0C E $'\f'      26 E \&         40 - @          5A - Z          74 - t
0D E $'\r'      27 E \'         41 - A          5B E \[         75 - u
0E E $'\016'    28 E \(         42 - B          5C E \\         76 - v
0F E $'\017'    29 E \)         43 - C          5D E \]         77 - w
10 E $'\020'    2A E \*         44 - D          5E E \^         78 - x
11 E $'\021'    2B - +          45 - E          5F - _          79 - y
12 E $'\022'    2C E \,         46 - F          60 E \`         7A - z
13 E $'\023'    2D - -          47 - G          61 - a          7B E \{
14 E $'\024'    2E - .          48 - H          62 - b          7C E \|
15 E $'\025'    2F - /          49 - I          63 - c          7D E \}
16 E $'\026'    30 - 0          4A - J          64 - d          7E E \~
17 E $'\027'    31 - 1          4B - K          65 - e          7F E $'\177'
18 E $'\030'    32 - 2          4C - L          66 - f
19 E $'\031'    33 - 3          4D - M          67 - g

Where the first field is the hex value of the byte, the second contains E if the character needs to be escaped, and the third shows the escaped representation of the character.

Why ,?

You can see some characters that don't always need to be escaped, like ,, } and {.

So not always, but sometimes:

echo test 1, 2, 3 and 4,5.
test 1, 2, 3 and 4,5.

or

echo test { 1, 2, 3 }
test { 1, 2, 3 }

but take care:

echo test{1,2,3}
test1 test2 test3

echo test\ {1,2,3}
test 1 test 2 test 3

echo test\ {\ 1,\ 2,\ 3\ }
test  1 test  2 test  3

echo test\ {\ 1\,\ 2,\ 3\ }
test  1, 2 test  3

Answered by Matthew

To save someone else from having to RTFM... in bash:

Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of $, `, \, and, when history expansion is enabled, !.

...so if you escape those (and the quote itself, of course) you're probably okay.

If you take a more conservative 'when in doubt, escape it' approach, you can avoid accidentally creating characters with special meaning by not escaping identifier characters (i.e. ASCII letters, numbers, or '_'). It's very unlikely that these would ever (e.g. in some weird POSIX-ish shell) have special meaning and thus need to be escaped.

Answered by codeforester

Using the printf '%q' technique, we can run a loop to find out which characters are special:

#!/bin/bash
special=$'`!@#$%^&*()-_+={}|[]\;\':",.<>?/ '
for ((i=0; i < ${#special}; i++)); do
    char="${special:i:1}"
    printf -v q_char '%q' "$char"
    if [[ "$char" != "$q_char" ]]; then
        printf 'Yes, character %s needs to be escaped\n' "$char"
    else
        printf 'No, character %s does not need to be escaped\n' "$char"
    fi
done | sort

It gives this output:

No, character % does not need to be escaped
No, character + does not need to be escaped
No, character - does not need to be escaped
No, character . does not need to be escaped
No, character / does not need to be escaped
No, character : does not need to be escaped
No, character = does not need to be escaped
No, character @ does not need to be escaped
No, character _ does not need to be escaped
Yes, character   needs to be escaped
Yes, character ! needs to be escaped
Yes, character " needs to be escaped
Yes, character # needs to be escaped
Yes, character $ needs to be escaped
Yes, character & needs to be escaped
Yes, character ' needs to be escaped
Yes, character ( needs to be escaped
Yes, character ) needs to be escaped
Yes, character * needs to be escaped
Yes, character , needs to be escaped
Yes, character ; needs to be escaped
Yes, character < needs to be escaped
Yes, character > needs to be escaped
Yes, character ? needs to be escaped
Yes, character [ needs to be escaped
Yes, character \ needs to be escaped
Yes, character ] needs to be escaped
Yes, character ^ needs to be escaped
Yes, character ` needs to be escaped
Yes, character { needs to be escaped
Yes, character | needs to be escaped
Yes, character } needs to be escaped

Some of the results, like , look a little suspicious. It would be interesting to get @CharlesDuffy's input on this.

Answered by cdarke

The characters that need escaping differ between Bourne or POSIX shell and Bash. Generally (very generally), Bash is a superset of those shells, so anything you escape in shell should be escaped in Bash.

A nice general rule would be "if in doubt, escape it". But escaping some characters gives them a special meaning, like \n. These are listed in the man bash pages under Quoting and echo.

Other than that, escape any character that is not alphanumeric; it is safer. I don't know of a single definitive list.

The man pages list them all somewhere, but not in one place. Learn the language, that is the way to be sure.

One that has caught me out is !. This is a special character (history expansion) in Bash (and csh) but not in Korn shell. Even echo "Hello world!" gives problems. Using single quotes, as usual, removes the special meaning.

Answered by Austin Phillips

I presume that you're talking about bash strings. There are different types of strings, which have different requirements for escaping. E.g., single-quoted strings are different from double-quoted strings.

The best reference is the Quoting section of the bash manual.

It explains which characters need escaping. Note that some characters may need escaping depending on which options are enabled, such as history expansion.

Answered by yuri

I noticed that bash automatically escapes some characters when using auto-complete.

For example, if you have a directory named dir:A, bash will auto-complete to dir\:A

Using this, I ran some experiments using characters of the ASCII table and derived the following lists:

Characters that bash escapes on auto-complete (includes space):

 !"$&'()*,:;<=>?@[\]^`{|}

Characters that bash does not escape:

#%+-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~

(I excluded /, as it cannot be used in directory names)
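The auto-complete list can be cross-checked against the printf %q technique from an earlier answer. The sketch below hard-codes the characters reported as escaped on auto-complete and collects those that %q leaves unchanged; the variable names are arbitrary, and the exact result may vary between bash versions.

```shell
#!/usr/bin/env bash
# Characters reported as escaped on auto-complete (the first one is a space).
completion_escaped=' !"$&'\''()*,:;<=>?@[\]^`{|}'

only_completion=''
for ((i = 0; i < ${#completion_escaped}; i++)); do
    c=${completion_escaped:i:1}
    printf -v q '%q' "$c"
    # If %q returns the character unchanged, only completion escapes it.
    if [[ $q == "$c" ]]; then
        only_completion+=$c
    fi
done

printf 'escaped on auto-complete but left alone by %%q: %s\n' "$only_completion"
```

Any characters printed here are ones that completion escapes although %q does not consider it necessary, which suggests the two mechanisms use different character sets.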
