Escaping question mark character in sed bash script variable

Source: Stack Overflow, http://stackoverflow.com/questions/24599788/. Content is licensed under CC BY-SA 4.0; attribute it to the original authors.

Tags: regex, bash, sed

Asked by user3553107

I have a set of saved HTML files containing links of the form http://mywebsite.com/showfile.cgi?key=somenumber, but I want to kill the question mark. (Side story: Firefox hates ? and randomly converts it to %3F. I'm sure there's some magic fix, but that's for another question...)

However, I think my code is causing the question mark character not to be read/saved/handled properly when bash stores the options in a variable.

# Doesn't work (no pattern matched)
SED_OPTIONS='-i s/\.cgi\?key/\.cgikey/g'

# Works e.g. http://mywebsite.com/showfileblah?key=somenumber
SED_OPTIONS='-i s/\.cgi/blah/g'

# Leaves question mark in e.g. http://mywebsite.com/showfile.blah?key=somenumber
SED_OPTIONS='-i s/cgi\?/blah/g'

# Actual sed command run when using SED_OPTIONS (I define FILES earlier in
# the code)
sed $SED_OPTIONS $FILES

# Not using the SED_OPTIONS variable works
# e.g. http://mywebsite.com/showfile.cgikey=somenumber
sed -i s/\.cgi\?key/\.cgikey/g $FILES

How can I get the full command to work using the SED_OPTIONS variable?

Answered by mklement0

The safest way to store a list of options and arguments in variables is to use an array:

Also:

  • You're using a basic regular expression (no -r or -E option), so ? is not a special character and needs no escaping.
  • In the replacement string, which is not a regex, do not escape the . character.
  • No need for option g, since you're only replacing 1 occurrence per line.

# Create array with individual options/arguments.
SED_ARGS=( '-i' 's/\.cgi?key/.cgikey/' )

# Invoke `sed` with array - note the double-quoting.
sed "${SED_ARGS[@]}" $FILES

Similarly, it would be safer to use an array for the list of input files. $FILES will only work if the individual filenames contain no embedded whitespace or other elements subject to shell expansions.
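
As a rough sketch of that suggestion (not part of the original answer), the file list can be held in an array as well. The scratch directory and the file names below are made up for illustration, and -i is left out so the files are only read and the result goes to standard output:

```bash
#!/usr/bin/env bash
# Hypothetical scratch directory and file names, used only for illustration.
workdir=$(mktemp -d)
printf '%s\n' 'http://mywebsite.com/showfile.cgi?key=1' > "$workdir/page one.html"
printf '%s\n' 'http://mywebsite.com/showfile.cgi?key=2' > "$workdir/page two.html"

# One array element per file, even though the names contain spaces.
FILES=( "$workdir"/*.html )
SED_ARGS=( -e 's/\.cgi?key/.cgikey/' )   # no -i here, so the files stay untouched

# Both arrays are expanded inside double quotes.
result=$(sed "${SED_ARGS[@]}" "${FILES[@]}")
printf '%s\n' "$result"

rm -r "$workdir"
```

With a plain $FILES string, each of these names would have been split into two words at the space.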

Generally:

  • Single-quote string literals, such as the sed script here, to prevent the shell from interpreting them.
  • Double-quote variable references, to prevent the shell from performing additional operations on them, such as pathname expansion (globbing) and word splitting (splitting into multiple tokens by whitespace).
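
The effect of the second rule can be seen by counting the arguments the shell produces. This is only an illustrative sketch; the variable arg and its value are invented for the demonstration, and globbing is switched off temporarily so that the count reflects word splitting alone:

```bash
#!/usr/bin/env bash
# Hypothetical value containing whitespace and a glob character.
arg='s/a b/*/'

# Unquoted: the shell splits the value on whitespace.
set -f                 # disable globbing so only word splitting is measured
set -- $arg
unquoted_count=$#
set +f

# Double-quoted: the value is passed through as one unmodified argument.
set -- "$arg"
quoted_count=$#

printf 'unquoted: %d argument(s)\nquoted: %d argument(s)\n' "$unquoted_count" "$quoted_count"
```

Without set -f, the unquoted word b/*/ would additionally be subject to pathname expansion if anything in the current directory matched it.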

Answered by Jonathan Leffler

I suggest storing the arguments for sed in an array:

SED_OPTIONS=( '-i' '-e' 's/\.cgi?key/\.cgikey/g' )

sed "${SED_OPTIONS[@]}" $FILES

However, that's only a part of the trouble.

First, note that when you type:

sed -i s/\.cgi\?key/\.cgikey/g $FILES

what sed sees as the script argument is actually:

s/.cgi?key/.cgikey/g

because you didn't use any quotes to preserve the backslashes. (To demonstrate, use printf "%s\n" s/\.cgi\?key/\.cgikey/g, thus avoiding any questions of whether echo is interpreting the backslashes.) One side effect of this is that a URL such as:

http://example.com/nodotcgi?key=value

will be mapped to:

http://example.com/nodo.cgikey=value
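
Both points can be checked in a few lines. The following is a small sketch along the lines of the printf demonstration above; the variable names unquoted, quoted and mangled are introduced here purely for illustration:

```bash
#!/usr/bin/env bash
# Unquoted: the shell removes the backslashes before sed ever sees the script.
unquoted=$(printf '%s\n' s/\.cgi\?key/\.cgikey/g)
# Single-quoted: the backslashes survive.
quoted=$(printf '%s\n' 's/\.cgi\?key/\.cgikey/g')
printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"

# With the backslash gone, "." matches any character,
# including the "t" in "nodotcgi".
mangled=$(printf '%s\n' 'http://example.com/nodotcgi?key=value' |
          sed -e 's/.cgi?key/.cgikey/g')
printf '%s\n' "$mangled"
```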

Using single quotes when setting SED_OPTIONS ensures that the backslashes are preserved where required, and not putting a backslash before the ? works. I have both GNU sed and BSD sed on my Mac; I've aliased them as gnu-sed and bsd-sed for clarity. Note that BSD sed requires a suffix for -i and won't accept standard input with -i, so I've dropped the -i from the commands.

$ URLS=(http://example.com/script.cgi?key=value http://example.com/nodotcgi?key=value)
$ SED_OPTIONS=( '-e' 's/\.cgi?key/\.cgikey/g' )
$ printf "%s\n" "${URLS[@]}" | bsd-sed "${SED_OPTIONS[@]}"
http://example.com/script.cgikey=value
http://example.com/nodotcgi?key=value
$ printf "%s\n" "${URLS[@]}" | gnu-sed "${SED_OPTIONS[@]}"
http://example.com/script.cgikey=value
http://example.com/nodotcgi?key=value
$ SED_OPTIONS=( '-e' 's/\.cgi\?key/\.cgikey/g' )
$ printf "%s\n" "${URLS[@]}" | bsd-sed "${SED_OPTIONS[@]}"
http://example.com/script.cgikey=value
http://example.com/nodotcgi?key=value
$ printf "%s\n" "${URLS[@]}" | gnu-sed "${SED_OPTIONS[@]}"
http://example.com/script.cgi?key=value
http://example.com/nodotcgi?key=value
$

Note the difference in behaviour between the two versions of sed when there's a backslash before the question mark (second part of the example).
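
One way to sidestep that difference, offered here as a sketch rather than as something from the answer itself, is to write the question mark as a bracket expression. [?] matches a literal question mark in basic regular expressions, so the script should not depend on how a given sed treats \?:

```bash
#!/usr/bin/env bash
URLS=( 'http://example.com/script.cgi?key=value'
       'http://example.com/nodotcgi?key=value' )

# [?] is a bracket expression containing only "?",
# so it matches a literal question mark.
SED_OPTIONS=( -e 's/\.cgi[?]key/.cgikey/' )

result=$(printf '%s\n' "${URLS[@]}" | sed "${SED_OPTIONS[@]}")
printf '%s\n' "$result"
```

The first URL loses its question mark, while the second is left alone because it contains no literal ".cgi".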
