Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/6570531/

Date: 2020-09-09 20:43:43  Source: igfitidea

Assign string containing null-character (\0) to a variable in Bash

bash, null-character

Asked by antiplex

While trying to correctly process a list of file and folder names (see my other questions) using a NULL-character as a delimiter, I stumbled over a strange behaviour of Bash that I don't understand:

When assigning a string containing one or more NULL-characters to a variable, the NULL-characters are lost / ignored / not stored.

For example,

echo -ne "n\0m\0k" | od -c   # -> 0000000 n \0 m \0 k

But:

VAR1=`echo -ne "n\0m\0k"`
echo -ne "$VAR1" | od -c   # -> 0000000 n m k

This means that I would need to write that string to a file (for example, in /tmp) and read it back from there if piping directly is not desired or feasible.
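
The temporary-file round trip described above could look roughly like the following. This is only a sketch: the variable names tmpfile and nul_count are made up for illustration, and mktemp is assumed to be available on the target system.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: keep the raw bytes in a temp file instead of a variable.
tmpfile=$(mktemp) || exit 1
trap 'rm -f "$tmpfile"' EXIT          # clean up when the script ends

# Store: write the bytes, NULs included, straight to the file.
printf 'n\0m\0k' > "$tmpfile"

# Use: read them back from the file whenever they are needed.
od -An -c < "$tmpfile"

# Check: count the NUL bytes that survived the round trip.
nul_count=$(tr -cd '\0' < "$tmpfile" | wc -c)
echo "NUL bytes preserved: $((nul_count))"
```

Only the file name ever lives in a variable here, so nothing is lost.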

When executing these scripts in Z shell (zsh), the strings containing \0 are preserved in both cases, but sadly I can't assume that zsh is present on the systems running my script, while Bash should be.

How can strings containing \0 chars be stored or handled efficiently without losing any (meta-) characters?

Answered by jeff

In Bash, you can't store the NULL-character in a variable.

You may, however, store a plain hex dump of the data (and later reverse the operation) by using the xxd command.

VAR1=`echo -ne "n\0m\0k" | xxd -p | tr -d '\n'`
echo -ne "$VAR1" | xxd -r -p | od -c   # -> 0000000 n \0 m \0 k

Answered by vaab

As others have already stated, you can't store or use a NUL char:

  • in a variable
  • in an argument of the command line.

However, you can handle any binary data (including NUL chars):

  • in pipes
  • in files

So to answer your last question:

can anybody give me a hint how strings containing \0 chars can be stored or handled efficiently without losing any (meta-) characters?

You can use files or pipes to store and efficiently handle any string with any meta-characters.
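
As a rough illustration of the pipe approach, the sketch below feeds NUL-terminated names through a pipe and handles them one record at a time. The helper name list_names and the sample file names are hypothetical; read -d '' is the Bash builtin option that makes NUL the record delimiter.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each NUL-terminated record from stdin in brackets.
list_names() {
    local name
    while IFS= read -r -d '' name; do
        printf '[%s]\n' "$name"
    done
}

# The NUL bytes only ever travel through the pipe. Each record reaches a
# variable one at a time, after its delimiter has already been consumed.
printf '%s\0' 'plain.txt' 'with space.txt' | list_names
```

The same loop should work with any producer that emits NUL-terminated records, for example find with -print0.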

If you plan to handle data, you should note additionally that:

  • Only the NUL char will be eaten by variables and command-line arguments; you can check this.
  • Be wary that command substitution (as $(command..) or `command..`) has an additional twist on top of being stored in a variable: it will eat your trailing newlines.
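
Both points can be checked with a few lines of Bash. This is a small sketch with made-up variable names; the sentinel trick at the end is a common workaround for the trailing-newline issue and is not something the answer itself proposes.

```shell
#!/usr/bin/env bash
# NUL bytes are dropped from command substitution output. Newer Bash
# releases also warn about this on stderr, which is silenced here.
{ with_nul=$(printf 'a\0b'); } 2>/dev/null
echo "${#with_nul}"        # 2: only 'a' and 'b' remain

# Other control characters survive.
ctrl=$(printf '\001\t\177')
echo "${#ctrl}"            # 3

# Trailing newlines are stripped by command substitution...
stripped=$(printf 'x\n\n')
echo "${#stripped}"        # 1

# ...unless a sentinel character is appended and removed afterwards.
kept=$(printf 'x\n\n'; printf '.')
kept=${kept%.}
echo "${#kept}"            # 3
```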

Bypassing limitations

If you want to use variables, then you must get rid of the NUL char by encoding it, and various other solutions here give clever ways to do that (an obvious way is to use for example base64 encoding/decoding).

If you are concerned about memory or speed, you'll probably want to use a minimal parser and only quote the NUL character (and the quoting char). In this case this would help you:

quote() { sed 's/\\/\\\\/g;s/\x0/\\x00/g'; }

Then, you can secure your data before storing it in variables and command-line arguments by piping your sensitive data into quote, which will output a safe data stream without NUL chars. You can get back the original string (with NUL chars) by using echo -en "$var_quoted", which will send the correct string to the standard output.

Example:

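## Our example output generator, with NUL chars
ascii_table() { echo -en "$(echo '\'0{0..3}{0..7}{0..7} | tr -d " ")"; }
## store
myvar_quoted=$(ascii_table | quote)
## use
echo -en "$myvar_quoted"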

Note: use | hd to get a clean view of your data in hexadecimal and check that you didn't lose any NUL chars.

Changing tools

Remember that you can go pretty far with pipes without using variables or command-line arguments. Don't forget, for instance, the <(command ...) construct, which will create a named pipe (sort of a temporary file).
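
A minimal sketch of the <(command ...) construct, assuming Bash with process substitution available: the producer writes NUL-terminated records, and the loop collects them into an array in the current shell. The array name names and the sample records are illustrative only.

```shell
#!/usr/bin/env bash
# Collect NUL-terminated records into an array. Because the loop reads from
# a process substitution rather than a pipe, it runs in the current shell
# and the array is still populated after the loop ends.
names=()
while IFS= read -r -d '' name; do
    names+=("$name")
done < <(printf '%s\0' 'a b' $'line1\nline2' 'c')

echo "${#names[@]}"        # 3
```

Each element may contain spaces or newlines, since only NUL acts as the delimiter.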

EDIT: the first implementation of quote was incorrect and would not deal correctly with \ special characters interpreted by echo -en. Thanks @xhienne for spotting that.

EDIT2: the second implementation of quote had a bug: because it used only \0, it would actually eat up more zeroes, as \0, \00, \000 and \0000 are equivalent. So \0 was replaced by \x00. Thanks to @MatthijsSteen for spotting this one.
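
The ambiguity can be seen with Bash's builtin echo when the data is a NUL byte followed by the digit 1. The following is a small sketch with made-up variable names:

```shell
#!/usr/bin/env bash
# Quoted with \0, the following digit is absorbed into the octal escape.
octal_form=$(echo -en '\01' | od -An -tx1 | tr -d ' \n')
echo "$octal_form"         # 01   (a single byte, the NUL is gone)

# Quoted with \x00, the escape stops after two hex digits.
hex_form=$(echo -en '\x001' | od -An -tx1 | tr -d ' \n')
echo "$hex_form"           # 0031 (NUL, then the digit 1)
```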

Answered by vontrapp

I love jeff's answer. I would use Base64 encoding instead of xxd. It saves a little space and would be (I think) more recognizable as to what is intended.

VAR=$(echo -ne "foo\0bar" | base64)
echo -n "$VAR" | base64 -d | xargs -0 ...

As for -e, it is needed when echoing a literal string with an encoded null ('\0'), though I also seem to recall that "echo -e" is unsafe if you're echoing any user input, since the user could inject escape sequences that echo will interpret, with bad results. The -e flag is not needed when echoing the encoded stored string into the decoder.
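
One way to sidestep the echo -e concern is to use printf instead, which is a substitution on my part and not what the answer uses. In this sketch the variable name encoded is made up and base64 -d is assumed to be the decode flag, as in the answer's own example.

```shell
#!/usr/bin/env bash
# Encode: the format string is a trusted literal, so its \0 is interpreted.
encoded=$(printf 'foo\0bar' | base64)
echo "$encoded"            # Zm9vAGJhcg==

# Decode: %s prints the stored value verbatim, so escape sequences inside
# the data are never interpreted.
printf '%s' "$encoded" | base64 -d | od -An -c
```

Keeping untrusted data out of the format string and passing it only through %s means printf never interprets it.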