Rules for C++ string literals escape character
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/10220401/
Asked by David Stone
What are the rules for the escape character \ in string literals? Is there a list of all the characters that are escaped?
In particular, when I use \ in a string literal in gedit and follow it with any three digits, it colors them differently.
I was trying to create a std::string constructed from a literal with the character 0 followed by the null character (\0), followed by the character 0. However, the syntax highlighting alerted me that maybe this would create something like the character 0 followed by the null character (\00, aka \0), which is to say, only two characters.
For the solution to just this one problem, is this the best way to do it:
std::string ("0\0" "0", 3) // String concatenation
And is there some reference for what the escape character does in string literals in general? What is '\a', for instance?
Answer by dan04
Control characters:
(Hex codes assume an ASCII-compatible character encoding.)
\a = \x07 = alert (bell)
\b = \x08 = backspace
\t = \x09 = horizontal tab
\n = \x0A = newline (or line feed)
\v = \x0B = vertical tab
\f = \x0C = form feed
\r = \x0D = carriage return
\e = \x1B = escape (non-standard GCC extension)
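As a quick check of the table above, the following sketch compares each standard control escape with its hex code. It only holds on an ASCII-compatible character set, as the answer assumes, and it leaves out \e because that escape is a non-standard extension. The helper name control_escapes_match_ascii is made up for illustration.

```cpp
// Compile-time checks; these hold only on ASCII-compatible character sets.
static_assert('\a' == '\x07', "alert (bell)");
static_assert('\b' == '\x08', "backspace");
static_assert('\t' == '\x09', "horizontal tab");
static_assert('\n' == '\x0A', "newline");
static_assert('\v' == '\x0B', "vertical tab");
static_assert('\f' == '\x0C', "form feed");
static_assert('\r' == '\x0D', "carriage return");

// Hypothetical helper so the same comparison can also be made at run time.
constexpr bool control_escapes_match_ascii() {
    return '\a' == 7 && '\b' == 8 && '\t' == 9 && '\n' == 10 &&
           '\v' == 11 && '\f' == 12 && '\r' == 13;
}
```

If the file compiles, every static_assert passed for the compiler's character set.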
Punctuation characters:
\" = quotation mark (backslash not required for '"')
\' = apostrophe (backslash not required for "'")
\? = question mark (used to avoid trigraphs)
\\ = backslash
Numeric character references:
\ + up to 3 octal digits
\x + any number of hex digits
\u + 4 hex digits (Unicode BMP, new in C++11)
\U + 8 hex digits (Unicode astral planes, new in C++11)
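To make the digit-count rules for octal and hex escapes concrete, here is a small sketch that measures a few literals at compile time. The helper literal_length is a hypothetical name introduced only for this example; it returns the array size of a literal minus the terminating null.

```cpp
#include <cstddef>

// Hypothetical helper: length of a string literal without its terminating
// null. It counts embedded nulls, unlike strlen.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N - 1; }

// An octal escape stops after at most three octal digits.
static_assert(literal_length("\1012") == 2, "\\101 followed by '2'");
// It also stops early at the first character that is not an octal digit.
static_assert(literal_length("\08") == 2, "\\0 followed by '8'");
// A hex escape keeps consuming hex digits, so the literal is split to end it.
static_assert(literal_length("\x41" "B") == 2, "\\x41 followed by 'B'");
```

Writing "\x41B" as one literal would instead be read as the single hex escape \x41B, which is why the last case relies on concatenation.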
\0 = \00 = \000 = octal escape for null character
If you do want an actual digit character after a \0, then yes, I recommend string concatenation. Note that the whitespace between the parts of the literal is optional, so you can write "\0""0".
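The difference between the concatenated and the unsplit literal can be seen in a short sketch. The names from_literal, with_concatenation, and without_concatenation are hypothetical helpers for this example, not anything from the standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: build a std::string from a whole literal, embedded
// nulls included, by using the array size minus the terminating null.
template <std::size_t N>
std::string from_literal(const char (&lit)[N]) {
    return std::string(lit, N - 1);
}

// "0\0" "0" is '0', null, '0': the escape ends where the first literal ends.
inline std::string with_concatenation() { return from_literal("0\0" "0"); }

// "0\00" is '0' followed by the single octal escape \00, which is a null.
inline std::string without_concatenation() { return from_literal("0\00"); }
```

This works because escape sequences are processed in each literal before adjacent literals are joined.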
Answer by jli
\a is the bell/alert character, which on some systems triggers a sound. \nnn represents an arbitrary ASCII character in octal base. However, \0 is special in that it represents the null character no matter what.
To answer your original question, you could escape your '0' characters as well, writing each one as \060 (since an ASCII '0' is 60 in octal).
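A minimal sketch of that idea, assuming an ASCII-compatible encoding so that '0' is octal 60. The function name all_octal is made up for this example, and the exact spelling of the literal is one possible choice, not necessarily the one the answer used.

```cpp
#include <string>

// '0' is octal 60 in ASCII, so "\060\000\060" spells '0', null, '0'.
// Every escape uses all three octal digits, so none can absorb its neighbor.
inline std::string all_octal() { return std::string("\060\000\060", 3); }
```

The explicit length 3 is still needed, because the literal contains an embedded null.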
The MSDN documentation has a pretty detailed article on this, as does cppreference.
Answer by mgiuffrida
\0 will be interpreted as an octal escape sequence if it is followed by other digits, so \00 will be interpreted as a single character. (\0 is technically an octal escape sequence as well, at least in C).
The way you're doing it:
std::string ("0\0" "0", 3) // String concatenation
works because this version of the constructor takes a char array together with an explicit length; if you try to just pass "0\0" "0" as a const char*, it will treat it as a C string and only copy everything up until the null character.
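The two constructor behaviors described here can be compared side by side. This is only a sketch; counted and c_string are hypothetical names chosen for the example.

```cpp
#include <string>

// (pointer, count) constructor: copies exactly 3 chars, nulls included.
inline std::string counted() { return std::string("0\0" "0", 3); }

// const char* constructor: stops at the first null, the way strlen would.
inline std::string c_string() { return std::string("0\0" "0"); }
```

Here counted() holds three characters, while c_string() holds only the leading '0'.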
Here is a list of escape sequences.
Answer by David Stone
I left something like this as a comment, but I feel it probably needs more visibility as none of the answers mention this method:
The method I now prefer for initializing a std::string with non-printing characters in general (and embedded null characters in particular) is to use the C++11 feature of initializer lists.
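A sketch of what the initializer-list approach might look like. The particular characters are illustrative, with the leading '\0' assumed, and initializer_list_example is a hypothetical function name.

```cpp
#include <string>

// Each element is one char, so embedded nulls need no length argument and
// no escape sequence can run into the character that follows it.
inline std::string initializer_list_example() {
    return std::string({'\0', '6', '\a', 'H', '\t'});
}
```

The string's size comes from the number of elements in the list, so adding or removing a character needs no other change.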
I am not required to perform error-prone manual counting of the number of characters that I am using, so that if later on I want to insert a '\013' in the middle somewhere, I can and all of my code will still work. It also completely sidesteps any issues of using the wrong escape sequence by accident.
The only downside is all of those extra ' and , characters.
Answer by David Stone
With the magic of user-defined literals, we have yet another solution to this. C++14 added a std::string literal operator.
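A minimal sketch of the literal operator in use, assuming a C++14 or later compiler. The wrapper function literal_operator_example is a hypothetical name added so the snippet can be called from a test.

```cpp
#include <string>

inline std::string literal_operator_example() {
    using namespace std::string_literals;  // brings operator""s into scope
    // The s suffix applies to the whole concatenated literal, and its real
    // length is passed to the operator, so the embedded null is kept.
    auto const x = "\0" "0"s;
    return x;
}
```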
Writing "\0" "0"s with this operator constructs a string of length 2, with a '\0' character (null) followed by a '0' character (the digit zero). I am not sure if it is more or less clear than the initializer_list<char> constructor approach, but it at least gets rid of the ' and , characters.