What is the type of string literals in C and C++?

Disclaimer: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute the content to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2245664/

Date: 2020-08-27 22:41:07  Source: igfitidea

Tags: c++, c, string, const

Asked by missingfaktor

What is the type of a string literal in C? Is it char *, const char *, or const char * const?

What about C++?

Accepted answer by Michael Burr

In C, the type of a string literal is char[]. It is not const according to the type, but modifying the contents is undefined behavior. Also, two different string literals that have the same content (or enough of the same content) might or might not share the same array elements.

From the C99 standard 6.4.5/5 "String Literals - Semantics":

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
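
The "distinct or not" point can be probed with a small sketch. It is written in C++ rather than C, on the assumption that a C++11 or later compiler is available; C++ likewise does not guarantee either outcome. The names same_text, first_literal, second_literal and literals_share_storage are hypothetical helpers invented for this illustration.

```cpp
// Hypothetical helper: compares two strings character by character.
constexpr bool same_text(const char* a, const char* b) {
    return *a == *b && (*a == '\0' || same_text(a + 1, b + 1));
}

// Two separately written literals with identical contents.
constexpr const char* first_literal()  { return "hello"; }
constexpr const char* second_literal() { return "hello"; }

// The contents are guaranteed to match...
static_assert(same_text(first_literal(), second_literal()), "same contents");

// ...but the addresses may or may not: the compiler is free to merge the
// two arrays into one or to keep them apart, so this can return either value.
inline bool literals_share_storage() {
    return first_literal() == second_literal();
}
```

Code that relies on literals_share_storage() returning a particular value is not portable; only the content comparison is dependable.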

In C++, "An ordinary string literal has type 'array of n const char'" (from 2.13.4/1 "String literals"). But there's a special case in the C++ standard that lets string literals convert easily to non-const-qualified pointers (4.2/2 "Array-to-pointer conversion"):

A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; a wide string literal can be converted to an rvalue of type “pointer to wchar_t”.
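
The "array of n const char" wording can be checked at compile time. This is a minimal sketch, assuming a C++11 or later compiler; the literal "abc" is an arbitrary example and array_extent is a hypothetical helper name, not something taken from the standard.

```cpp
#include <cstddef>
#include <type_traits>

// A string literal is an lvalue, so decltype yields a reference to the array.
// "abc" has 3 characters plus the terminating zero, so n is 4.
static_assert(std::is_same<decltype("abc"), const char (&)[4]>::value,
              "ordinary string literal is an array of 4 const char");

// Wide literals use const wchar_t elements instead.
static_assert(std::is_same<decltype(L"abc"), const wchar_t (&)[4]>::value,
              "wide string literal is an array of 4 const wchar_t");

// Hypothetical helper: deduces n from the array type of its argument.
template <std::size_t N>
constexpr std::size_t array_extent(const char (&)[N]) { return N; }

static_assert(array_extent("abc") == 4, "extent includes the terminator");

// The special-case conversion quoted above would allow the next line.
// It is ill-formed since C++11, although some compilers still accept it
// with a warning:
// char* p = "abc";
```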

As a side note - because arrays in C/C++ convert so readily to pointers, a string literal can often be used in a pointer context, much as any array in C/C++.
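
A short sketch of that pointer-context behaviour, assuming C++11 or later. The function count_chars and the array greeting are hypothetical names chosen only for this example.

```cpp
#include <cstddef>

// Hypothetical helper: takes a pointer, so an array argument decays to
// a pointer to its first element when passed in.
constexpr std::size_t count_chars(const char* s) {
    return *s ? 1 + count_chars(s + 1) : 0;
}

// The literal (type const char[4]) decays to const char* at the call.
static_assert(count_chars("abc") == 3, "terminator is not counted");

// An ordinary named array decays in exactly the same way.
constexpr char greeting[] = "hello";
static_assert(count_chars(greeting) == 5, "named array decays too");

// Subscripting a literal also works through the decayed pointer.
static_assert("abc"[1] == 'b', "index 1 is the second character");
```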

Additional editorializing: what follows is really mostly speculation on my part about the rationale for the choices the C and C++ standards made regarding string literal types. So take it with a grain of salt (but please comment if you have corrections or additional details):

I think the C standard chose to make string literals non-const types because there was (and is) so much code that expects to be able to use non-const-qualified char pointers that point to literals. When the const qualifier was added (which, if I'm not mistaken, happened around ANSI standardization time, long after K&R C had been around to accumulate a ton of existing code), making pointers to string literals assignable only to char const* types without a cast would have required changing nearly every program in existence. Not a good way to get a standard accepted...

I believe the change in C++ to make string literals const-qualified was done mainly so that a literal string would more appropriately match an overload that takes a "char const*" argument. I think that there was also a desire to close a perceived hole in the type system, but the hole was largely opened back up by the special case in array-to-pointer conversions.
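
The overload matching mentioned here can be illustrated with a sketch. It says nothing about the committee's actual motives; it only shows how a const-qualified literal selects between two overloads, assuming C++11 or later. The overload set which and the array writable_buffer are hypothetical names made up for the example.

```cpp
// Hypothetical overload set; the return value records which overload ran.
constexpr int which(char*)       { return 1; }  // mutable character data
constexpr int which(const char*) { return 2; }  // read-only character data

// A writable array with static storage, initialised by copying the literal.
char writable_buffer[] = "abc";

// The literal is const char[4], so it matches the const char* overload.
static_assert(which("abc") == 2, "literal selects the const overload");

// The writable array is char[4], so the char* overload is the exact match.
static_assert(which(writable_buffer) == 1, "array selects the non-const overload");
```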

Annex D of the standard indicates that the "implicit conversion from const to non-const qualification for string literals (4.2) is deprecated", but I think so much code would still break that it'll be a long time before compiler implementers or the standards committee are willing to actually pull the plug (unless some other clever technique can be devised - but then the hole would be back, wouldn't it?).

Answer by Christoph

A C string literal has type char [n], where n equals the number of characters + 1 to account for the implicit zero at the end of the string.

The array will be statically allocated; it is not const, but modifying it is undefined behaviour.

If it had pointer type char * or incomplete type char [], sizeof could not work as expected.
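
The sizeof argument can be sketched as follows. The example is written in C++ (C++11 or later assumed), where the element type is const char rather than char, but the size rule is the same. The names as_pointer and literal_length are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>

// sizeof sees the complete array type, so it reports characters + 1.
static_assert(sizeof("abc") == 4, "3 characters plus the terminating zero");
static_assert(sizeof("") == 1, "empty literal still holds the terminator");

// Once the literal has decayed to a pointer, sizeof reports the size of
// the pointer itself and no longer knows the length of the text.
constexpr const char* as_pointer = "abc";
static_assert(sizeof(as_pointer) == sizeof(const char*),
              "size of the pointer, not of the text");

// Hypothetical helper: number of characters excluding the terminator,
// recovered from the array type alone.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N - 1; }

static_assert(literal_length("abc") == 3, "length without the terminator");
```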

Making string literals const is a C++ idiom and not part of any C standard.

Answer by Lundin

For various historical reasons, string literals have always been of type char[] in C.

Early on (in C90), it was stated that modifying a string literal invokes undefined behavior.

They didn't ban such modifications though, nor did they make string literals const char[], which would have made more sense. This was for backwards compatibility with old code. Some old operating systems (most notably DOS) didn't protest if you modified string literals, so there was plenty of such code around.

C still has this defect today, even in the most recent C standard.

C++ inherited the very same defect from C, but later C++ standards finally made string literals strictly const (the implicit conversion to char * was flagged obsolete in C++03 and finally removed in C++11).

Answer by 0xfe

They used to be of type char[]. Now they are of type const char[].
