Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/54120862/

Date: 2020-08-28 16:06:52  Source: igfitidea

Does the C++ standard allow for an uninitialized bool to crash a program?

Tags: c++, llvm, undefined-behavior, abi, llvm-codegen

Asked by Remz

I know that "undefined behaviour" in C++ can pretty much allow the compiler to do anything it wants. However, I had a crash that surprised me, as I assumed that the code was safe enough.

In this case, the real problem happened only on a specific platform using a specific compiler, and only if optimization was enabled.

I tried several things in order to reproduce the problem and simplify it as much as possible. Here's an extract of a function called Serialize, which takes a bool parameter and copies the string true or false to an existing destination buffer.

If this function came up in a code review, there would be no way to tell that it could, in fact, crash if the bool parameter was an uninitialized value, would there?

// Zero-filled global buffer of 16 characters
char destBuffer[16];

void Serialize(bool boolValue) {
    // Determine which string to print based on boolValue
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    const size_t len = strlen(whichString);

    // Copy string into destination buffer, which is zero-filled (thus already null-terminated)
    memcpy(destBuffer, whichString, len);
}

If this code is executed with clang 5.0.0 + optimizations, it will/can crash.

The ternary-operator expression boolValue ? "true" : "false" looked safe enough to me. I was assuming, "Whatever garbage value is in boolValue doesn't matter, since it will evaluate to true or false anyhow."

I have set up a Compiler Explorer example that shows the problem in the disassembly; here is the complete example. Note: to reproduce the issue, the combination I found that works is Clang 5.0.0 with -O2 optimisation.

#include <iostream>
#include <cstring>

// Simple struct, with an empty constructor that doesn't initialize anything
struct FStruct {
    bool uninitializedBool;

   __attribute__ ((noinline))  // Note: the constructor must be declared noinline to trigger the problem
   FStruct() {};
};

char destBuffer[16];

// Small utility function that copies the string "true" or "false" into destBuffer depending on the value of the parameter
void Serialize(bool boolValue) {
    // Determine which string to print depending if 'boolValue' is evaluated as true or false
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    size_t len = strlen(whichString);

    memcpy(destBuffer, whichString, len);
}

int main()
{
    // Locally construct an instance of our struct here on the stack. The bool member uninitializedBool is uninitialized.
    FStruct structInstance;

    // Copy "true" or "false" into destBuffer
    Serialize(structInstance.uninitializedBool);
    return 0;
}

The problem arises because of the optimizer: it was clever enough to deduce that the strings "true" and "false" differ in length by only 1. So instead of really calculating the length, it uses the value of the bool itself, which should technically be either 0 or 1, and goes like this:

const size_t len = strlen(whichString); // original code
const size_t len = 5 - boolValue;       // clang clever optimization
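
For a valid bool the two forms agree, which is what makes the substitution legal. A minimal sketch of that equivalence follows; the function names are hypothetical, and lengthViaSubtraction is only a source-level model of the optimization, not clang's actual output:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// What the source asks for: measure the selected string literal.
std::size_t lengthViaStrlen(bool boolValue) {
    return std::strlen(boolValue ? "true" : "false");
}

// Model of the optimized form: "false" has 5 characters, "true" has 4,
// and a valid bool converts to exactly 0 or 1.
std::size_t lengthViaSubtraction(bool boolValue) {
    return 5U - boolValue;
}

// Both forms agree for the only two values a bool may legally hold.
void checkLengthsAgree() {
    assert(lengthViaStrlen(false) == lengthViaSubtraction(false));
    assert(lengthViaStrlen(true) == lengthViaSubtraction(true));
}
```

The sketch only ever passes initialized bools, so it stays within defined behaviour; it says nothing about what happens for any other bit pattern.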

While this is "clever", so to speak, my question is: Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?

Or is this a case of implementation-defined behaviour, in which case the implementation assumed that all its bools will only ever contain 0 or 1, and any other value is undefined behaviour territory?

Accepted answer by Peter Cordes

Yes, ISO C++ allows (but doesn't require) implementations to make this choice.

But also note that ISO C++ allows a compiler to emit code that crashes on purpose (e.g. with an illegal instruction) if the program encounters UB, e.g. as a way to help you find errors. (Or because it's a DeathStation 9000. Being strictly conforming is not sufficient for a C++ implementation to be useful for any real purpose). So ISO C++ would allow a compiler to make asm that crashed (for totally different reasons) even on similar code that read an uninitialized uint32_t, even though that's required to be a fixed-layout type with no trap representations.

It's an interesting question about how real implementations work, but remember that even if the answer was different, your code would still be unsafe because modern C++ is not a portable version of assembly language.



You're compiling for the x86-64 System V ABI, which specifies that a bool as a function arg in a register is represented by the bit-patterns false=0 and true=1 in the low 8 bits of the register (see Footnote 1). In memory, bool is a 1-byte type that again must have an integer value of 0 or 1.

(An ABI is a set of implementation choices that compilers for the same platform agree on so they can make code that calls each other's functions, including type sizes, struct layout rules, and calling conventions.)

ISO C++ doesn't specify it, but this ABI decision is widespread because it makes bool->int conversion cheap (just zero-extension). I'm not aware of any ABIs that don't let the compiler assume 0 or 1 for bool, for any architecture (not just x86). It allows optimizations like !mybool with xor eax,1 to flip the low bit (see "Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction"), or compiling a&&b to a bitwise AND for bool types. Some compilers do actually take advantage of this; see "Boolean values as 8 bit in compilers. Are operations on them inefficient?".
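
The two identities mentioned above can be checked at the source level. This is a small sketch with hypothetical function names; the static_cast always yields 0 or 1, so it models the 0/1 representation rather than inspecting real register contents:

```cpp
// Logical NOT of a valid bool, expressed as a flip of its low bit.
constexpr bool notViaXor(bool b) {
    return (static_cast<unsigned>(b) ^ 1U) != 0U;
}

// Logical AND of two valid bools, expressed as a bitwise AND of 0/1 values.
constexpr bool andViaBitwise(bool a, bool b) {
    return (static_cast<unsigned>(a) & static_cast<unsigned>(b)) != 0U;
}

// Compile-time checks over every valid input.
static_assert(notViaXor(false) == !false, "xor 1 flips false to true");
static_assert(notViaXor(true) == !true, "xor 1 flips true to false");
static_assert(andViaBitwise(true, true) == (true && true), "1 & 1");
static_assert(andViaBitwise(true, false) == (true && false), "1 & 0");
static_assert(andViaBitwise(false, true) == (false && true), "0 & 1");
static_assert(andViaBitwise(false, false) == (false && false), "0 & 0");
```

If a bool could hold, say, 2, then 2 ^ 1 would be 3 and 2 & 1 would be 0, so neither shortcut would match the logical operators. That is why the 0-or-1 guarantee matters to the compiler.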

In general, the as-if rule allows the compiler to take advantage of things that are true on the target platform being compiled for, because the end result will be executable code that implements the same externally-visible behaviour as the C++ source. (With all the restrictions that Undefined Behaviour places on what is actually "externally visible": not with a debugger, but from another thread in a well-formed / legal C++ program.)

The compiler is definitely allowed to take full advantage of an ABI guarantee in its code-gen, and make code like you found which optimizes strlen(whichString) to
5U - boolValue.
(BTW, this optimization is kind of clever, but maybe shortsighted vs. branching and inlining memcpy as stores of immediate data; see Footnote 2.)

Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)



Your __attribute__((noinline)) constructor with optimization enabled led to clang just loading a byte from the stack to use as uninitializedBool. It made space for the object in main with push rax (which is smaller and for various reasons about as efficient as sub rsp, 8), so whatever garbage was in AL on entry to main is the value it used for uninitializedBool. This is why you actually got values that weren't just 0.

5U - random garbage can easily wrap to a large unsigned value, leading memcpy to go into unmapped memory. The destination is in static storage, not the stack, so you're not overwriting a return address or something.
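
To make the wrap-around concrete, here is a sketch of the arithmetic only. The function name is hypothetical, and the garbage byte is modelled as an ordinary unsigned char so that nothing here reads an uninitialized bool:

```cpp
#include <cstddef>
#include <limits>

// Model of "5 - boolValue" when the byte holding the bool is not 0 or 1.
// Unsigned arithmetic is modular, so a byte above 5 wraps around.
constexpr std::size_t lengthFromRawByte(unsigned char rawByte) {
    return std::size_t{5} - rawByte;
}

// Valid bool bytes give the expected string lengths.
static_assert(lengthFromRawByte(0) == 5, "length of \"false\"");
static_assert(lengthFromRawByte(1) == 4, "length of \"true\"");

// A garbage byte such as 200 gives 5 - 200 modulo 2^N, i.e. SIZE_MAX - 194.
static_assert(lengthFromRawByte(200) ==
                  std::numeric_limits<std::size_t>::max() - 194,
              "wraps to a huge length");
```

A memcpy with a length of that size runs far past both the source literal and the 16-byte destination, which is consistent with the crash described above.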



Other implementations could make different choices, e.g. false=0 and true=any non-zero value. Then clang probably wouldn't make code that crashes for this specific instance of UB. (But it would still be allowed to if it wanted to.) I don't know of any implementations that choose anything other than what x86-64 does for bool, but the C++ standard allows many things that nobody does or even would want to do on hardware that's anything like current CPUs.

ISO C++ leaves it unspecified what you'll find when you examine or modify the object representation of a bool. (e.g. by memcpying the bool into unsigned char, which you're allowed to do because char* can alias anything. And unsigned char is guaranteed to have no padding bits, so the C++ standard does formally let you hexdump object representations without any UB. Pointer-casting to copy the object representation is different from assigning char foo = my_bool, of course, so booleanization to 0 or 1 wouldn't happen and you'd get the raw object representation.)
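
A sketch of that kind of inspection, using memcpy into unsigned char storage. The names BoolBytes and objectRepresentation are made up for illustration, and the function is meant to be called only on an initialized bool:

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Raw storage large enough for one bool, whatever sizeof(bool) is here.
using BoolBytes = std::array<unsigned char, sizeof(bool)>;

// Copy the object representation of an initialized bool into bytes.
BoolBytes objectRepresentation(const bool& value) {
    BoolBytes bytes{};
    std::memcpy(bytes.data(), &value, sizeof(bool));
    return bytes;
}

// The standard does not say which bytes you get, only that true and
// false are distinct values, so their representations cannot be equal.
void checkRepresentationsDiffer() {
    assert(objectRepresentation(true) != objectRepresentation(false));
}
```

On the x86-64 System V ABI one would expect the single byte to be 1 for true and 0 for false, but the sketch deliberately does not assert that, since ISO C++ leaves it unspecified.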

You've partially "hidden" the UB on this execution path from the compiler with noinline. Even if it doesn't inline, though, interprocedural optimizations could still make a version of the function that depends on the definition of another function. (First, clang is making an executable, not a Unix shared library where symbol-interposition can happen. Second, the definition is inside the class{} definition so all translation units must have the same definition. Like with the inline keyword.)

So a compiler could emit just a ret or ud2 (illegal instruction) as the definition for main, because the path of execution starting at the top of main unavoidably encounters Undefined Behaviour. (Which the compiler can see at compile time if it decided to follow the path through the non-inline constructor.)

Any program that encounters UB is totally undefined for its entire existence. But UB inside a function or if() branch that never actually runs doesn't corrupt the rest of the program. In practice that means that compilers can decide to emit an illegal instruction, or a ret, or not emit anything and fall into the next block / function, for the whole basic block that can be proven at compile time to contain or lead to UB.

GCC and Clang in practice do actually sometimes emit ud2 on UB, instead of even trying to generate code for paths of execution that make no sense. Or for cases like falling off the end of a non-void function, gcc will sometimes omit a ret instruction. If you were thinking that "my function will just return with whatever garbage is in RAX", you are sorely mistaken. Modern C++ compilers don't treat the language like a portable assembly language any more. Your program really has to be valid C++, without making assumptions about how a stand-alone non-inlined version of your function might look in asm.

Another fun example is "Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?". x86 doesn't fault on unaligned integers, right? So why would a misaligned uint16_t* be a problem? Because alignof(uint16_t) == 2, and violating that assumption led to a segfault when auto-vectorizing with SSE2.
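
One common way to read a 16-bit value from an address that may be odd, without ever forming a misaligned uint16_t*, is to go through memcpy. The following is a sketch with hypothetical helper names, not code from the linked question:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Load a uint16_t from bytes that need not be 2-byte aligned.
std::uint16_t loadU16(const unsigned char* bytes) {
    assert(bytes != nullptr);
    std::uint16_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// Store a uint16_t to bytes that need not be 2-byte aligned.
void storeU16(unsigned char* bytes, std::uint16_t value) {
    assert(bytes != nullptr);
    std::memcpy(bytes, &value, sizeof value);
}

// Round-trip a value through offset 1 of a byte buffer.
bool roundTripsAtOddOffset(std::uint16_t value) {
    unsigned char buffer[8] = {};
    storeU16(buffer + 1, value);
    return loadU16(buffer + 1) == value;
}
```

Because the only pointers involved are unsigned char*, the compiler has no alignment assumption to exploit, and it is free to turn the memcpy into a single load where the target allows that.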

See also "What Every C Programmer Should Know About Undefined Behavior #1/3", an article by a clang developer.

Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for bool.

Expect total hostility toward many mistakes by the programmer, especially things modern compilers warn about. This is why you should use -Wall and fix warnings. C++ is not a user-friendly language, and something in C++ can be unsafe even if it would be safe in asm on the target you're compiling for. (e.g. signed overflow is UB in C++ and compilers will assume it doesn't happen, even when compiling for 2's complement x86, unless you use clang/gcc -fwrapv.)
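
For the signed-overflow case, one way to stay inside defined behaviour without relying on -fwrapv is to test the operands before adding. A minimal sketch, where the function name additionOverflows is made up for illustration:

```cpp
#include <limits>

// True if a + b would overflow int. The check itself never overflows:
// when b > 0, max - b is in range; when b < 0, min - b is in range.
constexpr bool additionOverflows(int a, int b) {
    return (b > 0 && a > std::numeric_limits<int>::max() - b) ||
           (b < 0 && a < std::numeric_limits<int>::min() - b);
}

static_assert(!additionOverflows(1, 2), "small values are fine");
static_assert(additionOverflows(std::numeric_limits<int>::max(), 1),
              "INT_MAX + 1 would overflow");
static_assert(additionOverflows(std::numeric_limits<int>::min(), -1),
              "INT_MIN - 1 would overflow");
```

A caller would perform a + b only when additionOverflows(a, b) is false, so the compiler's assumption that signed overflow never happens is actually true for the program.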

Compile-time-visible UB is always dangerous, and it's really hard to be sure (with link-time optimization) that you've really hidden UB from the compiler and can thus reason about what kind of asm it will generate.

Not to be over-dramatic; often compilers do let you get away with some things and emit code like you're expecting even when something is UB. But maybe it will be a problem in the future if compiler devs implement some optimization that gains more info about value-ranges (e.g. that a variable is non-negative, maybe allowing it to optimize sign-extension to free zero-extension on x86-64). For example, in current gcc and clang, doing tmp = a+INT_MIN doesn't optimize a<0 as always-false, only that tmp is always negative. (Because INT_MIN + a=INT_MAX is negative on this 2's complement target, and a can't be any higher than that.)

So gcc/clang don't currently backtrack to derive range info for the inputs of a calculation, only on the results based on the assumption of no signed overflow: example on Godbolt. I don't know if this optimization is intentionally "missed" in the name of user-friendliness or what.

Also note that implementations (aka compilers) are allowed to define behaviour that ISO C++ leaves undefined. For example, all compilers that support Intel's intrinsics (like _mm_add_ps(__m128, __m128) for manual SIMD vectorization) must allow forming mis-aligned pointers, which is UB in C++ even if you don't dereference them. __m128i _mm_loadu_si128(const __m128i *) does unaligned loads by taking a misaligned __m128i* arg, not a void* or char*. See "Is `reinterpret_cast`ing between hardware vector pointer and the corresponding type an undefined behavior?"

GNU C/C++ also defines the behaviour of left-shifting a negative signed number (even without -fwrapv), separately from the normal signed-overflow UB rules. (This is UB in ISO C++, while right shifts of signed numbers are implementation-defined (logical vs. arithmetic); good quality implementations choose arithmetic on HW that has arithmetic right shifts, but ISO C++ doesn't specify). This is documented in the GCC manual's Integer section, along with defining implementation-defined behaviour that C standards require implementations to define one way or another.

There are definitely quality-of-implementation issues that compiler developers care about; they generally aren't trying to make compilers that are intentionally hostile, but taking advantage of all the UB potholes in C++ (except ones they choose to define) to optimize better can be nearly indistinguishable at times.



Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.

(Other ABIs do make different choices here. Some do require narrow integer types to be zero- or sign-extended to fill a register when passed to or returned from functions, like MIPS64 and PowerPC64. See the last section of this x86-64 answer which compares vs. those earlier ISAs.)

For example, a caller might have calculated a & 0x01010101 in RDI and used it for something else, before calling bool_func(a&1). The caller could optimize away the &1 because it already did that to the low byte as part of and edi, 0x01010101, and it knows the callee is required to ignore the high bytes.
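
The reasoning in that example rests on a simple bit identity: after masking with 0x01010101, the low byte already equals a & 1. A small sketch with hypothetical function names, modelling the callee's view of the low 8 bits as a cast to uint8_t:

```cpp
#include <cstdint>

// Low byte the callee sees if the caller only did `and edi, 0x01010101`.
constexpr std::uint8_t lowByteAfterWideMask(std::uint32_t a) {
    return static_cast<std::uint8_t>(a & 0x01010101U);
}

// Low byte the callee sees if the caller did the explicit `a & 1`.
constexpr std::uint8_t lowByteAfterNarrowMask(std::uint32_t a) {
    return static_cast<std::uint8_t>(a & 1U);
}

// The two agree, so the extra `& 1` can be dropped by the caller.
static_assert(lowByteAfterWideMask(0xFFFFFFFFU) == lowByteAfterNarrowMask(0xFFFFFFFFU), "all bits set");
static_assert(lowByteAfterWideMask(0xFFFFFFFEU) == lowByteAfterNarrowMask(0xFFFFFFFEU), "low bit clear");
static_assert(lowByteAfterWideMask(0x12345679U) == lowByteAfterNarrowMask(0x12345679U), "mixed bits");
```

The upper bytes of the wide-mask result can still be non-zero, which is exactly the garbage the callee is required to ignore under this ABI.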

Or if a bool is passed as the 3rd arg, maybe a caller optimizing for code-size loads it with mov dl, [mem] instead of movzx edx, [mem], saving 1 byte at the cost of a false dependency on the old value of RDX (or other partial-register effect, depending on CPU model). Or for the first arg, mov dil, byte [r10] instead of movzx edi, byte [r10], because both require a REX prefix anyway.

This is why clang emits movzx eax, dil in Serialize, instead of sub eax, edi. (For integer args, clang violates this ABI rule, instead depending on the undocumented behaviour of gcc and clang to zero- or sign-extend narrow integers to 32 bits. See "Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?" So I was interested to see that it doesn't do the same thing for bool.)



Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.

OTOH, glibc memcpy will do two 4-byte loads/stores with an overlap that depends on length, so this really does end up making the whole thing free of conditional branches on the boolean. See the L(between_4_7): block in glibc's memcpy/memmove. Or at least, go the same way for either boolean in memcpy's branching to select a chunk size.

If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.

Or if tuning for Intel Ice Lake (with the Fast Short REP MOV feature), an actual rep movsb might be optimal. glibc memcpy might start using rep movsb for small sizes on CPUs with that feature, saving a lot of branching.



Tools for detecting UB and usage of uninitialized values

In gcc and clang, you can compile with -fsanitize=undefined to add run-time instrumentation that will warn or error out on UB that happens at runtime. That won't catch uninitialized variables, though. (Because it doesn't increase type sizes to make room for an "uninitialized" bit).

See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

To find usage of uninitialized data, there's Address Sanitizer and Memory Sanitizer in clang/LLVM. https://github.com/google/sanitizers/wiki/MemorySanitizer shows examples of clang -fsanitize=memory -fPIE -pie detecting uninitialized memory reads. It might work best if you compile without optimization, so all reads of variables end up actually loading from memory in the asm. They show it being used at -O2 in a case where the load wouldn't optimize away. I haven't tried it myself. (In some cases, e.g. not initializing an accumulator before summing an array, clang -O3 will emit code that sums into a vector register that it never initialized. So with optimization, you can have a case where there's no memory read associated with the UB. But -fsanitize=memory changes the generated asm, and might result in a check for this.)

It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.

MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).

It should work for this case because the call to glibc memcpy with a length calculated from uninitialized memory will (inside the library) result in a branch based on length. If it had inlined a fully branchless version that just used cmov, indexing, and two stores, it might not have worked.

Valgrind's memcheck will also look for this kind of problem, again not complaining if the program simply copies around uninitialized data. But it says it will detect when a "Conditional jump or move depends on uninitialised value(s)", to try to catch any externally-visible behaviour that depends on uninitialized data.

Perhaps the idea behind not flagging just a load is that structs can have padding, and copying the whole struct (including padding) with a wide vector load/store is not an error even if the individual members were only written one at a time. At the asm level, the information about what was padding and what is actually part of the value has been lost.

Answer by rici

The compiler is allowed to assume that a boolean value passed as an argument is a valid boolean value (i.e. one which has been initialised or converted to true or false). The true value doesn't have to be the same as the integer 1 -- indeed, there can be various representations of true and false -- but the parameter must be some valid representation of one of those two values, where "valid representation" is implementation-defined.

So if you fail to initialise a bool, or if you succeed in overwriting it through some pointer of a different type, then the compiler's assumptions will be wrong and Undefined Behaviour will ensue. You had been warned:

50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)

Answer by M.M

The function itself is correct, but in your test program, the statement that calls the function causes undefined behaviour by using the value of an uninitialized variable.

The bug is in the calling function, and it could be detected by code review or static analysis of the calling function. Using your compiler explorer link, the gcc 8.2 compiler does detect the bug. (Maybe you could file a bug report against clang that it doesn't find the problem).
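
Since the bug is in the caller, the fix belongs there too. Here is a sketch of one possible fix, assuming a default member initializer is acceptable for the struct. The member name initializedBool is hypothetical, and copying len + 1 bytes is a deliberate deviation from the question's code, which relied on the buffer being zero-filled:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Same shape as the question's struct, but the member now has a value
// even though the constructor body is still empty.
struct FStruct {
    bool initializedBool = false;
    FStruct() {}
};

char destBuffer[16];

void Serialize(bool boolValue) {
    const char* whichString = boolValue ? "true" : "false";
    const std::size_t len = std::strlen(whichString);
    assert(len < sizeof destBuffer);
    // Copy the terminator as well, so a shorter string written after a
    // longer one does not leave stale characters behind.
    std::memcpy(destBuffer, whichString, len + 1);
}
```

With the member initialized, Serialize always receives a valid bool, so whichever form the optimizer chooses for the length computation produces 4 or 5.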

Undefined behaviour means anything can happen, which includes the program crashing a few lines after the event that triggered the undefined behaviour.

NB. The answer to "Can undefined behaviour cause _____ ?" is always "Yes". That's literally the definition of undefined behaviour.

Answer by Barmar

A bool is only allowed to hold the implementation-dependent values used internally for true and false, and the generated code can assume that it will only hold one of these two values.

Typically, the implementation will use the integer 0 for false and 1 for true, to simplify conversions between bool and int, and make if (boolvar) generate the same code as if (intvar). In that case, one can imagine that the code generated for the ternary in the assignment would use the value as the index into an array of pointers to the two strings, i.e. it might be converted to something like:

// the compiler could make asm that "looks" like this, from your source
const static char *strings[] = {"false", "true"};
const char *whichString = strings[boolValue];

If boolValue is uninitialized, it could actually hold any integer value, which would then cause accessing outside the bounds of the strings array.

Answer by Tom Tanner

Summarising your question a lot, you are asking: does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?

The standard says nothing about the internal representation of a bool. It only defines what happens when casting a bool to an int (or vice versa). Mostly, because of these integral conversions (and the fact that people rely rather heavily on them), the compiler will use 0 and 1, but it doesn't have to (although it has to respect the constraints of any lower level ABI it uses).

So the compiler, when it sees a bool, is entitled to consider that said bool contains either of the 'true' or 'false' bit patterns and do anything it feels like. So if the values for true and false are 1 and 0, respectively, the compiler is indeed allowed to optimise strlen to 5 - <boolean value>. Other fun behaviours are possible!

As gets repeatedly stated here, undefined behaviour has undefined results. Including but not limited to

  • Your code working as you expected it to
  • Your code failing at random times
  • Your code not being run at all.

See What every programmer should know about undefined behavior
