How do the likely/unlikely macros in the Linux kernel work and what is their benefit?

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/109710/

Date: 2020-08-03 16:29:22

Tags: linux, gcc, linux-kernel, likely-unlikely

Asked by terminus

I've been digging through some parts of the Linux kernel, and found calls like this:

if (unlikely(fd < 0))
{
    /* Do something */
}

or

if (likely(!err))
{
    /* Do something */
}

I've found the definition of them:

#define likely(x)       __builtin_expect((x),1)
#define unlikely(x)     __builtin_expect((x),0)

I know that they are for optimization, but how do they work? How much of a performance or code-size change can be expected from using them? And is it worth the hassle (and probably losing portability), at least in bottleneck code (in userspace, of course)?

Accepted answer by 1800 INFORMATION

They are a hint to the compiler to emit instructions that will cause branch prediction to favour the "likely" side of a jump instruction. This can be a big win: if the prediction is correct, the jump instruction is basically free and takes zero cycles. On the other hand, if the prediction is wrong, the processor pipeline needs to be flushed, which can cost several cycles. So long as the prediction is correct most of the time, this will tend to be good for performance.

Like all such performance optimisations, you should only do it after extensive profiling to ensure the code really is a bottleneck and, given the micro nature of the optimisation, that it is probably being run in a tight loop. Generally the Linux developers are pretty experienced, so I would imagine they have done that. They don't care too much about portability as they only target gcc, and they have a very close idea of the assembly they want it to generate.

Answer by Cody Brocious

They're hints to the compiler to generate the hint prefixes on branches. On x86/x64, a prefix takes up one byte, so you'll get at most a one-byte increase for each branch. As for performance, it depends entirely on the application -- these days, in most cases, the branch predictor on the processor will ignore them.

Edit: I forgot about one place where they can really help. They allow the compiler to reorder the control-flow graph to reduce the number of branches taken on the 'likely' path. This can give a marked improvement in loops where you're checking multiple exit cases.
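A minimal sketch of the loop case described above. The function name and the exit conditions are hypothetical, and whether the compiler actually changes the block layout depends on its version, the target, and the optimization level:

#include <stddef.h>

/* GCC/Clang builtin; same idea as the definitions quoted in the question. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical example: sum a buffer of non-negative values.
 * The two rarely taken exits are marked unlikely, so the compiler is free
 * to move them out of line and keep the common path as straight-line code.
 * Returns the sum, or -1 on a NULL buffer or a negative element.
 */
long sum_until_error(const int *buf, size_t len)
{
    long total = 0;

    if (unlikely(buf == NULL))
        return -1;

    for (size_t i = 0; i < len; i++) {
        if (unlikely(buf[i] < 0))   /* rare exit: invalid element */
            return -1;
        total += buf[i];            /* common path */
    }
    return total;
}

The hints do not change what the function returns; they only suggest which side of each branch to favour.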

Answer by dcgibbons

These are GCC built-in functions that let the programmer give the compiler a hint about the most likely outcome of a branch condition in a given expression. This allows the compiler to build the branch instructions so that the most common case takes the fewest instructions to execute.

How the branch instructions are built depends on the processor architecture.

Answer by moonshadow

They cause the compiler to emit the appropriate branch hints where the hardware supports them. This usually just means twiddling a few bits in the instruction opcode, so code size will not change. The CPU will start fetching instructions from the predicted location, and will flush the pipeline and start over if that turns out to be wrong when the branch is reached. When the hint is correct, this makes the branch much faster. Precisely how much faster depends on the hardware, and how much this affects the performance of the code depends on what proportion of the time the hint is correct.

For instance, on a PowerPC CPU an unhinted branch might take 16 cycles, a correctly hinted one 8 and an incorrectly hinted one 24. In innermost loops good hinting can make an enormous difference.
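As a worked illustration of that trade-off, the sketch below plugs the cycle counts quoted above into an expected-cost formula. The function names are hypothetical, and the numbers are only the answer's illustrative figures, not measurements:

/* Cycle counts taken from the PowerPC illustration above. */
#define CYCLES_UNHINTED      16.0
#define CYCLES_HINT_CORRECT   8.0
#define CYCLES_HINT_WRONG    24.0

/* Expected cycles per branch when the hint is right with probability p. */
double expected_hinted_cycles(double p)
{
    return p * CYCLES_HINT_CORRECT + (1.0 - p) * CYCLES_HINT_WRONG;
}

/* Nonzero if a hinted branch is expected to beat an unhinted one at accuracy p. */
int hint_pays_off(double p)
{
    return expected_hinted_cycles(p) < CYCLES_UNHINTED;
}

With these figures the break-even point is a hint that is right half of the time (0.5 * 8 + 0.5 * 24 = 16); a hint that is right 90% of the time averages roughly 9.6 cycles.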

Portability isn't really an issue - presumably the definition is in a per-platform header; you can simply define "likely" and "unlikely" to nothing for platforms that do not support static branch hints.
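A minimal sketch of such a header, assuming that checking __GNUC__ or __clang__ is an acceptable way to detect the builtin (other compilers may need their own test). The function clamp_fd is hypothetical and only shows that callers do not change:

/*
 * Portable wrapper: use __builtin_expect where it is known to exist,
 * otherwise fall back to a plain pass-through of the condition.
 */
#if defined(__GNUC__) || defined(__clang__)
#  define likely(x)   __builtin_expect(!!(x), 1)
#  define unlikely(x) __builtin_expect(!!(x), 0)
#else
#  define likely(x)   (!!(x))
#  define unlikely(x) (!!(x))
#endif

/* Hypothetical caller: behaves identically with or without the hints. */
int clamp_fd(int fd)
{
    if (unlikely(fd < 0))
        return -1;
    return fd;
}

On a compiler without the builtin the macros reduce to the bare condition, so the code still compiles and only the optimization is lost.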

Answer by dvorak

These are macros that give hints to the compiler about which way a branch may go. The macros expand to GCC-specific extensions, if they're available.

GCC uses these to optimize for branch prediction. For example, if you have something like the following:

if (unlikely(x)) {
  dosomething();
}

return x;

Then it can restructure this code to be something more like:

if (!x) {
  return x;
}

dosomething();
return x;

The benefit of this is that when the processor takes a branch the first time, there is significant overhead, because it may have been speculatively loading and executing code further ahead. When it determines it will take the branch, then it has to invalidate that, and start at the branch target.

Most modern processors now have some sort of branch prediction, but that only assists when you've been through the branch before, and the branch is still in the branch prediction cache.

There are a number of other strategies that the compiler and processor can use in these scenarios. You can find more details on how branch predictors work at Wikipedia: http://en.wikipedia.org/wiki/Branch_predictor

Answer by Andrew Edgecombe

(general comment - other answers cover the details)

There's no reason that you should lose portability by using them.

You always have the option of creating a simple nil-effect "inline" or macro that will allow you to compile on other platforms with other compilers.

You just won't get the benefit of the optimization if you're on other platforms.

Answer by Finaldie

In many Linux releases, you can find compiler.h in /usr/linux/ and simply include it. Another opinion: unlikely() is more useful than likely(), because

if ( likely( ... ) ) {
     doSomething();
}

can already be optimized well by many compilers.

By the way, if you want to observe the detailed behavior of the code, you can simply do the following:

gcc -c test.c
objdump -d test.o > obj.s

Then open obj.s to find the answer.
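For that experiment, a test.c along the following lines gives two otherwise identical functions to compare in the disassembly. This is only a sketch: the function and variable names are hypothetical, and the two functions may well compile to the same code on a given compiler and target.

/* test.c: hypothetical file for inspecting the generated code. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Volatile counter so the slow path has a side effect the compiler must keep. */
volatile int slow_path_hits;

/* Negative input is declared to be the rare case. */
int with_hint(int x)
{
    if (unlikely(x < 0)) {
        slow_path_hits++;
        return -x;
    }
    return x + 1;
}

/* Same logic with no hint, for comparison. */
int without_hint(int x)
{
    if (x < 0) {
        slow_path_hits++;
        return -x;
    }
    return x + 1;
}

Compiling with optimization enabled (for example by adding -O2 to the gcc command) is usually needed before any difference in block layout can show up.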

Answer by artless noise

As per the comment by Cody, this has nothing to do with Linux, but is a hint to the compiler. What happens will depend on the architecture and compiler version.

This particular feature in Linux is somewhat misused in drivers. As osgx points out in "semantics of hot attribute", any hot or cold function called within a block can automatically hint that the condition is likely or not. For instance, dump_stack() is marked cold, so this is redundant:

 if(unlikely(err)) {
     printk("Driver error found. %d\n", err);
     dump_stack();
 }

Future versions of gcc may selectively inline a function based on these hints. There have also been suggestions that the hint should not be boolean but a score (most likely, and so on). Generally, an alternate mechanism such as cold should be preferred. There is no reason to use it anywhere but hot paths. What a compiler does on one architecture can be completely different on another.
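A minimal sketch of the cold alternative, assuming a GCC-compatible compiler that accepts __attribute__((cold)). The functions report_error and handle_result are hypothetical stand-ins for a driver's error path:

#include <stdio.h>

/*
 * Hypothetical error reporter marked cold. GCC documents that paths
 * leading to calls of cold functions are treated as unlikely, so the
 * call site needs no unlikely() of its own.
 */
__attribute__((cold, noinline))
static int report_error(int err)
{
    fprintf(stderr, "Driver error found. %d\n", err);
    return -1;
}

/* Hypothetical caller: a plain condition; the cold callee carries the hint. */
int handle_result(int err)
{
    if (err)
        return report_error(err);
    return 0;
}

The annotation lives in one place, on the function, instead of being repeated at every call site.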

Answer by Ashish Maurya

long __builtin_expect(long EXP, long C);

This construct tells the compiler that the expression EXP will most likely have the value C. The return value is EXP. __builtin_expect is meant to be used in a conditional expression. In almost all cases it will be used in the context of boolean expressions, in which case it is much more convenient to define two helper macros:

#define unlikely(expr) __builtin_expect(!!(expr), 0)
#define likely(expr) __builtin_expect(!!(expr), 1)

These macros can then be used as in

if (likely(a > 1))
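The !! in these macros matters when the argument is not already 0 or 1. The sketch below uses two hypothetical helpers, is_set and has_name, to show the normalization. It makes no claim about the code a particular compiler generates.

/* Macros as defined above. */
#define unlikely(expr) __builtin_expect(!!(expr), 0)
#define likely(expr)   __builtin_expect(!!(expr), 1)

/*
 * Hypothetical helper: flags is any nonzero bitmask when "set".
 * Without !!, a value such as 4 would be compared against the expected
 * value 1 and the hint would not match; !! maps every nonzero value to 1.
 */
int is_set(unsigned flags)
{
    if (likely(flags))
        return 1;
    return 0;
}

/* Pointers work too: !name is already 0 or 1 before the macro normalizes it. */
int has_name(const char *name)
{
    return unlikely(!name) ? 0 : 1;
}

Because of the normalization, likely(4) evaluates to 1 rather than 4, so the macros are meant for conditions, not for passing values through.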

Reference: https://www.akkadia.org/drepper/cpumemory.pdf
