Understanding std::atomic::compare_exchange_weak() in C++11

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/25199838/

Time: 2020-08-28 11:16:29  Source: igfitidea

Tags: c++, multithreading, c++11, atomic

Asked by Eric Z

bool compare_exchange_weak (T& expected, T val, ..);

compare_exchange_weak() is one of the compare-exchange primitives provided in C++11. It's weak in the sense that it may return false even if the value of the object is equal to expected. This is due to spurious failure on some platforms where a sequence of instructions (instead of a single one, as on x86) is used to implement it. On such platforms, a context switch, a reload of the same address (or cache line) by another thread, etc. can make the primitive fail. It's spurious because it is not the value of the object (being unequal to expected) that fails the operation. Instead, it's a kind of timing issue.
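
To make the contract concrete, here is a minimal sketch of what a single compare-exchange call reports. The struct CasResult and the function try_exchange are hypothetical names introduced only for this illustration. The sketch uses the strong form so that the outcome is deterministic; the weak form has the same interface but may additionally return false even when the values match.

```cpp
#include <atomic>
#include <cassert>

// Result of one compare-exchange attempt, for illustration only.
struct CasResult {
    bool succeeded;   // return value of the call
    int  expected;    // value left in `expected` afterwards
    int  stored;      // value held by the atomic afterwards
};

// Runs one strong compare-exchange on a fresh atomic and reports the outcome.
CasResult try_exchange(int initial, int expected, int desired) {
    std::atomic<int> obj{initial};
    // On success the atomic takes `desired` and `expected` is untouched.
    // On failure the atomic is untouched and `expected` is overwritten
    // with the value that was actually observed.
    bool ok = obj.compare_exchange_strong(expected, desired);
    return {ok, expected, obj.load()};
}
```

Because a failed call writes the observed value back into expected, a retry loop usually does not need a separate load on each iteration.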

But what puzzles me is what the C++11 Standard (ISO/IEC 14882) says:

29.6.5 .. A consequence of spurious failure is that nearly all uses of weak compare-and-exchange will be in a loop.

Why does it have to be in a loop in nearly all uses? Does that mean we shall loop when it fails because of spurious failures? If that's the case, why do we bother use compare_exchange_weak() and write the loop ourselves? We can just use compare_exchange_strong(), which I think should get rid of spurious failures for us. What are the common use cases of compare_exchange_weak()?

Another related question. In his book "C++ Concurrency in Action", Anthony Williams says:

//Because compare_exchange_weak() can fail spuriously, it must typically
//be used in a loop:

bool expected=false;
extern atomic<bool> b; // set somewhere else
while(!b.compare_exchange_weak(expected,true) && !expected);

//In this case, you keep looping as long as expected is still false,
//indicating that the compare_exchange_weak() call failed spuriously.

Why is !expected there in the loop condition? Is it there to prevent all threads from starving and making no progress for some time?

Edit: (one last question)

On platforms where no single hardware CAS instruction exists, both the weak and the strong version are implemented using LL/SC (as on ARM, PowerPC, etc.). So is there any difference between the following two loops? If so, why? (To me, they should have similar performance.)

// use LL/SC (or CAS on x86) and ignore/loop on spurious failures
while (!compare_exchange_weak(..))
{ .. }

// use LL/SC (or CAS on x86) and ignore/loop on spurious failures
while (!compare_exchange_strong(..)) 
{ .. }

I came up with this last question because you all mention that there may be a performance difference inside a loop. It's also mentioned by the C++11 Standard (ISO/IEC 14882):

When a compare-and-exchange is in a loop, the weak version will yield better performance on some platforms.

But as analyzed above, two versions in a loop should give the same/similar performance. What's the thing I miss?

Accepted answer by Eric Z

I'm trying to answer this myself, after going through various online resources (e.g., this one and this one), the C++11 Standard, as well as the answers given here.

The related questions are merged (e.g., "why !expected ?" is merged with "why put compare_exchange_weak() in a loop ?") and answers are given accordingly.

Why does compare_exchange_weak() have to be in a loop in nearly all uses?

Typical Pattern A

You need to achieve an atomic update based on the value in the atomic variable. A failure indicates that the variable was not updated with our desired value, and we want to retry. Note that we don't really care whether it fails due to a concurrent write or a spurious failure. But we do care that it is us who make this change.

expected = current.load();
do desired = function(expected);
while (!current.compare_exchange_weak(expected, desired));

A real-world example is several threads adding an element to a singly linked list concurrently. Each thread first loads the head pointer, allocates a new node, and makes the new node point to the current head. Finally, it tries to swap in the new node as the head.
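
The linked-list case can be sketched as follows. This is a minimal illustration under stated assumptions, not a complete lock-free list: the Node type and the push_front function are hypothetical names, only insertion at the front is shown, and the nodes are never freed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type for a singly linked list of ints.
struct Node {
    int value;
    Node* next;
};

// Pushes a new node at the front of the list (Pattern A).
void push_front(std::atomic<Node*>& head, int value) {
    // Link the new node to the head as currently observed.
    Node* node = new Node{value, head.load()};
    // On failure, compare_exchange_weak stores the current head into
    // node->next, so the next attempt links against the fresh head.
    while (!head.compare_exchange_weak(node->next, node)) {
        // Retry: the failure was either a concurrent push or spurious.
    }
}
```

The loop body is empty because the failed call already refreshes node->next, which is the only state that has to be recomputed before retrying.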

Another example is implementing a mutex using std::atomic<bool>. At most one thread can enter the critical section at a time, depending on which thread first sets current to true and exits the loop.
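
A minimal sketch of such a mutex is shown below, assuming a simple spinlock is acceptable. The class name SpinLock and its members are hypothetical, and the memory orders are one reasonable choice rather than the only one.

```cpp
#include <atomic>
#include <cassert>

// Minimal spinlock sketch; the class name and members are hypothetical.
class SpinLock {
public:
    void lock() {
        bool expected = false;
        // Loop until *this* thread flips the flag from false to true.
        while (!locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed)) {
            expected = false;  // a failed CAS may have overwritten expected
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
    bool is_locked() const { return locked_.load(std::memory_order_relaxed); }

private:
    std::atomic<bool> locked_{false};
};
```

Note that expected is reset to false on every iteration: the thread must keep trying until its own exchange succeeds, whether the previous failure was spurious or caused by another thread holding the lock.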

Typical Pattern B

This is actually the pattern mentioned in Anthony's book. In contrast to pattern A, you want the atomic variable to be updated once, but you don't care who does it. As long as it's not updated, you try again. This is typically used with boolean variables. E.g., you need to implement a trigger for a state machine to move on. Which thread pulls the trigger does not matter.

expected = false;
// !expected: if expected is set to true by another thread, it's done!
// Otherwise, it fails spuriously and we should try again.
while (!current.compare_exchange_weak(expected, true) && !expected);

Note that we generally cannot use this pattern to implement a mutex; otherwise, multiple threads may be inside the critical section at the same time.
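
Pattern B can be wrapped in a small helper to make the two exit conditions explicit. This is only a sketch; the function name pull_trigger and its return convention are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>

// Ensures `flag` becomes true (Pattern B).
// Returns true if this call performed the change, false if the flag
// had already been set by someone else.
bool pull_trigger(std::atomic<bool>& flag) {
    bool expected = false;
    while (!flag.compare_exchange_weak(expected, true) && !expected) {
        // expected is still false: the failure was spurious, try again.
    }
    // expected == false: our exchange succeeded.
    // expected == true:  the flag was already true when we looked.
    return !expected;
}
```

Either way the flag is true when the function returns, which is all Pattern B requires. The return value is extra information that a caller may ignore.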

That said, it should be rare to use compare_exchange_weak() outside a loop. Conversely, there are cases where the strong version is the right choice. E.g.,

bool criticalSection_tryEnter(std::atomic<bool>& lock)
{
  // Single attempt: true only if this thread changed lock from false to true.
  bool flag = false;
  return lock.compare_exchange_strong(flag, true);
}

compare_exchange_weak is not appropriate here because when it returns false due to a spurious failure, it is likely that no one occupies the critical section yet.

Starving Thread?

One point worth mentioning: what happens if spurious failures keep occurring and thus starve the thread? Theoretically, this could happen on platforms where compare_exchange_XXX() is implemented as a sequence of instructions (e.g., LL/SC). Frequent access to the same cache line between LL and SC will produce continuous spurious failures. A more realistic example is a dumb schedule where all concurrent threads are interleaved in the following way.

Time
 |  thread 1 (LL)
 |  thread 2 (LL)
 |  thread 1 (compare, SC), fails spuriously due to thread 2's LL
 |  thread 1 (LL)
 |  thread 2 (compare, SC), fails spuriously due to thread 1's LL
 |  thread 2 (LL)
 v  ..

Can it happen?

It won't happen forever, fortunately, thanks to what C++11 requires:

Implementations should ensure that weak compare-and-exchange operations do not consistently return false unless either the atomic object has value different from expected or there are concurrent modifications to the atomic object.

Why do we bother use compare_exchange_weak() and write the loop ourselves? We can just use compare_exchange_strong().

It depends.

Case 1: When both need to be used inside a loop. C++11 says:

When a compare-and-exchange is in a loop, the weak version will yield better performance on some platforms.

On x86 (at least currently; maybe it will resort to a scheme similar to LL/SC one day for performance when more cores are introduced), the weak and strong versions are essentially the same because they both boil down to the single instruction cmpxchg. On some other platforms where compare_exchange_XXX() isn't implemented atomically (here meaning no single hardware primitive exists), the weak version inside the loop may win the battle because the strong one has to handle the spurious failures and retry accordingly.

But, rarely, we may prefer compare_exchange_strong() over compare_exchange_weak() even in a loop, e.g., when there is a lot to do between loading the atomic variable and exchanging in the newly calculated value (see function() above). If the atomic variable itself doesn't change frequently, we don't need to repeat the costly calculation for every spurious failure. Instead, we may hope that compare_exchange_strong() "absorbs" such failures, so that we only repeat the calculation when it fails due to a real value change.
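
A sketch of this rare case follows. The function expensive_transform is a hypothetical stand-in for a costly function(), and update_with_strong is likewise an illustrative name. Whether the strong form actually pays off depends on the platform and on how costly the computation is, so treat this as an illustration rather than a recommendation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a costly computation on the old value.
long expensive_transform(long x) {
    long r = x;
    for (int i = 0; i < 1000; ++i) {
        r = (r * 31 + 7) % 1000003;
    }
    return r;
}

// Pattern A with the strong form: the loop body reruns only when the
// stored value really differed from `expected`, never on a spurious failure.
long update_with_strong(std::atomic<long>& current) {
    long expected = current.load();
    long desired;
    do {
        desired = expensive_transform(expected);
    } while (!current.compare_exchange_strong(expected, desired));
    return desired;
}
```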

Case 2: When only compare_exchange_weak() needs to be used inside a loop. C++11 also says:

When a weak compare-and-exchange would require a loop and a strong one would not, the strong one is preferable.

This is typically the case when you loop just to eliminate spurious failures from the weak version. You retry until the exchange either succeeds or fails because of a concurrent write.

expected = false;
// !expected: if it fails spuriously, we should try again.
while (!current.compare_exchange_weak(expected, true) && !expected);

At best, it's reinventing the wheel and performs the same as compare_exchange_strong(). At worst, this approach fails to take full advantage of machines that provide non-spurious compare-and-exchange in hardware.

Last, if you loop for other reasons (e.g., see "Typical Pattern A" above), then there is a good chance that compare_exchange_strong() would also have to be put in a loop, which brings us back to the previous case.

Answer by gexicide

Why do the exchange in a loop?

Usually, you want your work to be done before you move on, so you put compare_exchange_weak into a loop so that it tries to exchange until it succeeds (i.e., returns true).

Note that compare_exchange_strong is also often used in a loop. It does not fail spuriously, but it does fail due to concurrent writes.

Why use weak instead of strong?

Quite easy: spurious failure does not happen often, so it is no big performance hit. In contrast, tolerating such a failure allows for a much more efficient implementation of the weak version (in comparison to strong) on some platforms: strong must always check for spurious failure and mask it. This is expensive.

Thus, weak is used because it is a lot faster than strong on some platforms.

When should you use weak and when strong?

The reference gives hints on when to use weak and when to use strong:

When a compare-and-exchange is in a loop, the weak version will yield better performance on some platforms. When a weak compare-and-exchange would require a loop and a strong one would not, the strong one is preferable.

So the answer seems to be quite simple to remember: If you would have to introduce a loop only because of spurious failure, don't do it; use strong. If you have a loop anyway, then use weak.

Why is !expected in the example?

It depends on the situation and its desired semantics, but usually it is not needed for correctness. Omitting it would yield very similar semantics. Only in a case where another thread might reset the value to false could the semantics become slightly different (yet I cannot find a meaningful example where you would want that). See Tony D.'s comment for a detailed explanation.

It is simply a fast track when another thread writes true: then we abort instead of trying to write true again.

About your last question

But as analyzed above, two versions in a loop should give the same/similar performance. What's the thing I miss?

From Wikipedia:

Real implementations of LL/SC do not always succeed if there are no concurrent updates to the memory location in question. Any exceptional events between the two operations, such as a context switch, another load-link, or even (on many platforms) another load or store operation, will cause the store-conditional to spuriously fail. Older implementations will fail if there are any updates broadcast over the memory bus.

So, LL/SC will fail spuriously on a context switch, for example. Now, the strong version would bring its own "small loop" to detect that spurious failure and mask it by trying again. Note that this internal loop is also more complicated than a usual CAS loop, since it must distinguish between spurious failure (and mask it) and failure due to concurrent access (which results in a return with value false). The weak version does not have such an internal loop.
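
The "small loop" can be pictured with the following conceptual sketch, which emulates strong semantics on top of the weak form. The function name strong_from_weak is hypothetical, and real standard libraries implement the strong form in whatever way suits the target platform, not necessarily like this.

```cpp
#include <atomic>
#include <cassert>

// Conceptual sketch only: emulates strong semantics using the weak form.
bool strong_from_weak(std::atomic<int>& obj, int& expected, int desired) {
    const int wanted = expected;
    while (!obj.compare_exchange_weak(expected, desired)) {
        if (expected != wanted) {
            return false;  // genuine failure: the stored value really differs
        }
        // Otherwise the observed value still equals `wanted`, so the
        // failure was spurious and is masked by retrying.
    }
    return true;
}
```

The extra comparison inside the loop is the work the weak form skips. When the caller already loops, that comparison and the inner retry are redundant.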

Since you provide an explicit loop in both examples, the small loop inside the strong version is simply not necessary. Consequently, in the example with the strong version, the check for failure is done twice: once by compare_exchange_strong (which is more complicated since it must distinguish spurious failure from concurrent access) and once by your loop. This expensive check is unnecessary and is the reason why weak will be faster here.

Also note that your argument (LL/SC) is just one possibility for implementing this. There are other platforms with different instruction sets. In addition (and more importantly), note that std::atomic must support all operations for all possible (trivially copyable) data types, so even if you declare a ten-million-byte struct, you can use compare_exchange on it. Even on a CPU that does have CAS, you cannot CAS ten million bytes, so the compiler will generate other instructions (probably a lock acquire, followed by a non-atomic compare and swap, followed by a lock release). Now, think of how many things can happen while swapping ten million bytes. So while a spurious failure may be very rare for 8-byte exchanges, it might be more common in this case.

So in a nutshell, C++ gives you two semantics, a "best effort" one (weak) and an "I will do it for sure, no matter how many bad things might happen in between" one (strong). How these are implemented on various data types and platforms is a totally different topic. Don't tie your mental model to the implementation on your specific platform; the standard library is designed to work with more architectures than you might be aware of. The only general conclusion we can draw is that guaranteeing success is usually more difficult (and thus may require additional work) than just trying and leaving room for possible failure.

Answer by Jonathan Wakely

Why does it have to be in a loop in nearly all uses?

Because if you don't loop and it fails spuriously, your program hasn't done anything useful - you didn't update the atomic object and you don't know what its current value is (Correction: see comment below from Cameron). If the call doesn't do anything useful, what's the point of doing it?

Does that mean we shall loop when it fails because of spurious failures?

Yes.

If that's the case, why do we bother use compare_exchange_weak() and write the loop ourselves? We can just use compare_exchange_strong(), which I think should get rid of spurious failures for us. What are the common use cases of compare_exchange_weak()?

On some architectures compare_exchange_weak is more efficient, and spurious failures should be fairly uncommon, so it might be possible to write more efficient algorithms using the weak form and a loop.

In general it is probably better to use the strong version instead if your algorithm doesn't need to loop, as you don't need to worry about spurious failures. If it needs to loop anyway even for the strong version (and many algorithms do need to loop anyway), then using the weak form might be more efficient on some platforms.

Why is !expected there in the loop condition?

The value could have been set to true by another thread, so you don't want to keep looping trying to set it.

Edit:

But as analyzed above, two versions in a loop should give the same/similar performance. What's the thing I miss?

Surely it's obvious that on platforms where spurious failure is possible, the implementation of compare_exchange_strong has to be more complicated, to check for spurious failure and retry.

The weak form just returns on spurious failure; it doesn't retry.

Answer by Sneftel

Alright, so I need a function which performs atomic left-shifting. My processor doesn't have a native operation for this, and the standard library doesn't have a function for it, so it looks like I'm writing my own. Here goes:

void atomicLeftShift(std::atomic<int>* var, int shiftBy)
{
    // Load once; a failed CAS refreshes oldVal with the current value.
    int oldVal = std::atomic_load(var);
    int newVal;
    do {
        newVal = oldVal << shiftBy;
    } while (!std::atomic_compare_exchange_weak(var, &oldVal, newVal));
}

Now, there are two reasons that loop might be executed more than once.

  1. Someone else changed the variable while I was doing my left shift. The results of my computation should not be applied to the atomic variable, because it would effectively erase that someone else's write.
  2. My CPU burped and the weak CAS spuriously failed.

I honestly don't care which one. Left shifting is fast enough that I may as well just do it again, even if the failure was spurious.

What's less fast, though, is the extra code that strong CAS needs to wrap around weak CAS in order to be strong. That code doesn't do much when the weak CAS succeeds... but when it fails, strong CAS needs to do some detective work to determine whether it was Case 1 or Case 2. That detective work takes the form of a second loop, effectively inside my own loop. Two nested loops. Imagine your algorithms teacher glaring at you right now.

And as I previously mentioned, I don't care about the result of that detective work! Either way I'm going to be redoing the CAS. So using strong CAS gains me precisely nothing, and loses me a small but measurable amount of efficiency.

In other words, weak CAS is used to implement atomic update operations. Strong CAS is used when you care about the result of the CAS itself.
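
As one more illustration of an atomic update operation built on weak CAS, here is a sketch of an atomic "raise to at least" update. The function name update_max is hypothetical and the default sequentially consistent ordering is assumed.

```cpp
#include <atomic>
#include <cassert>

// Atomically replaces `target` with max(target, candidate).
// Returns the value observed before any update by this call.
int update_max(std::atomic<int>& target, int candidate) {
    int old = target.load();
    // Stop as soon as the stored value is already large enough; otherwise
    // retry on any failure, spurious or not, since `old` is refreshed.
    while (old < candidate &&
           !target.compare_exchange_weak(old, candidate)) {
    }
    return old;
}
```

As in the left-shift example, the loop does not need to know why an attempt failed, so the weak form is sufficient.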

Answer by Damir Shaikhutdinov

I think most of the answers above treat "spurious failure" as some kind of problem, a performance vs. correctness tradeoff.

It can be seen this way: the weak version is faster most of the time, but in case of spurious failure it becomes slower. And the strong version has no possibility of spurious failure, but it is almost always slower.

For me, the main difference is how these two versions handle the ABA problem:

The weak version will succeed only if no one has touched the cache line between the load and the store, so it will detect the ABA problem 100% of the time.

The strong version will fail only if the comparison fails, so it will not detect the ABA problem without extra measures.

So, in theory, if you use the weak version on a weakly ordered architecture, you don't need an ABA detection mechanism and the implementation will be much simpler, giving better performance.

But on x86 (a strongly ordered architecture), the weak version and the strong version are the same, and they both suffer from the ABA problem.

So if you write a completely cross-platform algorithm, you need to address the ABA problem anyway, so there is no performance benefit from using the weak version, but there is a performance penalty for handling spurious failures.

In conclusion - for portability and performance reasons, the strong version is always a better-or-equal option.

The weak version can only be a better option if it lets you skip ABA countermeasures completely or if your algorithm doesn't care about ABA.