
Disclaimer: This page is a Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/26583433/


C++11 Implementation of Spinlock using <atomic>

Tags: c++, multithreading, c++11

Asked by syko

I implemented a SpinLock class as follows:


#include <atomic>

struct Node {
    int number;
    std::atomic_bool latch;

    void add() {
        lock();
        number++;
        unlock();
    }
    void lock() {
        bool unlatched = false;
        while(!latch.compare_exchange_weak(unlatched, true, std::memory_order_acquire));
    }
    void unlock() {
        latch.store(false , std::memory_order_release);
    }
};

I implemented the above class and spawned two threads, each calling the add() method of the same Node instance 10 million times.


The result is, unfortunately, not 20 million. What am I missing here?


Answered by gexicide

The problem is that compare_exchange_weak updates the unlatched variable once it fails. From the documentation of compare_exchange_weak:


Compares the contents of the atomic object's contained value with expected:
- if true, it replaces the contained value with val (like store).
- if false, it replaces expected with the contained value.


I.e., after the first failing compare_exchange_weak, unlatched will be updated to true, so the next loop iteration will try to compare_exchange_weak true with true. This succeeds and you just took a lock that was held by another thread.


Solution: Make sure to set unlatched back to false before each compare_exchange_weak, e.g.:


while(!latch.compare_exchange_weak(unlatched, true, std::memory_order_acquire)) {
    unlatched = false;
}

Answered by MikeMB

As mentioned by @gexicide, the problem is that the compare_exchange functions update the expected variable with the current value of the atomic variable. That is also the reason why you have to use the local variable unlatched in the first place. To solve this, you can set unlatched back to false in each loop iteration.


However, instead of using compare_exchange for something its interface is rather ill suited for, it is much simpler to use std::atomic_flag instead:


class SpinLock {
    std::atomic_flag locked = ATOMIC_FLAG_INIT;
public:
    void lock() {
        while (locked.test_and_set(std::memory_order_acquire)) { ; }
    }
    void unlock() {
        locked.clear(std::memory_order_release);
    }
};

Source: cppreference


Manually specifying the memory order is just a minor potential performance tweak, which I copied from the source. If simplicity is more important than the last bit of performance, you can stick to the default values and just call locked.test_and_set() / locked.clear().


Btw.: std::atomic_flag is the only type that is guaranteed to be lock-free, although I don't know of any platform where operations on std::atomic_bool are not lock-free.


Update: As explained in the comments by @David Schwartz, @Anton and @Technik Empire, the empty loop has some undesirable effects like branch misprediction, thread starvation on HT processors, and overly high power consumption - so in short, it is a pretty inefficient way to wait. The impact and solution are architecture-, platform- and application-specific. I'm no expert, but the usual solution seems to be to add either a cpu_relax() on Linux or YieldProcessor() on Windows to the loop body.


EDIT2: Just to be clear: the portable version presented here (without the special cpu_relax etc. instructions) should already be good enough for many applications. If your SpinLock spins a lot because another thread is holding the lock for a long time (which might already indicate a general design problem), it is probably better to use a normal mutex anyway.
