Understanding C++11 memory fences

Note: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/13632344/

Time: 2020-08-27 17:33:36  Source: igfitidea

Tags: c++, c++11, atomic

Asked by jcoder

I'm trying to understand memory fences in C++11. I know there are better ways to do this (atomic variables and so on), but I wondered whether this usage is correct. I realize that this program doesn't do anything useful; I just want to make sure that the fence functions do what I think they do.

Basically, my understanding is that the release fence ensures that any changes made in this thread before the fence are visible to other threads after the fence, and that the acquire fence in the second thread ensures that any changes to the variables are visible in that thread immediately after the fence.

Is my understanding correct? Or have I missed the point entirely?

#include <iostream>
#include <atomic>
#include <thread>

int a;

void func1()
{
    for(int i = 0; i < 1000000; ++i)
    {
        a = i;
        // Ensure that changes to a to this point are visible to other threads
        atomic_thread_fence(std::memory_order_release);
    }
}

void func2()
{
    for(int i = 0; i < 1000000; ++i)
    {
        // Ensure that this thread's view of a is up to date
        atomic_thread_fence(std::memory_order_acquire);
        std::cout << a;
    }
}

int main()
{
    std::thread t1 (func1);
    std::thread t2 (func2);

    t1.join(); t2.join();
}

Answered by bames53

Your usage does not actually ensure the things you mention in your comments. That is, your usage of fences does not ensure that your assignments to a are visible to other threads or that the value you read from a is 'up to date.' This is because, although you seem to have the basic idea of where fences should be used, your code does not actually meet the exact requirements for those fences to "synchronize".

Here's a different example that I think demonstrates correct usage better.

#include <iostream>
#include <atomic>
#include <thread>

std::atomic<bool> flag(false);
int a;

void func1()
{
    a = 100;
    atomic_thread_fence(std::memory_order_release);
    flag.store(true, std::memory_order_relaxed);
}

void func2()
{
    while(!flag.load(std::memory_order_relaxed))
        ;

    atomic_thread_fence(std::memory_order_acquire);
    std::cout << a << '\n'; // guaranteed to print 100
}

int main()
{
    std::thread t1 (func1);
    std::thread t2 (func2);

    t1.join(); t2.join();
}

The load and store on the atomic flag do not synchronize by themselves, because they both use the relaxed memory ordering. Without the fences this code would have a data race: we're performing conflicting operations on a non-atomic object in different threads, and without the synchronization the fences provide there would be no happens-before relationship between the conflicting operations on a.

However, with the fences we do get synchronization, because we've guaranteed that thread 2 will read the flag written by thread 1 (we loop until we see that value). Since the atomic write is sequenced after the release fence and the atomic read is sequenced before the acquire fence, the fences synchronize. (See § 29.8/2 of the C++11 standard for the specific requirements.)

This synchronization means that anything that happens-before the release fence also happens-before anything that happens-after the acquire fence. Therefore the non-atomic write to a happens-before the non-atomic read of a.
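
For comparison, the same happens-before relation can be obtained without standalone fences by putting the ordering on the flag operations themselves. The following is only a minimal sketch, not part of the original answer: the function name publish_and_read and the local variable seen are made up for illustration, and the spin loop is kept deliberately simple.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not from the original answer): publishes a plain int
// through a release store and reads it back after an acquire load.
int publish_and_read()
{
    std::atomic<bool> flag(false);
    int a = 0;      // plain, non-atomic payload
    int seen = -1;  // what the reader thread observed

    std::thread writer([&] {
        a = 100;                                      // non-atomic write
        flag.store(true, std::memory_order_release);  // publish
    });
    std::thread reader([&] {
        // The acquire load that reads 'true' synchronizes with the release store.
        while (!flag.load(std::memory_order_acquire))
            ;
        seen = a;  // ordered after the write to a, so no data race
    });

    writer.join();
    reader.join();
    assert(seen == 100);
    return seen;
}
```

Calling publish_and_read() from main should return 100 on a conforming implementation. The fence version in the answer gives the same ordering; the difference is that a fence applies to every relaxed atomic operation sequenced around it, while the release/acquire form ties the ordering to one particular store and load.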

Things get trickier when you're writing a variable in a loop, because you might establish a happens-before relation for some particular iteration, but not other iterations, causing a data race.

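#include <atomic>
#include <iostream>
#include <thread>

std::atomic<int> f(0);
int a;

void func1()
{
    for (int i = 0; i < 1000000; ++i) {
        a = i;
        std::atomic_thread_fence(std::memory_order_release);
        f.store(i, std::memory_order_relaxed);
    }
}

void func2()
{
    int prev_val = 0;
    // func1 never stores a value above 999999, so stop once it has been seen
    while (prev_val < 999999) {
        // Spin until f holds a value newer than the last one we saw
        while (true) {
            int new_val = f.load(std::memory_order_relaxed);
            if (prev_val < new_val) {
                prev_val = new_val;
                break;
            }
        }

        std::atomic_thread_fence(std::memory_order_acquire);
        std::cout << a << '\n'; // still races with later writes to a in func1
    }
}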

This code still causes the fences to synchronize but does not eliminate data races. For example, if f.load() happens to return 10 then we know that a=1, a=2, ... a=10 have all happened-before that particular cout<<a, but we don't know that cout<<a happens-before a=11. Those are conflicting operations on different threads with no happens-before relation; a data race.
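
One way to close that gap is to synchronize in both directions, so the writer does not touch a again until the reader has finished with the current value. The sketch below is an assumption-laden illustration rather than anything from the original answer: the function handoff and the counters produced and consumed are hypothetical names, the values are collected in a vector instead of printed, and the spin loops call std::this_thread::yield() only to be polite to the scheduler.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical two-way handshake (not from the original answer).
// Returns the values of 'a' observed by the reader, in order.
std::vector<int> handoff(int count)
{
    std::atomic<int> produced(0);  // how many values the writer has published
    std::atomic<int> consumed(0);  // how many values the reader has finished with
    int a = 0;                     // plain, non-atomic payload
    std::vector<int> seen;
    seen.reserve(count);

    std::thread writer([&] {
        for (int i = 0; i < count; ++i) {
            // Wait until the reader is done with the previous value.
            while (consumed.load(std::memory_order_relaxed) < i)
                std::this_thread::yield();
            std::atomic_thread_fence(std::memory_order_acquire);

            a = i;
            std::atomic_thread_fence(std::memory_order_release);
            produced.store(i + 1, std::memory_order_relaxed);
        }
    });
    std::thread reader([&] {
        for (int i = 0; i < count; ++i) {
            // Wait until value number i has been published.
            while (produced.load(std::memory_order_relaxed) < i + 1)
                std::this_thread::yield();
            std::atomic_thread_fence(std::memory_order_acquire);

            seen.push_back(a);  // read of a, ordered after the write a = i
            std::atomic_thread_fence(std::memory_order_release);
            consumed.store(i + 1, std::memory_order_relaxed);
        }
    });

    writer.join();
    reader.join();
    assert(static_cast<int>(seen.size()) == count);
    return seen;
}
```

Here each read of a happens-before the next write to a, because the writer only proceeds after observing the reader's store to consumed and then passing its own acquire fence. The cost is that the two threads run in lockstep; if the reader only needs some recent value rather than every value, making a itself atomic is usually simpler.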

Answered by David Schwartz

Your usage is correct, but insufficient to guarantee anything useful.

For example, the compiler is free to internally implement a = i; like this if it wants to:

 while(a != i)
 {
    ++a;
    atomic_thread_fence(std::memory_order_release);
 }

So the other thread may see any values at all.

Of course, the compiler would never implement a simple assignment like that. However, there are cases where similarly perplexing behavior is actually an optimization, so it's a very bad idea to rely on ordinary code being implemented internally in any particular way. This is why we have things like atomic operations, and why fences only produce guaranteed results when used with such operations.
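
As a rough illustration of that point, here is the question's program with a itself turned into a std::atomic<int>. This is a sketch under assumptions, not the answerer's code: the function name reads_are_monotonic is invented, and instead of printing, the reader checks a property that the atomics rules do guarantee, namely that one thread's successive reads of a single atomic object never go backwards in that object's modification order.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical variant of the question's program (not from the original
// answers): 'a' is atomic, so concurrent access is well-defined.
bool reads_are_monotonic(int count)
{
    std::atomic<int> a(0);
    bool monotonic = true;  // written only by the reader thread

    std::thread writer([&] {
        for (int i = 0; i < count; ++i)
            a.store(i, std::memory_order_relaxed);
    });
    std::thread reader([&] {
        int prev = 0;
        for (int i = 0; i < count; ++i) {
            // Each load returns some value actually stored to a; it may be
            // stale, but it cannot be older than what this thread saw before.
            int cur = a.load(std::memory_order_relaxed);
            if (cur < prev)
                monotonic = false;
            prev = cur;
        }
    });

    writer.join();
    reader.join();
    assert(monotonic);
    return monotonic;
}
```

Note what this does not promise: the reader is not guaranteed to see every value, or the most recent one at any given moment. It only rules out the torn or invented values that the non-atomic version permits.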
