Concurrency: Atomic and volatile in C++11 memory model
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/8819095/
Asked by Abhijit_K
A global variable is shared across 2 concurrently running threads on 2 different cores. The threads write to and read from the variable. For an atomic variable, can one thread read a stale value? Each core might have a value of the shared variable in its cache, and when one thread writes to its copy in a cache, the other thread on a different core might read a stale value from its own cache. Or does the compiler enforce strong memory ordering to read the latest value from the other cache? The C++11 standard library has std::atomic support. How is this different from the volatile keyword? How will volatile and atomic types behave differently in the above scenario?
Answered by Anthony Williams
Firstly, volatile does not imply atomic access. It is designed for things like memory mapped I/O and signal handling. volatile is completely unnecessary when used with std::atomic, and unless your platform documents otherwise, volatile has no bearing on atomic access or memory ordering between threads.
If you have a global variable which is shared between threads, such as:
std::atomic<int> ai;
then the visibility and ordering constraints depend on the memory ordering parameter you use for operations, and the synchronization effects of locks, threads and accesses to other atomic variables.
In the absence of any additional synchronization, if one thread writes a value to ai then there is nothing that guarantees that another thread will see the value in any given time period. The standard specifies that it should be visible "in a reasonable period of time", but any given access may return a stale value.
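As a rough illustration of the point above, here is a minimal sketch of a stop flag built on std::atomic<bool>. The function name run_until_stopped and the iteration counter are hypothetical, chosen only for this example. The worker eventually observes the store, but how many loop iterations it completes before it does is not specified.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns how many times the worker looped
// before it observed the stop request.
long run_until_stopped() {
    std::atomic<bool> stop{false};
    long iterations = 0;  // touched only by the worker until join()

    std::thread worker([&] {
        // Each load may still return the old value (false) for a while;
        // the store only has to become visible "in a reasonable period of time".
        while (!stop.load()) {
            ++iterations;
        }
    });

    stop.store(true);  // default ordering: std::memory_order_seq_cst
    worker.join();     // join() synchronizes, so reading iterations is safe now

    assert(stop.load());
    return iterations;
}
```

The returned count varies from run to run, which is the point: nothing says when a particular load starts returning the new value.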
The default memory ordering of std::memory_order_seq_cst provides a single global total order for all std::memory_order_seq_cst operations across all variables. This doesn't mean that you can't get stale values, but it does mean that the value you do get determines and is determined by where in this total order your operation lies.
If you have 2 shared variables x and y, initially zero, and have one thread write 1 to x and another write 2 to y, then a third thread that reads both may see either (0,0), (1,0), (0,2) or (1,2) since there is no ordering constraint between the operations, and thus the operations may appear in any order in the global order.
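The scenario with two independent writers might be sketched as follows. The function name read_both is hypothetical, and the calling thread plays the role of the third (reading) thread. Any of the four pairs can come back.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical helper: two writer threads, the caller acts as the reader.
// Returns the pair (value seen for x, value seen for y).
std::pair<int, int> read_both() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};

    std::thread writer_x([&] { x.store(1); });
    std::thread writer_y([&] { y.store(2); });

    // No ordering constraint relates the two stores to these loads,
    // so (0,0), (1,0), (0,2) and (1,2) are all permitted results.
    int seen_x = x.load();
    int seen_y = y.load();

    writer_x.join();
    writer_y.join();

    assert(seen_x == 0 || seen_x == 1);
    assert(seen_y == 0 || seen_y == 2);
    return {seen_x, seen_y};
}
```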
If both writes are from the same thread, which does x=1 before y=2, and the reading thread reads y before x, then (0,2) is no longer a valid option, since the read of y==2 implies that the earlier write to x is visible. The other 3 pairings (0,0), (1,0) and (1,2) are still possible, depending on how the 2 reads interleave with the 2 writes.
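The single-writer case can be sketched like this, assuming the default std::memory_order_seq_cst on every operation. The function name observe_once is hypothetical. The reader loads y first and x second, so seeing y==2 together with x==0 would contradict the total order.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical helper: one writer does x=1 then y=2; the caller reads
// y before x. Returns the pair (value seen for x, value seen for y).
std::pair<int, int> observe_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};

    std::thread writer([&] {
        x.store(1);  // seq_cst by default
        y.store(2);  // ordered after the store to x
    });

    int seen_y = y.load();  // read y first...
    int seen_x = x.load();  // ...then x

    writer.join();

    // If the load of y returned 2, the earlier store to x must be visible.
    assert(!(seen_x == 0 && seen_y == 2));
    return {seen_x, seen_y};
}
```

Calling observe_once() repeatedly may produce (0,0), (1,0) or (1,2), but never (0,2).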
If you use other memory orderings such as std::memory_order_relaxed or std::memory_order_acquire then the constraints are relaxed even further, and the single global ordering no longer applies. Threads don't even necessarily have to agree on the ordering of two stores to separate variables if there is no additional synchronization.
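With the weaker orderings, any guarantee has to come from an explicit pairing. One common pattern, shown here only as a sketch, pairs a release store with an acquire load to publish plain data. The function name publish_and_read and the payload value 42 are made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: the producer publishes a plain int through a flag.
int publish_and_read() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};

    std::thread producer([&] {
        payload = 42;                                  // written before the flag
        ready.store(true, std::memory_order_release);  // publish
    });

    // Spin until the acquire load sees the release store. Individual loads
    // may return the old value; only the load that sees true synchronizes.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = payload;  // safe: ordered after the producer's write

    producer.join();
    assert(seen == 42);
    return seen;
}
```

Had both operations on ready used std::memory_order_relaxed, the read of payload would be a data race, because nothing would order it after the producer's write.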
The only way to guarantee you have the "latest" value is to use a read-modify-write operation such as exchange(), compare_exchange_strong() or fetch_add(). Read-modify-write operations have an additional constraint that they always operate on the "latest" value, so a sequence of ai.fetch_add(1) operations by a series of threads will return a sequence of values with no duplicates or gaps. In the absence of additional constraints, there's still no guarantee which threads will see which values though.
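The "no duplicates or gaps" property of fetch_add can be checked with a small sketch like the one below. The function name collect_tickets and its parameters are hypothetical. Each thread records the values it received, and the combined, sorted list should be exactly 0, 1, 2, and so on.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: every thread calls ai.fetch_add(1) per_thread times
// and records what it got back. Returns all values, sorted.
std::vector<int> collect_tickets(int num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread >= 0);

    std::atomic<int> ai{0};
    std::vector<std::vector<int>> seen(num_threads);  // one slot per thread
    std::vector<std::thread> threads;

    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&, t] {
            for (int i = 0; i < per_thread; ++i) {
                // Read-modify-write: always acts on the latest value.
                seen[t].push_back(ai.fetch_add(1));
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }

    std::vector<int> all;
    for (const auto& v : seen) {
        all.insert(all.end(), v.begin(), v.end());
    }
    std::sort(all.begin(), all.end());
    return all;
}
```

Which thread ends up holding which value differs between runs; only the combined set is fixed.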
Working with atomic operations is a complex topic. I suggest you read a lot of background material, and examine published code before writing production code with atomics. In most cases it is easier to write code that uses locks, and not noticeably less efficient.
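For comparison, a lock-based counter might look like the sketch below. The function name count_with_lock is hypothetical. The mutex supplies both mutual exclusion and the required memory ordering, so no memory-order arguments appear anywhere.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: a plain int protected by a std::mutex.
int count_with_lock(int num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread >= 0);

    int counter = 0;  // ordinary variable, guarded by m
    std::mutex m;
    std::vector<std::thread> threads;

    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                std::lock_guard<std::mutex> guard(m);  // unlocks at scope exit
                ++counter;
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    return counter;
}
```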
Answered by James Kanze
volatile and the atomic operations have different backgrounds, and were introduced with different intents.
volatile dates from way back, and is principally designed to prevent compiler optimizations when accessing memory mapped IO. Modern compilers tend to do no more than suppress optimizations for volatile, although on some machines, this isn't sufficient for even memory mapped IO. Except for the special case of signal handlers, and setjmp and longjmp sequences (where the C standard, and in the case of signals, the Posix standard, gives additional guarantees), it must be considered useless on a modern machine, where without special additional instructions (fences or memory barriers), the hardware may reorder or even suppress certain accesses. Since you shouldn't be using setjmp et al. in C++, this more or less leaves signal handlers, and in a multithreaded environment, at least under Unix, there are better solutions for those as well. And possibly memory mapped IO, if you're working on kernel code and can ensure that the compiler generates whatever is needed for the platform in question. (According to the standard, volatile access is observable behavior, which the compiler must respect. But the compiler gets to define what is meant by "access", and most seem to define it as "a load or store machine instruction was executed". Which, on a modern processor, doesn't even mean that there is necessarily a read or write cycle on the bus, much less that it's in the order you expect.)
Given this situation, the C++ standard added atomic access, which does provide a certain number of guarantees across threads; in particular, the code generated around an atomic access will contain the necessary additional instructions to prevent the hardware from reordering the accesses, and to ensure that the accesses propagate down to the global memory shared between cores on a multicore machine. (At one point in the standardization effort, Microsoft proposed adding these semantics to volatile, and I think some of their C++ compilers do. After discussion of the issues in the committee, however, the general consensus, including the Microsoft representative, was that it was better to leave volatile with its original meaning, and to define the atomic types.) Or just use the system level primitives, like mutexes, which execute whatever instructions are needed in their code. (They have to. You can't implement a mutex without some guarantees concerning the order of memory accesses.)
Answered by Zack Yezek
Here's a basic synopsis of what the 2 things are:
1) Volatile keyword:
Tells the compiler that this value could change at any moment and therefore it should not EVER cache it in a register. Look up the old "register" keyword in C. "Volatile" is basically the "-" operator to "register"'s "+". Modern compilers now do by default the optimization that "register" used to request explicitly, so these days you only see "volatile". Using the volatile qualifier guarantees that the compiler re-reads the variable on every access instead of reusing a register copy, but nothing more: it does not make the access atomic or order it with respect to other threads.
2) Atomic:
Atomic operations modify data indivisibly, so that it is impossible for ANY other thread to observe the data in the middle of such an update. They're usually implemented with whatever atomic instructions the hardware supports; things like ++, --, and swapping a pointer. Note that this says nothing about the ORDER in which the different threads will RUN the atomic instructions, only that the operations will never overlap. That's why you have all those additional options for forcing an ordering.
Answered by Karthik Balaguru
Volatile and Atomic serve different purposes.
Volatile: Informs the compiler to avoid optimization. This keyword is used for variables that may change unexpectedly. So, it can be used to represent hardware status registers, variables modified by an ISR, and variables shared in a multi-threaded application.
Atomic: It is also used in multi-threaded applications. However, it ensures that there is no lock/stall when used in a multi-threaded application. Atomic operations are free of races and indivisible. A few key usage scenarios are checking whether a lock is free or in use, atomically adding to a value and returning the result, etc. in a multi-threaded application.
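The "check whether a lock is free or in use" scenario can be illustrated with a minimal spinlock built on std::atomic_flag. This is a sketch only: the class name SpinLock and the function count_with_spinlock are hypothetical, and production code would normally prefer std::mutex.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock: test_and_set() atomically marks the flag
// as taken and reports whether it was already taken.
class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;

public:
    void lock() {
        // Loop while the previous value was "set", i.e. someone else holds it.
        while (flag.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() { flag.clear(std::memory_order_release); }
};

// Hypothetical helper: increments a plain int under the spinlock.
int count_with_spinlock(int num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread >= 0);

    SpinLock lock;
    int counter = 0;  // guarded by lock
    std::vector<std::thread> threads;

    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    return counter;
}
```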