multithreading - Which is more efficient, a basic mutex lock or an atomic integer?
Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/15056237/
Which is more efficient, basic mutex lock or atomic integer?
Asked by Matt
For something simple like a counter that multiple threads will be incrementing, I read that mutex locks can decrease efficiency since the threads have to wait. So, to me, an atomic counter would be the most efficient, but I read that internally it is basically a lock? So I guess I'm confused how either could be more efficient than the other.
Answered by Cort Ammon
If you have a counter for which atomic operations are supported, it will be more efficient than a mutex.
Technically, the atomic will lock the memory bus on most platforms. However, there are two ameliorating details:
- It is impossible to suspend a thread during the memory bus lock, but it is possible to suspend a thread during a mutex lock. This is what lets you get a lock-free guarantee (which doesn't say anything about not locking - it just guarantees that at least one thread makes progress).
- Mutexes eventually end up being implemented with atomics. Since you need at least one atomic operation to lock a mutex, and one atomic operation to unlock a mutex, it takes at least twice as long to do a mutex lock, even in the best of cases.
Answered by yahe
Atomic operations leverage processor support (compare and swap instructions) and don't use locks at all, whereas locks are more OS-dependent and perform differently on, for example, Win and Linux.
Locks actually suspend thread execution, freeing up CPU resources for other tasks, but incurring obvious context-switching overhead when stopping/restarting the thread. On the contrary, threads attempting atomic operations don't wait; they keep trying until they succeed (so-called busy-waiting), so they don't incur context-switching overhead, but neither do they free up CPU resources.
Summing up: in general, atomic operations are faster if contention between threads is sufficiently low. You should definitely do benchmarking, as there is no other reliable way of knowing which has the lower overhead, context switching or busy-waiting.
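For example, here is a rough benchmarking sketch (my illustration, not part of the original answer; the thread and iteration counts are arbitrary assumptions) that contrasts a mutex-protected counter with a std::atomic counter:

#include <atomic>
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

// Run `work` on `threads` threads concurrently and return the elapsed milliseconds.
template <class F>
long long time_ms(int threads, F work) {
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) pool.emplace_back(work);
    for (auto& t : pool) t.join();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

int main() {
    constexpr int kThreads = 4;
    constexpr int kIters = 1000000;

    long mutex_counter = 0;
    std::mutex m;
    auto mutex_ms = time_ms(kThreads, [&] {
        for (int i = 0; i < kIters; ++i) {
            std::lock_guard<std::mutex> lock(m);   // may suspend the thread under contention
            ++mutex_counter;
        }
    });

    std::atomic<long> atomic_counter{0};
    auto atomic_ms = time_ms(kThreads, [&] {
        for (int i = 0; i < kIters; ++i)
            atomic_counter.fetch_add(1, std::memory_order_relaxed);  // lock-free read-modify-write
    });

    std::cout << "mutex:  " << mutex_ms << " ms\n"
              << "atomic: " << atomic_ms << " ms\n";
}

The results vary a lot with the number of threads and the hardware, which is exactly why measuring on your own target matters.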
Answered by LWimsey
A minimal (standards-compliant) mutex implementation requires two basic ingredients:
- A way to atomically convey a state change between threads (the 'locked' state)
- Memory barriers to ensure that memory operations protected by the mutex stay inside the protected area
There is no way you can make it any simpler than this because of the 'synchronizes-with' relationship the C++ standard requires.
A minimal (correct) implementation might look like this:
#include <atomic>

class mutex {
    std::atomic<bool> flag{false};
public:
    void lock()
    {
        // spin until this thread is the one that flips the flag from false to true
        while (flag.exchange(true, std::memory_order_relaxed));
        // acquire fence: keeps operations in the critical section from moving above the lock
        std::atomic_thread_fence(std::memory_order_acquire);
    }
    void unlock()
    {
        // release fence: keeps operations in the critical section from moving below the unlock
        std::atomic_thread_fence(std::memory_order_release);
        flag.store(false, std::memory_order_relaxed);
    }
};
Due to its simplicity (it cannot suspend the thread of execution), it is likely that, under low contention, this implementation outperforms a std::mutex.
But even then, it is easy to see that each integer increment, protected by this mutex, requires the following operations:
- an atomic store to release the mutex
- an atomic compare-and-swap (read-modify-write) to acquire the mutex (possibly multiple times)
- an integer increment
If you compare that with a standalone std::atomic<int> that is incremented with a single (unconditional) read-modify-write (e.g. fetch_add), it is reasonable to expect that an atomic operation (using the same ordering model) will outperform the case where a mutex is used.
Answered by RonTLV
An atomic integer is a user-mode object, therefore it is much more efficient than a mutex, which runs in kernel mode. The scope of an atomic integer is a single application, while the scope of a mutex is all the running software on the machine.
Answered by Ajay
The atomic variable classes in Java are able to take advantage of compare-and-swap instructions provided by the processor.
Here's a detailed description of the differences: http://www.ibm.com/developerworks/library/j-jtp11234/
Answered by Gem Taylor
Most processors have supported an atomic read or write, and often an atomic cmp&swap. This means that the processor itself writes or reads the latest value in a single operation, and there might be a few cycles lost compared to a normal integer access, especially as the compiler can't optimise around atomic operations nearly as well as around normal ones.
On the other hand, a mutex is a number of lines of code to enter and leave, and during that execution other processors that access the same location are totally stalled, so clearly there is a big overhead on them. In unoptimised high-level code, the mutex enter/exit and the atomic operation will be function calls, but for the mutex, any competing processor will be locked out from the time your mutex-enter function returns until your exit function is started. For the atomic, only the duration of the actual operation is locked out. Optimisation should reduce that cost, but not all of it.
If you are trying to increment, then your modern processor probably supports atomic increment/decrement, which will be great.
If it does not, then it is either implemented using the processor atomic cmp&swap, or using a mutex.
Mutex:
get the lock
read
increment
write
release the lock
Atomic cmp&swap:
atomic read the value
calc the increment
do{
atomic cmpswap value, increment
recalc the increment
}while the cmp&swap did not see the expected value
So this second version has a loop [in case another processor increments the value between our atomic operations, so the value no longer matches and the increment would be wrong] that can get long [if there are many competitors], but it should generally still be quicker than the mutex version; the mutex version, however, may allow that processor to task-switch.
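In C++ terms, the cmp&swap loop above corresponds roughly to the following sketch (my illustration, not from the answer):

#include <atomic>

std::atomic<int> value{0};

void increment()
{
    int expected = value.load();        // atomic read of the current value
    int desired  = expected + 1;        // calculate the incremented value
    // retry while another thread changed the value in between;
    // on failure, compare_exchange_weak reloads `expected` with the current value
    while (!value.compare_exchange_weak(expected, desired)) {
        desired = expected + 1;         // recalculate the increment
    }
}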
Answered by Sunil Singhal
Mutex is a kernel-level semantic which provides mutual exclusion even at the process level. Note that it can be helpful in extending mutual exclusion across process boundaries and not just within a process (for threads). It is costlier.
An atomic counter, AtomicInteger for example, is based on CAS and usually keeps attempting the operation until it succeeds. Basically, in this case, the threads race or compete to increment/decrement the value atomically. Here, you may see a good number of CPU cycles being used by a thread trying to operate on the current value.
Since you want to maintain the counter, AtomicInteger/AtomicLong will be the best fit for your use case.