Disclaimer: this page is based on a Chinese-English translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/34510/

Date: 2020-09-10 00:57:30  Source: igfitidea

What is a race condition?

Tags: multithreading, concurrency, terminology, race-condition

Asked by bmurphy1976

When writing multithreaded applications, race conditions are among the most common problems encountered.

My questions to the community are:

What is a race condition?
How do you detect them?
How do you handle them?
Finally, how do you prevent them from occurring?

Answered by Lehane

A race condition occurs when two or more threads can access shared data and they try to change it at the same time. Because the thread scheduling algorithm can swap between threads at any time, you don't know the order in which the threads will attempt to access the shared data. Therefore, the result of the change in data is dependent on the thread scheduling algorithm, i.e. both threads are "racing" to access/change the data.

Problems often occur when one thread does a "check-then-act" (e.g. "check" if the value is X, then "act" to do something that depends on the value being X) and another thread does something to the value in between the "check" and the "act". For example:

if (x == 5) // The "Check"
{
   y = x * 2; // The "Act"

   // If another thread changed x in between "if (x == 5)" and "y = x * 2" above,
   // y will not be equal to 10.
}

The point being, y could be 10, or it could be anything, depending on whether another thread changed x in between the check and act. You have no real way of knowing.

In order to prevent race conditions from occurring, you would typically put a lock around the shared data to ensure only one thread can access the data at a time. This would mean something like this:

// Obtain lock for x
if (x == 5)
{
   y = x * 2; // Now, nothing can change x until the lock is released. 
              // Therefore y = 10
}
// release lock for x
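
The locked check-then-act above could be sketched in Python roughly as follows. The names `state`, `state_lock`, `double_if_five`, and `set_x` are illustrative assumptions, not part of the original answer, and the guarantee only holds if every thread that writes `x` takes the same lock.

```python
import threading

# Hypothetical shared state; every access is expected to hold state_lock.
state = {"x": 5, "y": 0}
state_lock = threading.Lock()

def double_if_five():
    """Check and act under one lock, so x cannot change in between."""
    with state_lock:                       # obtain lock for x
        if state["x"] == 5:                # the "check"
            state["y"] = state["x"] * 2    # the "act": x is still 5 here
            return True
        return False
    # the lock is released automatically when the with-block is left

def set_x(value):
    """Writers must take the same lock, or the protection is void."""
    with state_lock:
        state["x"] = value
```

A lock only protects data by convention: a thread that assigns to `state["x"]` without going through `set_x` would bring the race back.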

Answered by privatehuff

A "race condition" exists when multithreaded (or otherwise parallel) code that would access a shared resource could do so in such a way as to cause unexpected results.

Take this example:

for ( int i = 0; i < 10000000; i++ )
{
   x = x + 1; 
}

If you had 5 threads executing this code at once, the value of x (starting from 0) would not reliably end up being 50,000,000. It would in fact typically vary with each run.

This is because, in order for each thread to increment the value of x, they have to do the following: (simplified, obviously)

Retrieve the value of x
Add 1 to this value
Store this value to x

Any thread can be at any step in this process at any time, and they can step on each other when a shared resource is involved. The state of x can be changed by another thread between the time x is read and the time it is written back.

Let's say a thread retrieves the value of x, but hasn't stored it yet. Another thread can also retrieve the same value of x (because no thread has changed it yet) and then they would both be storing the same value (x+1) back in x!

Example:

Thread 1: reads x, value is 7
Thread 1: adds 1 to its copy, which is now 8
Thread 2: reads x, value is still 7
Thread 1: stores 8 in x
Thread 2: adds 1 to its copy, which is now 8
Thread 2: stores 8 in x
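
The interleaving above can be replayed step by step in Python. This is only a single-threaded simulation of what a scheduler might do (the function name and local variables are made up for illustration), which keeps the result reproducible.

```python
def replay_lost_update(start=7):
    """Replay the interleaving by hand; no real threads are involved."""
    x = start
    t1_local = x      # Thread 1: reads x (7)
    t1_local += 1     # Thread 1: adds 1 to its own copy (8)
    t2_local = x      # Thread 2: reads x, which is still 7
    x = t1_local      # Thread 1: stores 8 in x
    t2_local += 1     # Thread 2: adds 1 to its own copy (8)
    x = t2_local      # Thread 2: stores 8 in x, overwriting Thread 1's update
    return x

print(replay_lost_update())  # prints 8: two increments ran, but x grew by only one
```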

Race conditions can be avoided by employing some sort of locking mechanism around the code that accesses the shared resource:

for ( int i = 0; i < 10000000; i++ )
{
   //lock x
   x = x + 1; 
   //unlock x
}

Here, the answer comes out as 50,000,000 every time.
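
A runnable Python sketch of the locked loop, using `threading.Lock` as the locking mechanism. The iteration count is scaled down from 10,000,000 to keep the run short, and the names `counter`, `counter_lock`, `add_many`, and `run` are assumptions made for this example.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(iterations):
    """Increment the shared counter, holding the lock for each update."""
    global counter
    for _ in range(iterations):
        with counter_lock:   # lock x
            counter += 1     # read, add, and store as one protected step
        # leaving the with-block unlocks x

def run(num_threads=5, iterations=100_000):
    """Start the threads, wait for all of them, and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(iterations,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

With the lock in place, `run()` should return `num_threads * iterations` on every run, at the cost of the threads taking turns on the increment.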

For more on locking, search for: mutex, semaphore, critical section, shared resource.

Answered by Vishal Shukla

What is a Race Condition?

You are planning to go to a movie at 5 pm. You inquire about the availability of the tickets at 4 pm. The representative says that they are available. You relax and reach the ticket window 5 minutes before the show. I'm sure you can guess what happens: it's a full house. The problem here was in the duration between the check and the action. You inquired at 4 and acted at 5. In the meantime, someone else grabbed the tickets. That's a race condition - specifically, the "check-then-act" kind of race condition.

How do you detect them?

Rigorous code review and multi-threaded unit tests. There is no shortcut. A few Eclipse plugins are emerging for this, but nothing stable yet.

How do you handle and prevent them?

The best thing would be to create side-effect-free, stateless functions and to use immutables as much as possible. But that is not always possible, so using java.util.concurrent.atomic, concurrent data structures, proper synchronization, and actor-based concurrency will help.
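
The answer names Java tools; the immutability point can still be sketched in Python with a frozen dataclass. `Stats`, `with_sample`, and `average` are hypothetical names for this illustration only.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Stats:
    """Immutable snapshot: fields cannot be reassigned after construction."""
    total: int
    count: int

def with_sample(stats, value):
    """Side-effect-free update: returns a new Stats, leaves the input untouched."""
    return replace(stats, total=stats.total + value, count=stats.count + 1)

def average(stats):
    """Works on one consistent snapshot, so count cannot change mid-calculation."""
    return stats.total / stats.count if stats.count else 0.0

before = Stats(total=10, count=2)
after = with_sample(before, 5)
print(before, after, average(after))
```

Threads that only ever read such snapshots have nothing to race on. Publishing a new snapshot to other threads is still a shared write and needs its own care.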

The best resource for concurrency is JCIP (Java Concurrency in Practice). You can also get some more details on the above explanation here.

Answered by Baris Kasikci

There is an important technical difference between race conditions and data races. Most answers seem to make the assumption that these terms are equivalent, but they are not.

A data race occurs when two instructions (in different threads) access the same memory location, at least one of these accesses is a write, and there is no happens-before ordering among these accesses. What constitutes a happens-before ordering is subject to a lot of debate, but in general unlock-lock pairs on the same lock variable and wait-signal pairs on the same condition variable induce a happens-before order.

A race condition is a semantic error. It is a flaw that occurs in the timing or the ordering of events that leads to erroneous program behavior.

Many race conditions can be (and in fact are) caused by data races, but this is not necessary. As a matter of fact, data races and race conditions are neither a necessary nor a sufficient condition for one another. This blog post also explains the difference very well, with a simple bank transaction example. Here is another simple example that explains the difference.
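
To make the distinction concrete, here is a small Python sketch of a race condition without a data race. It is an illustration written for this page, not the example from the linked posts: the `Account` class and `replay_two_withdrawals` are hypothetical, and the interleaving is replayed in a single thread so the outcome is reproducible.

```python
import threading

class Account:
    """Every single read or write is locked, so there is no data race."""
    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._balance

    def set(self, value):
        with self._lock:
            self._balance = value

def replay_two_withdrawals(account, amount):
    """Interleave two withdrawals by hand, as a scheduler might."""
    seen_by_a = account.get()              # A checks the balance
    seen_by_b = account.get()              # B checks the same balance
    if seen_by_a >= amount:
        account.set(seen_by_a - amount)    # A withdraws
    if seen_by_b >= amount:
        account.set(seen_by_b - amount)    # B withdraws based on a stale view
    return account.get()

print(replay_two_withdrawals(Account(100), 80))  # prints 20, although 160 was paid out
```

Each access is properly synchronized, yet the check and the update are not atomic as a pair, so the program is still semantically wrong.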

Now that we nailed down the terminology, let us try to answer the original question.

Given that race conditions are semantic bugs, there is no general way of detecting them. This is because there is no way of having an automated oracle that can distinguish correct vs. incorrect program behavior in the general case. Race detection is an undecidable problem.

On the other hand, data races have a precise definition that does not necessarily relate to correctness, and therefore one can detect them. There are many flavors of data race detectors (static/dynamic data race detection, lockset-based data race detection, happens-before based data race detection, hybrid data race detection). A state-of-the-art dynamic data race detector is ThreadSanitizer, which works very well in practice.

Handling data races in general requires some programming discipline to induce happens-before edges between accesses to shared data (either during development, or once they are detected using the above-mentioned tools). This can be done through locks, condition variables, semaphores, etc. However, one can also employ different programming paradigms like message passing (instead of shared memory) that avoid data races by construction.
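
A minimal Python sketch of the message-passing alternative, assuming a single owner thread that is the only one to touch the running total. The names `counter_owner`, `sender`, and `run` are made up for this example; `queue.Queue` does its own internal synchronization, so the application code needs no explicit locks.

```python
import queue
import threading

def counter_owner(inbox, result):
    """Only this thread ever touches the running total."""
    total = 0
    while True:
        message = inbox.get()
        if message is None:      # sentinel: no more messages will arrive
            break
        total += message
    result.put(total)

def run(num_senders=5, messages_each=1000):
    inbox, result = queue.Queue(), queue.Queue()
    owner = threading.Thread(target=counter_owner, args=(inbox, result))
    owner.start()

    def sender():
        for _ in range(messages_each):
            inbox.put(1)         # send a message instead of writing shared state

    senders = [threading.Thread(target=sender) for _ in range(num_senders)]
    for t in senders:
        t.start()
    for t in senders:
        t.join()
    inbox.put(None)              # all senders are done; tell the owner to stop
    owner.join()
    return result.get()
```

Because the sentinel is sent only after every sender has finished, `run()` should return `num_senders * messages_each`.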

Answered by Chris Conway

A sort-of-canonical definition is "when two threads access the same location in memory at the same time, and at least one of the accesses is a write." In this situation the "reader" thread may get the old value or the new value, depending on which thread "wins the race." This is not always a bug (in fact, some really hairy low-level algorithms do this on purpose), but it should generally be avoided. @Steve Gury gives a good example of when it might be a problem.

Answered by Steve Gury

A race condition is a kind of bug that happens only under certain timing conditions.

Example: Imagine you have two threads, A and B.

In Thread A:

if( object.a != 0 )
    object.avg = total / object.a

In Thread B:

object.a = 0

If thread A is preempted just after having checked that object.a is not 0, B will do a = 0, and when thread A gets the processor back, it will do a "divide by zero".

This bug only happens when thread A is preempted just after the if statement. It's very rare, but it can happen.
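
One possible fix, sketched in Python under the assumption that both threads agree to use the same lock. `Measurement`, `stats`, and `stats_lock` are hypothetical stand-ins for `object` in the example above.

```python
import threading

class Measurement:
    """Hypothetical stand-in for `object` in the example."""
    def __init__(self, a):
        self.a = a
        self.avg = None

stats = Measurement(a=4)
stats_lock = threading.Lock()

def thread_a(total):
    with stats_lock:                  # check and division form one critical section
        if stats.a != 0:
            stats.avg = total / stats.a

def thread_b():
    with stats_lock:                  # B cannot run between A's check and A's division
        stats.a = 0

a = threading.Thread(target=thread_a, args=(20,))
b = threading.Thread(target=thread_b)
a.start()
b.start()
a.join()
b.join()
print(stats.a, stats.avg)
```

Depending on which thread gets the lock first, `stats.avg` ends up as `5.0` or stays `None`, but the division by zero can no longer happen.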

Answered by nybon

Race conditions are related not only to software but also to hardware. In fact, the term was originally coined by the hardware industry.

According to Wikipedia:

The term originates with the idea of two signals racing each other to influence the output first.

[Figure: race condition in a logic circuit]

The software industry adopted this term without modification, which makes it a little bit difficult to understand.

You need to do some replacement to map it to the software world:

  • "two signals" => "two threads"/"two processes"
  • "influence the output" => "influence some shared state"

So a race condition in the software industry means "two threads"/"two processes" racing each other to "influence some shared state", and the final result of the shared state will depend on some subtle timing difference, which could be caused by a specific thread/process launching order, thread/process scheduling, etc.

Answered by Jorge Córdoba

A race condition is a situation in concurrent programming where two concurrent threads or processes compete for a resource and the resulting final state depends on who gets the resource first.

Answered by tsellon

Race conditions occur in multi-threaded applications or multi-process systems. A race condition, at its most basic, is anything that makes the assumption that two things not in the same thread or process will happen in a particular order, without taking steps to ensure that they do. This happens commonly when two threads are passing messages by setting and checking member variables of a class both can access. There's almost always a race condition when one thread calls sleep to give another thread time to finish a task (unless that sleep is in a loop, with some checking mechanism).

Tools for preventing race conditions are dependent on the language and OS, but some common ones are mutexes, critical sections, and signals. Mutexes are good when you want to make sure you're the only one doing something. Signals are good when you want to make sure someone else has finished doing something. Minimizing shared resources can also help prevent unexpected behaviors.
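
As a rough Python illustration of using a signal instead of a sleep, the sketch below waits on a `threading.Event` until the worker announces that it has finished. The names `result`, `finished`, `worker`, and `wait_for_result` are assumptions made for this example.

```python
import threading

result = {}
finished = threading.Event()     # the "signal": set once the work is done

def worker():
    result["value"] = sum(range(1000))   # stand-in for a slow task
    finished.set()                       # announce completion

def wait_for_result(timeout=5.0):
    """Block until the worker signals, instead of sleeping and hoping."""
    if not finished.wait(timeout):       # wait() returns False on timeout
        raise TimeoutError("worker did not finish in time")
    return result["value"]

threading.Thread(target=worker).start()
print(wait_for_result())  # prints 499500
```

Unlike a fixed sleep, the waiting thread resumes as soon as the work is done and never reads the result before it exists.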

Detecting race conditions can be difficult, but there are a couple of signs. Code which relies heavily on sleeps is prone to race conditions, so first check for calls to sleep in the affected code. Adding particularly long sleeps can also be used for debugging to try to force a particular order of events. This can be useful for reproducing the behavior, seeing if you can make it disappear by changing the timing of things, and for testing solutions put in place. The sleeps should be removed after debugging.

The signature sign of a race condition, though, is an issue that only occurs intermittently on some machines. Common bugs would be crashes and deadlocks. With logging, you should be able to find the affected area and work back from there.

Answered by Konstantin Dinev

Microsoft has actually published a really detailed article on this matter of race conditions and deadlocks. The most summarized abstract from it would be the title paragraph:

A race condition occurs when two threads access a shared variable at the same time. The first thread reads the variable, and the second thread reads the same value from the variable. Then the first thread and second thread perform their operations on the value, and they race to see which thread can write the value last to the shared variable. The value of the thread that writes its value last is preserved, because the thread is writing over the value that the previous thread wrote.
