Linux: How to detect and find out whether a program is in deadlock?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/9389777/

Time: 2020-08-06 04:44:10  Source: igfitidea

How to detect and find out a program is in deadlock?

Tags: linux, multithreading, unix, multiprocessing, deadlock

Asked by user1002288

This is an interview question.

How to detect and find out if a program is in deadlock? Are there some tools that can be used to do that on Linux/Unix systems?

My idea:

If a program makes no progress while its status is running, it is deadlocked. However, other causes can produce the same symptom. Open source tools such as Valgrind (Helgrind) can detect this, right?

Accepted answer by daijo

I would suggest you look at Helgrind: a thread error detector.

"The simplest example of such a problem is as follows.

Imagine some shared resource R, which, for whatever reason, is guarded by two locks, L1 and L2, which must both be held when R is accessed.

Suppose a thread acquires L1, then L2, and proceeds to access R. The implication of this is that all threads in the program must acquire the two locks in the order first L1 then L2. Not doing so risks deadlock.

The deadlock could happen if two threads -- call them T1 and T2 -- both want to access R. Suppose T1 acquires L1 first, and T2 acquires L2 first. Then T1 tries to acquire L2, and T2 tries to acquire L1, but those locks are both already held. So T1 and T2 become deadlocked."
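
The T1/T2 scenario above can be reproduced in a few lines. The following is a minimal Python sketch, not part of the original answer: the names `try_both` and `run_demo` are hypothetical, and a timeout is used on the second lock (an assumption made here so the demo terminates instead of hanging). A barrier forces the unlucky interleaving in which each thread already holds its first lock.

```python
import threading

def try_both(first, second, barrier, results, name, timeout=0.2):
    """Take `first`, then try to take `second` while the peer holds it."""
    with first:
        barrier.wait()                       # both threads now hold their first lock
        got = second.acquire(timeout=timeout)
        results[name] = got                  # False means we would have blocked forever
        barrier.wait()                       # keep `first` held until both attempts finish
        if got:
            second.release()

def run_demo():
    l1, l2 = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    results = {}
    # T1 takes L1 then L2; T2 takes L2 then L1 (the inconsistent order).
    t1 = threading.Thread(target=try_both, args=(l1, l2, barrier, results, "T1"))
    t2 = threading.Thread(target=try_both, args=(l2, l1, barrier, results, "T2"))
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results

if __name__ == "__main__":
    print(run_demo())
```

With plain blocking `acquire()` calls instead of the timeout, both threads would wait on each other indefinitely. If both threads took the locks in the same order (L1 then L2), one of them would simply wait at the first lock and no deadlock could occur.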

Answer by brokenfoot

If you suspect a deadlock, run ps aux | grep <exe name>. If the process state code in the output is D (uninterruptible sleep), the process may be deadlocked. As @daijo explained, suppose you have two threads, T1 and T2, and two critical sections, each protected by a semaphore, S1 and S2. If T1 acquires S1 and T2 acquires S2, and each then tries to acquire the other semaphore before releasing the one it already holds, the result is a deadlock, and ps aux | grep <exe name> can show the process state code D (i.e. uninterruptible sleep).
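
The same state letter that ps prints can also be read per thread from /proc on Linux. The sketch below is an illustration under assumptions, not part of the original answer: `parse_state` and `thread_states` are hypothetical helper names, and the code relies on the Linux-specific files /proc/<pid>/task/<tid>/stat. Note that a state letter alone is only a hint: threads blocked on user-space locks often show S (interruptible sleep) rather than D, so the states are best sampled repeatedly and combined with other evidence such as stack traces.

```python
import os

def parse_state(stat_line):
    """Extract the state letter from one line in /proc/<pid>/stat format.

    The line looks like "pid (comm) state ...". The command name may contain
    spaces or parentheses, so split after the last ')'.
    """
    after_comm = stat_line[stat_line.rfind(")") + 1:]
    return after_comm.split()[0]

def thread_states(pid):
    """Map thread id -> state letter for every thread of `pid` (Linux only)."""
    states = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                states[int(tid)] = parse_state(f.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were scanning
    return states

if __name__ == "__main__":
    # Inspect the current process; replace with the PID of the suspect program.
    print(thread_states(os.getpid()))
```

If every thread of a program stays in the same sleeping state across many samples while the program makes no progress, that is a reason to attach a debugger and look at what each thread is waiting for.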

Tools:

Valgrind, Lockdep (linux kernel utility)
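
Both tools can warn about inconsistent lock ordering before a deadlock actually happens. The general idea can be sketched as a lock-order graph with cycle detection. This is a conceptual illustration only and does not reflect how either tool is implemented; the function names and the trace format (lists of lock names in acquisition order, assumed to be held nested) are assumptions made for this example.

```python
from collections import defaultdict

def build_lock_order_graph(traces):
    """Add an edge A -> B whenever lock B was acquired while A was held."""
    graph = defaultdict(set)
    for trace in traces:
        for i, held in enumerate(trace):
            for later in trace[i + 1:]:
                graph[held].add(later)
    return graph

def find_cycle(graph):
    """Return one cycle as a list of lock names, or None if the order is consistent."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)   # every node starts WHITE (unvisited)
    stack = []                 # current depth-first search path

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for nxt in graph.get(node, ()):
            if color[nxt] == GRAY:             # back edge: nxt is on the current path
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

if __name__ == "__main__":
    # One thread took L1 then L2, another took L2 then L1.
    print(find_cycle(build_lock_order_graph([["L1", "L2"], ["L2", "L1"]])))
```

A cycle in this graph means some interleaving of the recorded threads could deadlock, even if the run that produced the traces completed normally.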

See this link on types of deadlocks and how to avoid them: http://cmdlinelinux.blogspot.com/2014/01/linux-kernel-deadlocks-and-how-to-avoid.html

Edit: a D in the ps aux output "could" mean the process is in deadlock. From this Red Hat doc:

Uninterruptible Sleep State
An Uninterruptible sleep state is one that won't handle a signal right away. It will wake only as a result of a waited-upon resource becoming available or after a time-out occurs during that wait (if the time-out is specified when the process is put to sleep).
