A running bash script is hung somewhere. Can I find out what line it is on?
Original question: http://stackoverflow.com/questions/4640794/
License: this content is provided under CC BY-SA 4.0. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by nealmcb
E.g. does the bash debugger support attaching to existing processes and examining the current state?
Or can I easily find out by looking at the bash process entries in /proc? Is there a convenient tool to give line numbers in active files?
I don't want to have to kill and restart the process.
This is on Linux - Ubuntu 10.04.
Answered by Mei
I recently found myself in a similar position. I had a shell script that could not be identified through other means (such as its arguments).
There are ways to find out a lot more about a running process than you would expect.
Use lsof -p $pid to see which files are open, which may give you some clues. Note that some files, while "deleted", can still be held open by the script. As long as the script doesn't close the file, it can still read from and write to it, and the file still takes up space on the file system.
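If lsof is not available, much the same information can be read from /proc on Linux. The following is a minimal sketch, not part of the original answer: the helper name fd_offsets is hypothetical, and it assumes /proc/$pid/fd and /proc/$pid/fdinfo are readable for the target process.

```shell
# Hypothetical helper: for each open file descriptor of a process, print
# the descriptor number, the current file offset, and the target path.
fd_offsets() {
    local pid=$1 fd target pos
    for fd in /proc/"$pid"/fd/*; do
        # Descriptors can disappear while we iterate; skip those.
        target=$(readlink "$fd") || continue
        # fdinfo holds a "pos:" line with the current offset of the descriptor.
        pos=$(awk '/^pos:/ {print $2}' "/proc/$pid/fdinfo/${fd##*/}" 2>/dev/null)
        printf '%s\t%s\t%s\n' "${fd##*/}" "${pos:-?}" "$target"
    done
}

# Example: inspect the current shell; replace $$ with the PID of the hung script.
fd_offsets "$$"
```

Bash usually keeps the script file itself open on a high-numbered descriptor (often 255). The offset reported for that descriptor is a rough hint of how much of the script has been read so far, and something like head -c "$pos" script.sh | wc -l turns it into an approximate line number. Treat this as an estimate only: bash reads a compound command such as a whole loop before running it.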
Use strace to trace the system calls made by the script as it runs. Bash reads the script file as it goes, so you can see some of the commands as they are read, before they execute. Look for read calls in the output of this command:
strace -p $pid -s 1024
The -s 1024 option makes strace print strings up to 1024 characters long (by default, strace truncates strings at 32 characters).
Examine the directory /proc/$pid to see details about the script. In particular, see /proc/$pid/environ, which gives you the process environment with entries separated by NUL characters. To read this "file" properly, use this command:
xargs -0 -n 1 < /proc/$pid/environ
You can pipe that into less or save it in a file. There is also /proc/$pid/cmdline, but it may only give you the shell name (-bash, for instance).
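The /proc entries mentioned above can be gathered in one place. This is a small sketch under stated assumptions: the helper name proc_lines is made up for illustration, and the pid variable is set to the current shell only so that the example runs as-is.

```shell
# Hypothetical helper: print a NUL-separated /proc file one entry per line.
proc_lines() {
    tr '\0' '\n' < "/proc/$1/$2"
}

pid=$$   # assumption: replace with the PID of the hung script

echo "command line:"
proc_lines "$pid" cmdline

echo "working directory:"
readlink "/proc/$pid/cwd"

echo "environment (first entries):"
proc_lines "$pid" environ | head -n 5
```

Note that /proc/$pid/environ reflects the environment the process started with; variables exported later by the script itself do not show up there.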
Answered by Jürgen Hötzel
No real solution. But in most cases a script is waiting for a child process to terminate:
ps --ppid $(pidof -x yourscript)
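When the child has children of its own, it can help to walk the whole tree and show what each process is doing. The sketch below is an illustration only: list_descendants is a hypothetical helper, and it assumes the procps version of ps found on Linux (for --ppid and the stat and wchan columns).

```shell
# Hypothetical helper: print every descendant of a PID together with its
# process state, the kernel function it is sleeping in, and its command line.
list_descendants() {
    local parent=$1 child
    for child in $(ps -o pid= --ppid "$parent"); do
        ps -o pid= -o stat= -o wchan= -o args= -p "$child"
        list_descendants "$child"
    done
}

# Demo: a background shell that is waiting for a sleep to finish.
bash -c 'sleep 2; true' &
sleep 0.3
list_descendants "$!"
wait
```

A state of D (uninterruptible sleep) on the deepest child is usually a sign that it is stuck inside the kernel, for example on I/O or a lock.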
You could also set up signal handlers in your shell script to toggle the printing of commands:
#!/bin/bash
trap "set -x" SIGUSR1   # SIGUSR1 turns command tracing on
trap "set +x" SIGUSR2   # SIGUSR2 turns it off again
while true; do
    sleep 1
done
Then use:
kill -USR1 $(pidof -x yourscript)
kill -USR2 $(pidof -x yourscript)
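Since the question asks for a line number, the same signal toggle can be combined with a PS4 prompt that includes the current line. The following is a sketch rather than part of the original answer: the demo script, its temporary file name, and the short sleeps are all made up so that the example runs on its own.

```shell
#!/bin/bash
# Write a hypothetical demo script whose trace prefix shows file and line.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
PS4='+ ${BASH_SOURCE}:${LINENO}: '
trap 'set -x' USR1
trap 'set +x' USR2
while true; do
    sleep 0.2
done
EOF

bash "$demo" 2> "$demo.trace" &   # run it, sending the trace to a file
pid=$!
sleep 0.5
kill -USR1 "$pid"                 # start tracing
sleep 1
kill -USR2 "$pid"                 # stop tracing
sleep 0.5
kill "$pid"
wait "$pid" 2>/dev/null
cat "$demo.trace"                 # each traced command carries its line number
```

Keep in mind that bash runs a trap handler only after the command it is currently waiting for has finished. If the script is blocked in a child that never returns, the trace will not start until that child exits, so this approach has to be built into the script before it hangs.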
Answered by Eric Ren
Use pstree to show which Linux command or executable your script is calling. For example, 21156 is the PID of my hanging script:
ocfs2cts1:~ # pstree -pl 21156
activate_discon(21156)───mpirun(15146)─┬─fillup_contig_b(15149)───sudo(15231)───chmod(15232)
                                       ├─ssh(15148)
                                       └─{mpirun}(15147)
So I know it is hanging at the chmod command. Then show the kernel stack trace of that process with:
ocfs2cts1:~ # cat /proc/15232/stack
[<ffffffffa05377ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
[<ffffffffa053856d>] ocfs2_inode_lock_full_nested+0x12d/0x840 [ocfs2]
[<ffffffffa0538dbb>] ocfs2_inode_lock_atime+0xcb/0x170 [ocfs2]
[<ffffffffa0531e61>] ocfs2_readdir+0x41/0x1b0 [ocfs2]
[<ffffffff8120d03c>] iterate_dir+0x9c/0x110
[<ffffffff8120d453>] SyS_getdents+0x83/0xf0
[<ffffffff815e126e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[<ffffffffffffffff>] 0xffffffffffffffff
Oh, boy, it's likely a deadlock bug...

