Notice: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/5161193/

Date: 2020-08-04 00:26:12  Source: igfitidea

How to kill a child process after a given timeout in Bash?

Tags: linux, bash, unix

Asked by Greg

I have a bash script that launches a child process that crashes (actually, hangs) from time to time and with no apparent reason (closed source, so there isn't much I can do about it). As a result, I would like to be able to launch this process for a given amount of time, and kill it if it did not return successfully after a given amount of time.

Is there a simple and robust way to achieve that using bash?

P.S.: tell me if this question is better suited to serverfault or superuser.

Accepted answer by Ignacio Vazquez-Abrams

(As seen in: BASH FAQ entry #68: "How do I run a command, and have it abort (timeout) after N seconds?")

If you don't mind downloading something, use timeout (sudo apt-get install timeout) like this (most systems already have it installed; otherwise use sudo apt-get install coreutils):

timeout 10 ping www.goooooogle.com
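A script usually needs to tell a timeout apart from an ordinary failure. As a minimal sketch, assuming GNU coreutils timeout: it exits with status 124 when the limit is reached, and its -k option follows up with SIGKILL if the command is still alive some time after the first signal. The sleep command and the durations here are placeholders, not part of the original answer.

```shell
#!/usr/bin/env bash
# Assumes GNU coreutils `timeout`; `sleep 5` stands in for the hanging child.
# Send SIGTERM after 1 second, and SIGKILL 2 seconds later if it is still running.
timeout -k 2 1 sleep 5
status=$?

if [ "$status" -eq 124 ]; then
    echo "timed out"
else
    echo "exited with status $status"
fi
```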

If you don't want to download something, do what timeout does internally:

( cmdpid=$BASHPID; (sleep 10; kill $cmdpid) & exec ping www.goooooogle.com )

In case that you want to do a timeout for longer bash code, use the second option as such:

( cmdpid=$BASHPID; 
    (sleep 10; kill $cmdpid) \
   & while ! ping -w 1 www.goooooogle.com 
     do 
         echo crap; 
     done )
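If timeout is available, a possible alternative for longer shell code is to hand the code to a child bash, since timeout runs an external command rather than shell syntax. This is a sketch under that assumption; the loop body is a placeholder and is not taken from the answer.

```shell
#!/usr/bin/env bash
# Assumes GNU coreutils `timeout`. The quoted loop is placeholder shell code.
timeout 2 bash -c 'while true; do sleep 0.5; done'
loop_status=$?

# 124 means the time limit was reached before the loop ended on its own.
echo "loop ended with status $loop_status"
```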

Answer by kojiro

Assuming you have (or can easily make) a pid file for tracking the child's pid, you could then create a script that checks the modtime of the pid file and kills/respawns the process as needed. Then just put the script in crontab to run at approximately the period you need.

Let me know if you need more details. If that doesn't sound like it'd suit your needs, what about upstart?
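A rough sketch of such a cron-driven check, assuming GNU stat and a pid file whose modification time marks when the child was started. PIDFILE, MAX_AGE, and pidfile_expired are hypothetical names chosen for illustration, and respawning is left out.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog, meant to be run periodically from cron.
PIDFILE=${PIDFILE:-/tmp/child.pid}   # assumed location of the pid file
MAX_AGE=${MAX_AGE:-600}              # assumed limit, in seconds

# Succeeds if the file $1 was last modified more than $2 seconds ago.
pidfile_expired() {
    local mtime now
    mtime=$(stat -c %Y "$1") || return 1   # GNU stat; BSD/macOS would need `stat -f %m`
    now=$(date +%s)
    [ $(( now - mtime )) -gt "$2" ]
}

if [ -f "$PIDFILE" ] && pidfile_expired "$PIDFILE" "$MAX_AGE"; then
    kill "$(cat "$PIDFILE")" 2>/dev/null
    rm -f "$PIDFILE"
fi
```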

Answer by DigitalRoss

sleep 999&
t=$!
sleep 10
kill $t

Answer by Dan

# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) &

or to get the exit codes as well:

# Spawn a child process:
(dosmth) & pid=$!
# in the background, sleep for 10 secs then kill that process
(sleep 10 && kill -9 $pid) & waiter=$!
# wait on our worker process and record its exit code
# (wait must run in this shell, not in $(...), because the worker is this shell's child)
wait $pid
exitcode=$?
# kill the waiter subshell, if it still runs
kill -9 $waiter 2>/dev/null
# 0 if we killed the waiter, cause that means the process finished before the waiter
finished_gracefully=$?
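The same steps can be wrapped in a function. This is only a sketch: run_limited is a hypothetical name, it sends the default SIGTERM rather than -9, and a killed command is reported by bash as 128 plus the signal number.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run "$@" for at most $1 seconds and return its exit status.
run_limited() {
    local limit=$1; shift
    "$@" &
    local pid=$!
    # Timer subshell: kill the worker if it is still around after $limit seconds.
    ( sleep "$limit" && kill "$pid" 2>/dev/null ) &
    local waiter=$!
    wait "$pid"
    local status=$?
    kill "$waiter" 2>/dev/null   # stop the timer if it is still pending
    return "$status"
}

run_limited 1 sleep 5
status=$?
echo "status: $status"   # 143 here means the command was ended by SIGTERM (128 + 15)
```

Killing the timer subshell can leave its sleep running until it expires on its own, which is harmless but visible in the process list.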

Answer by Gavin Smith

One way is to run the program in a subshell, and communicate with the subshell through a named pipe with the read command. This way you can check the exit status of the process being run and communicate this back through the pipe.

Here's an example of timing out the yes command after 3 seconds. It gets the PID of the process using pgrep (possibly only works on Linux). There is also a problem with using a pipe: a process opening a pipe for read will hang until it is also opened for write, and vice versa. So to prevent the read command from hanging, I've "wedged" open the pipe for read with a background subshell. (Another way to prevent a freeze is to open the pipe read-write, i.e. read -t 5 <>finished.pipe; however, that also may not work except with Linux.)

rm -f finished.pipe
mkfifo finished.pipe

{ yes >/dev/null; echo finished >finished.pipe ; } &
SUBSHELL=$!

# Get command PID
while : ; do
    PID=$( pgrep -P $SUBSHELL yes )
    test "$PID" = "" || break
    sleep 1
done

# Open pipe for writing
{ exec 4>finished.pipe ; while : ; do sleep 1000; done } &  

read -t 3 FINISHED <finished.pipe

if [ "$FINISHED" = finished ] ; then
  echo 'Subprocess finished'
else
  echo 'Subprocess timed out'
  kill $PID
fi

rm finished.pipe
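The read-write alternative mentioned before the script can be tried in isolation. A minimal sketch, assuming Linux (where opening a FIFO read-write does not block) and a bash whose read -t returns a status above 128 on timeout; the temporary directory is an illustration detail.

```shell
#!/usr/bin/env bash
# Assumes Linux FIFO semantics and bash's `read -t`.
dir=$(mktemp -d)
mkfifo "$dir/finished.pipe"

# No writer exists, yet the open succeeds; read gives up after 1 second.
read -t 1 FINISHED <>"$dir/finished.pipe"
read_status=$?

if [ "$read_status" -gt 128 ]; then
    echo "read timed out"
fi
rm -r "$dir"
```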

Answer by Ulrich

I also had this question and found two more things very useful:

  1. The SECONDS variable in bash.
  2. The command "pgrep".

So I use something like this on the command line (OSX 10.9):

ping www.goooooogle.com & PING_PID=$(pgrep 'ping'); SECONDS=0; while pgrep -q 'ping'; do sleep 0.2; if [ $SECONDS = 10 ]; then kill $PING_PID; fi; done

As this is a loop I included a "sleep 0.2" to keep the CPU cool. ;-)

(BTW: ping is a bad example anyway; you would just use its built-in "-t" (timeout) option.)
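A variant of the same polling idea, sketched with $! and kill -0 instead of pgrep so that only the job started by this shell is matched. LIMIT and the sleep 30 stand-in are assumptions, and the fractional sleep assumes a sleep that accepts it, as the one-liner above does.

```shell
#!/usr/bin/env bash
# `sleep 30` stands in for the long-running command; LIMIT is illustrative.
LIMIT=2
sleep 30 &
child=$!          # PID of the background job, no pgrep needed

SECONDS=0         # assigning to SECONDS resets bash's elapsed-seconds counter
while kill -0 "$child" 2>/dev/null; do
    if [ "$SECONDS" -ge "$LIMIT" ]; then
        kill "$child"
        break
    fi
    sleep 0.2     # keep the polling loop from spinning
done

wait "$child"
status=$?
echo "child ended with status $status"
```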

Answer by Gavin Smith

Here's an attempt which tries to avoid killing a process after it has already exited, which reduces the chance of killing another process with the same process ID (although it's probably impossible to avoid this kind of error completely).

run_with_timeout ()
{
  t=$1
  shift

  echo "running \"$*\" with timeout $t"

  (
  # first, run process in background
  (exec sh -c "$*") &
  pid=$!
  echo $pid

  # the timeout shell
  (sleep $t ; echo timeout) &
  waiter=$!
  echo $waiter

  # finally, allow process to end naturally
  wait $pid
  echo $?
  ) \
  | (read pid
     read waiter

     if test $waiter != timeout ; then
       read status
     else
       status=timeout
     fi

     # if we timed out, kill the process
     if test $status = timeout ; then
       kill $pid
       exit 99
     else
       # if the program exited normally, kill the waiting shell
       kill $waiter
       exit $status
     fi
  )
}

Use like run_with_timeout 3 sleep 10000, which runs sleep 10000 but ends it after 3 seconds.

This is like other answers which use a background timeout process to kill the child process after a delay. I think this is almost the same as Dan's extended answer (https://stackoverflow.com/a/5161274/1351983), except the timeout shell will not be killed if it has already ended.

After this program has ended, there will still be a few lingering "sleep" processes running, but they should be harmless.

This may be a better solution than my other answer because it does not use the non-portable shell feature read -t and does not use pgrep.

Answer by Gavin Smith

Here's the third answer I've submitted here. This one handles signal interrupts and cleans up background processes when SIGINT is received. It uses the $BASHPID and exec trick used in the top answer to get the PID of a process (in this case $$ in a sh invocation). It uses a FIFO to communicate with a subshell that is responsible for killing and cleanup. (This is like the pipe in my second answer, but having a named pipe means that the signal handler can write into it too.)

run_with_timeout ()
{
  t=$1 ; shift

  trap cleanup 2

  F=$$.fifo ; rm -f $F ; mkfifo $F

  # first, run main process in background
  "$@" & pid=$!

  # sleeper process to time out
  ( sh -c "echo $$ >$F ; exec sleep $t" ; echo timeout >$F ) &
  read sleeper <$F

  # control shell. read from fifo.
  # final input is "finished".  after that
  # we clean up.  we can get a timeout or a
  # signal first.
  ( exec 0<$F
    while : ; do
      read input
      case $input in
        finished)
          test $sleeper != 0 && kill $sleeper
          rm -f $F
          exit 0
          ;;
        timeout)
          test $pid != 0 && kill $pid
          sleeper=0
          ;;
        signal)
          test $pid != 0 && kill $pid
          ;;
      esac
    done
  ) &

  # wait for process to end
  wait $pid
  status=$?
  echo finished >$F
  return $status
}

cleanup ()
{
  echo signal >$$.fifo
}

I've tried to avoid race conditions as far as I can. However, one source of error I couldn't remove is when the process ends near the same time as the timeout. For example, run_with_timeout 2 sleep 2 or run_with_timeout 0 sleep 0. For me, the latter gives an error:

timeout.sh: line 250: kill: (23248) - No such process

as it is trying to kill a process that has already exited by itself.
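One hedged way to quiet that message is to check for the process first and discard kill's error output. This sketch does not close the race, and a reused PID could still be signalled by mistake; kill_if_running is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: signal a PID only if it still exists, and stay silent otherwise.
kill_if_running() {
    kill -0 "$1" 2>/dev/null && kill "$1" 2>/dev/null
    return 0
}

sleep 0 &
pid=$!
wait "$pid"              # the process has already exited at this point
kill_if_running "$pid"   # prints nothing instead of "No such process"
```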
