Checking the running status of a background process launched from the same bash script

Question

Asked by parthasarathy

I have to write a bash script that launches a process in the background according to the command line argument passed, and reports whether it was able to launch the program successfully.

Here is pseudocode for what I am trying to achieve:

if [ "$1" = "PROG_1" ] ; then
    ./launchProg1 &
    if [ isLaunchSuccess ] ; then
        echo "Success"
    else
        echo "failed"
        exit 1
    fi
elif [ "$1" = "PROG_2" ] ; then
    ./launchProg2 &
    if [ isLaunchSuccess ] ; then
        echo "Success"
    else
        echo "failed"
        exit 1
    fi
fi

The script cannot wait or sleep, since it will be called by another mission-critical C++ program and needs high throughput (in terms of processes started per second); moreover, the running time of the processes is unknown. The script neither needs to capture any input/output nor waits for the launched process to complete.

I have unsuccessfully tried the following:

#Method 1
if [ "$1" = "KP1" ] ; then
    echo "The Arguement is KP1"
    ./kp 'this is text' &
    if [ $? = "0" ] ; then
        echo "Success"
    else
        echo "failed"
        exit 1
    fi
elif [ "$1" = "KP2" ] ; then
    echo "The Arguement is KP2"
    ./NoSuchCommand 'this is text' &
    if [ $? = "0" ] ; then
        echo "Success"
    else
        echo "failed"
        exit 1
    fi
#Method 2
elif [ "$1" = "CD5" ] ; then
    echo "The Arguement is CD5"
    cd "doesNotExist" &
    PROC_ID=$!
    echo "PID is $PROC_ID"
    if kill -0 "$PROC_ID" ; then
        echo "Success"
    else
        echo "failed"
        exit 1
    fi
#Method 3
elif [ "$1" = "CD6" ] ; then
    echo "The Arguement is CD6"
    cd .. &
    PROC_ID=$!
    echo "PID is $PROC_ID"
    ps -eo pid | grep "$PROC_ID" && { echo "Success"; exit 0; }
    ps -eo pid | grep  "$PROC_ID" || { echo "failed" ; exit 1; }
else
    echo "Unknown Argument"
    exit 1
fi

Running the script gives unreliable output. Methods 1 and 2 always return Success, while Method 3 returns failed when the process finishes executing before the checks run.

Here is a sample run, tested on GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu) and GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu):

[scripts]$ ./processStarted3.sh KP1
The Arguement is KP1
Success
[scripts]$ ./processStarted3.sh KP2
The Arguement is KP2
Success
./processStarted3.sh: line 13: ./NoSuchCommand: No such file or directory
[scripts]$ ./processStarted3.sh CD6
The Arguement is CD6
PID is 25050
failed

The approaches suggested in similar questions do not fit: I cannot use process names, since one program may be executed several times, and the other suggestions can't be applied.

I have not tried screen and tmux, since getting permission to install them on production servers won't be easy (but I will do so if that is the only option left).

UPDATE
@ghoti
./kp is a program which exists, and launching it returns Success. ./NoSuchCommand does not exist. Still, as you can see from the (edited) output, the script incorrectly returns Success.

It does not matter when the process completes execution or whether the program terminates abnormally. Programs launched via the script are not tracked in any way (hence we do not store the pid in any table, nor does the need arise to use daemontools).

@Etan Reisner
An example of a program which fails to launch is ./NoSuchCommand, which does not exist. Or maybe a corrupted program which fails to start.

@Vorsprung
Calling a script which launches a program in the background does not take a lot of time (and is manageable as per our expectations). But sleep 1 will accumulate over time and cause issues.

The aforementioned #Method3 works fine, barring processes which terminate before the ps -eo pid | grep "$PROC_ID" && { echo "Success"; exit 0; } check can be performed.

Answer 1

Accepted answer by rajenpandit

Here is an example which shows whether a process was started successfully or not.

#!/bin/bash
$1 & #executes a program in background which is provided as an argument
pid=$! #stores executed process id in pid
count=$(ps -A| grep $pid |wc -l) #check whether process is still running
if [[ $count -eq 0 ]] #if process is already terminated, then there can be two cases, the process executed and stop successfully or it is terminated abnormally
then
        if wait $pid; then #checks if process executed successfully or not
                echo "success"
        else                    #process terminated abnormally
                echo "failed (returned $?)"
        fi
else
        echo "success"  #process is still running
fi

#Note: The above script only reports whether the process started successfully or not. If the process starts successfully and later terminates abnormally, this script will not provide a correct result.

Answer 2

Answer by Andrew Feren

The accepted answer doesn't work as advertised.

The count in this check will always be at least 1, because "grep $pid" will find both the process with $pid (if it exists) and the grep itself.

count=$(ps -A| grep $pid |wc -l)
if [[ $count -eq 0 ]]
then
    ### We can never get here
else
    echo "success"  #process is still running
fi

Changing the above to check for a count of 1, or excluding the grep from the count, should make the original work.

Here is an alternate (maybe simpler) implementation of the original example.

#!/bin/bash
$1 & # executes a program in background which is provided as an argument
pid=$! # stores executed process id in pid

# check whether process is still running
# The "[^[]" excludes the grep from finding itself in the ps output
if ps | grep "$pid[^[]" >/dev/null
then
    echo "success (running)"  # process is still running
else
    # If the process is already terminated, then there are 2 cases:
    # 1) the process executed and stop successfully
    # 2) it is terminated abnormally

    if wait $pid # check if process executed successfully or not
    then
        echo "success (ran)"
    else
        echo "failed (returned $?)" # process terminated abnormally
    fi
fi

# Note: The above script will detect if a process started successfully or not. If process is running when we check, but later it terminates abnormally then this script will not detect this.
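A possible variation on the check above, offered as a sketch rather than a definitive fix: kill -0 asks whether the child's PID can still be signalled, so no ps | grep pattern is needed. The function name check_started is made up for illustration and does not appear in the original answers. The limitation noted above still applies, and a command that fails to start may briefly be seen as running before the shell notices that it has exited.

```shell
#!/bin/bash
# check_started is a hypothetical helper name (an assumption for this sketch).
# It launches its arguments in the background and reports what can be
# observed immediately afterwards, without sleeping.
check_started() {
    "$@" &
    local pid=$!
    if kill -0 "$pid" 2>/dev/null; then
        # The PID can still be signalled, so the child has not been reaped yet.
        echo "success (running)"
    elif wait "$pid"; then
        # The child is already gone and exited with status 0.
        echo "success (ran)"
    else
        # $? here is the status returned by wait, i.e. the child's exit status.
        echo "failed (returned $?)"
    fi
}
```

It could be called as check_started ./kp 'this is text'. For a program that exits almost instantly, which branch is taken depends on timing, so only the long-running case is reliable.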

Answer 3

Answer by lesmana

use jobs.

put the following in a bash script and execute

#!/bin/bash

{ sleep 1 ; echo sleep1 ; } &
sleep 0
jobs
wait

echo nosleep &
sleep 0
jobs
wait

echo exit1
false &
sleep 0
jobs
wait

notexisting &
sleep 0
jobs
wait

./existingbutnotexecutable &
sleep 0
jobs
wait

output

$ ./testrun.sh 
[1]+  Running                 { sleep 1; echo sleep1; } &
sleep1
nosleep
[1]+  Done                    echo nosleep
exit1
[1]+  Exit 1                  false
./testrun.sh: line 19: notexisting: command not found
[1]+  Exit 127                notexisting
./testrun.sh: line 24: ./existingbutnotexecutable: Permission denied
[1]+  Exit 126                ./existingbutnotexecutable

from the output of jobs we can distinguish between:

a background job that is still running
a job that is done running
a job that is done running with nonzero exit status
a job that could not run because command not found
and a job that could not run because not executable.

maybe there are even more cases but i did not research more.

the wait is just to make sure that there is no more than one background job at once.

the sleep 0 is necessary, otherwise jobs will report that the process is running even before the shell is able to report the "command not found" error. i tried echo but it seems to be not enough delay.

remove the sleep and you get this output

$ ./testrun.sh 
[1]+  Running                 { sleep 1; echo sleep1; } &
sleep1
[1]+  Running                 echo nosleep &
nosleep
exit1
[1]+  Running                 false &
[1]+  Running                 notexisting &
./testrun.sh: line 19: notexisting: command not found
[1]+  Running                 ./existingbutnotexecutable &
./testrun.sh: line 24: ./existingbutnotexecutable: Permission denied

notice that jobs always says "Running" and always comes before the result of the commands, error or not.

here is one possibility to act based on the output of jobs

#!/bin/bash

isrunsuccess() {
  case $(jobs) in
    *Running*)   echo ">>> running" ;;
    *Done*)      echo ">>> done" ;;
    *Exit\ 127*) echo ">>> not found" ;;
    *Exit\ 126*) echo ">>> not executable" ;;
    *Exit*)      echo ">>> done nonzero exitstatus" ;;
  esac
}

{ sleep 1 ; echo sleep1 ; } &
sleep 0
isrunsuccess
wait

echo nosleep &
sleep 0
isrunsuccess
wait

echo exit1
false &
sleep 0
isrunsuccess
wait

notexisting &
sleep 0
isrunsuccess
wait

./existingbutnotexecutable &
sleep 0
isrunsuccess
wait

output

$ ./testrun.sh 
>>> running
sleep1
nosleep
>>> done
exit1
>>> done nonzero exitstatus
./testrun.sh: line 29: notexisting: command not found
>>> not found
./testrun.sh: line 34: ./existingbutnotexecutable: Permission denied
>>> not executable

you can merge the "did run" and "did not run" cases

isrunsuccess() {
  case $(jobs) in
    *Exit\ 127*|*Exit\ 126*) echo ">>> did not run" ;;
    *Running*|*Done*|*Exit*) echo ">>> still running or was running" ;;
  esac
}

output

$ ./testrun.sh 
>>> still running or was running
sleep1
nosleep
>>> still running or was running
exit1
>>> still running or was running
./testrun.sh: line 26: notexisting: command not found
>>> did not run
./testrun.sh: line 31: ./existingbutnotexecutable: Permission denied
>>> did not run

other methods to check contents of string in bash: How do you tell if a string contains another string in Unix shell scripting?

bash documentation stating that exit status 127 means not found and 126 means not executable: https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html
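The same exit statuses can also be read from wait once the job has finished. The following is a minimal sketch under that assumption; describe_status is a hypothetical helper name. Because it relies on wait, it only suits callers that can afford to block until the child exits, which the question rules out for long-running programs.

```shell
#!/bin/bash
# describe_status is a hypothetical helper: it maps an exit status to the
# categories used in the answer above.
describe_status() {
    case $1 in
        0)   echo "done" ;;
        126) echo "not executable" ;;
        127) echo "not found" ;;
        *)   echo "done nonzero exitstatus" ;;
    esac
}

# Launch a command that does not exist, then collect its status with wait.
# Note that wait itself also returns 127 when the given PID is not a child
# of this shell, so 127 is only meaningful for a PID taken from $!.
notexisting 2>/dev/null &
wait $!
describe_status $?
```

Keeping the mapping in its own function means the same categories can be reused wherever a status is available, whether it comes from wait or from parsing the output of jobs.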

Answer 4

Answer by Vorsprung

sorry missed this requirement "Script cannot wait or sleep"

launch the background program, get its pid. Wait a second. Then check it is still running with kill -0

the kill -0 status is taken from $? and is used to decide if the process is still running

#!/bin/bash

./$1 &
pid=$!

sleep 1;

kill -0 $pid
stat=$?
if [ $stat -eq 0 ] ; then
  echo "running as $!"
  exit 0
else
  echo "$! did not start"
  exit 1
fi

Maybe if your super speedy C++ program cannot wait for a second, it also cannot expect to be able to launch a load of shell commands at a high rate per second?

Maybe you need to implement a queue here?

Sorry for more questions than answers
