Original question: http://stackoverflow.com/questions/24289768/
Notice: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Bash script to monitor process and sendmail if failed
Asked by JeremyA1
I realize that I can't reliably count on ps | grep or its variants to tell me accurately which PID was started. However, I know what I need in the interim, until this problem is resolved in the next release.
I have a parent process named Foo; TEST1 and TEST2 are its child processes. If TEST1 and/or TEST2 dies, Foo continues to run and does not respawn them, even though they are needed for it to function properly. I know this because restarting TEST1 and/or TEST2 requires Foo to be restarted first.
So I want to monitor the child processes: if one has failed, send an email saying that it failed, then restart the service and send another email saying that it has started again. I plan to run the script from cron every 5 minutes.
The check works on its own, and so does the sendmail part. The problem appears when I combine them in an if/else statement: when TEST1 or TEST2 dies, the script still logs that it is running when it is not. Can someone help me with this, please?
#!/bin/bash
#Check if process is running
VAL1=`/usr/ucb/ps aux | grep "[P]ROCESS TEST1" >/dev/null`
VAL2=`/usr/ucb/ps aux | grep "[P]ROCESS TEST2" >/dev/null`
if $VAL1 && $VAL2; then
echo "$(date) - $VAL1 & $VAL2 is Running" >> /var/tmp/Log.txt;
else
SUBJ="Process has stopped"
FROM="Server"
TO="[email protected]"
(
cat << !
To : ${TO}
From : ${FROM}
Subject : ${SUBJ}
!
cat << !
The $VAL1 and $VAL2 went down at $(date) please login to the server to restart
!
) | sendmail -v ${TO}
elseif
/usr/sbin/svcadm disable Foo;
wait 10;
/usr/sbin/svcadm enable Foo;
fi
Accepted answer by Tim Kennedy
So, one thing about your tests is that you're pushing the output to /dev/null, which means that VAL1 and VAL2 will always be empty.
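To make the point concrete, here is a minimal sketch rather than a drop-in fix. The `fake_ps` function is a hypothetical stand-in for `/usr/ucb/ps aux` so that the example runs anywhere. It shows that the captured value is empty, that an empty expansion used as a command counts as success in bash (which would explain why the original `if` always logs "Running"), and that testing grep's exit status gives the intended result.

```shell
#!/usr/bin/env bash
# fake_ps is a hypothetical stand-in for `/usr/ucb/ps aux`; only TEST1 is "running".
fake_ps() {
  printf '%s\n' \
    'root  101  0.0  0.1  Foo' \
    'root  102  0.0  0.1  PROCESS TEST1'
}

# Output redirected to /dev/null never reaches the command substitution,
# so the variable is always empty.
VAL1=$(fake_ps | grep "[P]ROCESS TEST1" >/dev/null)
echo "captured: '${VAL1}'"

# An empty expansion leaves no command to run, and bash treats that as success.
if $VAL1; then
  empty_ok="yes"
else
  empty_ok="no"
fi
echo "empty expansion treated as success: ${empty_ok}"

# Test grep's exit status instead: 0 when a line matched, non-zero otherwise.
if fake_ps | grep "[P]ROCESS TEST1" >/dev/null; then
  test1_state="running"
else
  test1_state="down"
fi
if fake_ps | grep "[P]ROCESS TEST2" >/dev/null; then
  test2_state="running"
else
  test2_state="down"
fi
echo "TEST1 is ${test1_state}, TEST2 is ${test2_state}"
```

On a real server, `fake_ps` would be replaced by the actual ps command from the question.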
Secondly, you don't need the elif. You have two basic conditions. Either things are running, or they are not. If anything is not running, send an email. You could do some additional testing to determine whether it's PROCESS TEST1 or PROCESS TEST2 that died, but that wouldn't strictly be necessary.
Here's how I might write a script to do the same thing.
#!/usr/bin/env bash
#Check if process is running
# Column 2 of `ps aux` output is the PID; the variable stays empty if nothing matches.
PID1=$(/usr/ucb/ps aux | grep "[P]ROCESS TEST1" | awk '{print $2}')
PID2=$(/usr/ucb/ps aux | grep "[P]ROCESS TEST2" | awk '{print $2}')
err=0
if [ "x$PID1" == "x" ]; then
# PROCESS TEST1 died
err=$(( err + 1 ))
else
echo "$(date) - PROCESS TEST1 is Running" >> /var/tmp/Log.txt;
fi
if [ "x$PID2" == "x" ]; then
# PROCESS TEST2 died
err=$(( err + 2 ))
else
echo "$(date) - PROCESS TEST2 is Running" >> /var/tmp/Log.txt;
fi
if (( $err > 0 )); then
# identify which PROCESS TEST had the problem.
if (( err == 1 )); then
condition="PROCESS TEST1 is down"
elif (( $err == 2 )); then
condition="PROCESS TEST2 is down"
else
condition="PROCESS TEST1 and PROCESS TEST2 are down"
fi
# let's send an email to get eyes on the issue, but we will restart the process after
# we send the email.
SUBJ="Process Error Detected"
FROM="Server"
TO="[email protected]"
(
# A blank line must separate the mail headers from the message body.
cat <<-EOT
To: ${TO}
From: ${FROM}
Subject: ${SUBJ}

$condition at $(date). Please log in to the server to check that the processes were restarted successfully.
EOT
) | sendmail -v "${TO}"
# we reached an error condition, and we sent mail
# now let's restart the svc.
/usr/sbin/svcadm restart Foo
fi
Answer by V H
elseif? Did you mean elif?
Also, have you thought about using functions, putting the sendmail part in a function that gets called from within the if statement?
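One way this suggestion might look is sketched below. The function names `build_message` and `send_alert`, the `DRY_RUN` switch, and the `admin@example.com` address are all assumptions made for illustration; none of them come from the original posts. With `DRY_RUN` left at its default, the message is printed instead of mailed, so the sketch can be tried without a mail server.

```shell
#!/usr/bin/env bash
# Hypothetical settings; adjust for the real server.
ALERT_TO="admin@example.com"   # placeholder address
ALERT_FROM="Server"

# build_message SUBJECT BODY: print the headers, a blank line, then the body.
build_message() {
  printf 'To: %s\nFrom: %s\nSubject: %s\n\n%s\n' \
    "$ALERT_TO" "$ALERT_FROM" "$1" "$2"
}

# send_alert SUBJECT BODY: print the message when DRY_RUN is 1 (the default),
# otherwise pipe it to sendmail.
send_alert() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    build_message "$1" "$2"
  else
    build_message "$1" "$2" | sendmail -v "$ALERT_TO"
  fi
}

# Example call from inside an if statement; PID1 is left empty here to simulate a dead process.
PID1=""
if [ -z "$PID1" ]; then
  send_alert "Process Error Detected" "PROCESS TEST1 is down at $(date)"
fi
```

Keeping the mail logic in one function means the same call can be reused for a "restarted" notification after the service is brought back up.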