bash: Linux script with netcat stops working after x hours

Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/18105299/

Time: 2020-09-18 06:10:56  Source: igfitidea

Tags: linux, bash, raspberry-pi, netcat

Asked by Lectere

I have two scripts:

#!/bin/bash

netcat -lk -p 12345 | while read line
do
    match=$(echo $line | grep -c 'Keep-Alive')
    if [ $match -eq 1 ]; then
        [start a command]
    fi
done

and

#!/bin/bash

netcat -lk -p 12346 | while read line
do
    match=$(echo $line | grep -c 'Keep-Alive')
    if [ $match -eq 1 ]; then
        [start a command]
    fi
done

I've put the two scripts in '/etc/init.d/'.

When I restart my Linux machine (Raspberry Pi), both scripts work fine.

I've tried them about 20 times, and they keep working fine.

But after around 12 hours, the whole system stops working. I've put in some logging, but it seems that the scripts are not reacting anymore. Yet when I run:

ps aux

I can see that the scripts are still running:

root      1686  0.0  0.2   2740  1184 ?        S    Aug12   0:00 /bin/bash /etc/init.d/script1.sh start
root      1689  0.0  0.1   2268   512 ?        S    Aug12   0:00 netcat -lk 12345
root      1690  0.0  0.1   2744   784 ?        S    Aug12   0:00 /bin/bash /etc/init.d/script1.sh start
root      1691  0.0  0.2   2740  1184 ?        S    Aug12   0:00 /bin/bash /etc/init.d/script2.sh start
root      1694  0.0  0.1   2268   512 ?        S    Aug12   0:00 netcat -lk 12346
root      1695  0.0  0.1   2744   784 ?        S    Aug12   0:00 /bin/bash /etc/init.d/script2.sh start

After a reboot they start working again... But that's a sin, rebooting a Linux machine periodically...

I've inserted some logging; here's the outcome:

Listening on [0.0.0.0] (family 0, port 12345)
[2013-08-14 11:55:00] Starting loop.
[2013-08-14 11:55:00] Starting netcat.
netcat: Address already in use
[2013-08-14 11:55:00] Netcat has stopped or crashed.
[2013-08-14 11:49:52] Starting loop.
[2013-08-14 11:49:52] Starting netcat.
Listening on [0.0.0.0] (family 0, port 12345)
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6333)
Connection closed, listening again.
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6334)
[2013-08-14 12:40:02] Starting loop.
[2013-08-14 12:40:02] Starting netcat.
netcat: Address already in use
[2013-08-14 12:40:02] Netcat has stopped or crashed.
[2013-08-14 12:17:16] Starting loop.
[2013-08-14 12:17:16] Starting netcat.
Listening on [0.0.0.0] (family 0, port 12345)
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6387)
Connection closed, listening again.
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6388)
[2013-08-14 13:10:08] Starting loop.
[2013-08-14 13:10:08] Starting netcat.
netcat: Address already in use
[2013-08-14 13:10:08] Netcat has stopped or crashed.
[2013-08-14 12:17:16] Starting loop.
[2013-08-14 12:17:16] Starting netcat.
Listening on [0.0.0.0] (family 0, port 12345)
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6167)
Connection closed, listening again.
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6168)

Thanks

Accepted answer by konsolebox

If none of your commands, including netcat, reads input from stdin, you can make the script run completely independently of the terminal. Sometimes background processes that still depend on the terminal pause (S) when they try to read input from it while in the background. In fact, since you're running a daemon, you should make sure that none of your commands reads input from the terminal.

#!/bin/bash

set +o monitor # Make sure job control is disabled.

(
    : # Make sure the shell runs a subshell.
    exec netcat -lk -p 12345 | while read line  ## Use exec to overwrite the subshell.
    do
        match=$(echo "$line" | grep -c 'Keep-Alive')
        if [ "$match" -eq 1 ]; then
            [start a command]
        fi
    done
) <&- >&- 2>&- </dev/null &>/dev/null &

TASKPID=$!
sleep 1s ## Let the task initialize a bit before we disown it.
disown "$TASKPID"

And I think we could try the logging thing again:

set +o monitor

(
    echo "[$(date "+%F %T")] Starting loop with PID $BASHPID."

    for (( ;; ))
    do
        echo "[$(date "+%F %T")] Starting netcat."

        netcat -vv -lk -p 12345 | while read line
        do
            match=$(echo "$line" | grep -c 'Keep-Alive')
            if [ "$match" -eq 1 ]; then
                [start a command]
            fi
        done

        echo "[$(date "+%F %T")] Netcat has stopped or crashed."

        sleep 4s
    done
) <&- >&- 2>&- </dev/null >> "/var/log/something.log" 2>&1 &

TASKPID=$!
sleep 1s
disown "$TASKPID"

Answer by konsolebox

As for the loop, it could look like this:

#!/bin/bash

for (( ;; ))
do
    netcat -lk -p 12345 | while read line
    do
        match=$(echo "$line" | grep -c 'Keep-Alive')
        if [ "$match" -eq 1 ]; then
            [start a command]
        fi
    done
    sleep 4s
done

Note the added double quotes, which make it safer.
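To see what the double quotes change, here is a minimal sketch. The sample_line value is made up for illustration and is not taken from the question's traffic; the point is only that an unquoted variable goes through word splitting and filename expansion before echo sees it.

```shell
#!/bin/bash
# Illustrative only: sample_line is a made-up input line.
sample_line='Connection:   Keep-Alive  *'

# Unquoted: the shell splits the value on whitespace and may expand
# the * against the files in the current directory.
unquoted=$(echo $sample_line)

# Quoted: the text reaches echo exactly as it was read.
quoted=$(echo "$sample_line")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

With arbitrary network input, the unquoted form can hand grep something quite different from what netcat actually received.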

You could also try capturing errors and adding some logging, in this format:

#!/bin/bash

{
    echo "[$(date "+%F %T")] Starting loop."

    for (( ;; ))
    do
        echo "[$(date "+%F %T")] Starting netcat."

        netcat -lk -p 12345 | while read line
        do
            match=$(echo "$line" | grep -c 'Keep-Alive')
            if [ "$match" -eq 1 ]; then
                [start a command]
            fi
        done

        echo "[$(date "+%F %T")] Netcat has stopped or crashed."

        sleep 4s
    done
} >> "/var/log/something.log" 2>&1

Your read command would also be better in this form, since it reads lines unmodified:

... | while IFS= read -r line
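A small sketch of the difference, using a made-up sample string (not real netcat output): a plain read trims leading and trailing whitespace and treats backslashes as escape characters, while IFS= read -r keeps the line as it arrived.

```shell
#!/bin/bash
# Illustrative only: the sample input is made up.
sample='  path\to\file  '

# Plain read: strips surrounding whitespace and consumes the backslashes.
plain=$(printf '%s\n' "$sample" | { read line; printf '%s' "$line"; })

# IFS= read -r: keeps whitespace and backslashes untouched.
raw=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '%s' "$line"; })

printf 'plain: [%s]\n' "$plain"   # prints: plain: [pathtofile]
printf 'raw:   [%s]\n' "$raw"     # prints: raw:   [  path\to\file  ]
```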

Some might also suggest using process substitution, but I don't recommend it this time: with the | while ... method, the while loop runs in a subshell, which keeps the outer for loop safe in case it crashes. Besides, there isn't really a variable from the while loop that would be needed outside of it.
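The subshell behaviour can be checked with a short sketch. It assumes bash with default options (in particular, lastpipe is off), and the three input lines are made up to stand in for netcat output.

```shell
#!/bin/bash
# Illustrative only: three made-up input lines stand in for netcat output.

# Pipe form: the while loop runs in a subshell, so its changes to count are lost.
count=0
printf 'a\nb\nc\n' | while IFS= read -r line; do
    count=$((count + 1))
done
pipe_count=$count

# Process substitution: the loop runs in the current shell, so count survives.
count=0
while IFS= read -r line; do
    count=$((count + 1))
done < <(printf 'a\nb\nc\n')
procsub_count=$count

echo "pipe: $pipe_count, process substitution: $procsub_count"
```

Since nothing from the loop is needed afterwards here, losing the variables in the pipe form costs nothing, and the isolation is a small bonus.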

I now suspect that the issue is actually related to the input and how the while read line; do ...; done block handles it, rather than to netcat itself. Your variables not being properly quoted with "" could be one of the causes, or could even be the actual reason why your netcat is crashing.

Answer by SSaikia_JtheRocker

You mentioned "after around 12 hours, the whole system stops working". It is likely that the scripts are executing whatever you have in [start a command] and that this is bloating the memory. Are you sure that [start a command] is not forking off many processes very frequently, and that it is releasing memory?

Answer by hashier

I have often experienced strange behaviour with nc or netcat. You should have a look at ncat: it's almost the same tool, but it behaves the same on all platforms (nc and netcat behave differently depending on the distro: Linux, BSD, Mac).

Answer by Dru

Periodically netcat will print, not a line, but a block of binary data. The read builtin will likely fail as a result.

I think you're using this program to verify that a remote host is still connected to ports 12345 and 12346 and hasn't been rebooted.

My solution for you is to pipe the output of netcat to sed, then pipe that (much reduced) line to the read builtin...

#!/bin/bash

{
    echo "[$(date "+%F %T")] Starting loop."

    for (( ;; ))
    do
        echo "[$(date "+%F %T")] Starting netcat."

        netcat -lk -p 12345 | sed 's/.*Keep-Alive.*/Keep-Alive/g' |
        while read line
        do
            match=$(echo "$line" | grep -c 'Keep-Alive')
            if [ "$match" -eq 1 ]; then
                [start a command]
            fi
        done

        echo "[$(date "+%F %T")] Netcat has stopped or crashed."

        sleep 4s
    done
} >> "/var/log/something.log" 2>&1
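To see what the sed stage does on its own, here is a small sketch with made-up request lines (not captured from the question's traffic). Note that sed only shortens the lines containing Keep-Alive; every other line still passes through unchanged, so the grep test in the loop is still what decides whether a line counts.

```shell
#!/bin/bash
# Illustrative only: these request lines are made up.
reduced=$(printf '%s\n' \
    'GET / HTTP/1.1' \
    'Connection: Keep-Alive, extra data' \
    'Host: example' |
    sed 's/.*Keep-Alive.*/Keep-Alive/g')

printf '%s\n' "$reduced"

# Count the lines that the loop's grep test would treat as a match.
matches=$(printf '%s\n' "$reduced" | grep -c 'Keep-Alive')
echo "matching lines: $matches"
```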

Also, you'll need to review some of the other startup programs in /etc/init.d to make sure they are compatible with whatever version of rc the system uses. It would be much easier, though, to call your script2.sh from a copy of some simple file in init.d. As it stands, script2 is itself the startup script, but it doesn't conform to the init package you use.

That sounds more complicated than I mean... Let me explain better:

/etc/init.d/syslogd        ## a standard init script that calls syslogd
/etc/init.d/start-monitor   ## a copy of a standard init script that calls script2.sh

As an additional note, I think you could bind netcat to the specific IP that you are monitoring, instead of binding it to all addresses (0.0.0.0).

Answer by tue

You may not use the -p option when you are waiting for an incoming connection request (see the nc man page). Hostname and port are the last two arguments on the command line.

Maybe it connects to its own port, and after some hours some resource runs out?