Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/6384013/

Date: 2020-09-09 20:39:44  Source: igfitidea

Run bash commands in parallel, track results and count

bash

Asked by edA-qa mort-ora-y

I was wondering how, if possible, I can create a simple job management in BASH to process several commands in parallel. That is, I have a big list of commands to run, and I'd like to have two of them running at any given time.

I know quite a bit about bash, so here are the requirements that make it tricky:

  • The commands have variable running time so I can't just spawn 2, wait, and then continue with the next two. As soon as one command is done a next command must be run.
  • The controlling process needs to know the exit code of each command so that it can keep a total of how many failed.

I'm thinking that I can somehow use trap, but I don't see an easy way to get the exit value of a child inside the handler.

So, any ideas on how this can be done?



Well, here is some proof of concept code that should probably work, but it breaks bash: invalid command lines generated, hanging, and sometimes a core dump.

# need monitor mode for trap CHLD to work
set -m
# store the PIDs of the children being watched
declare -a child_pids

function child_done
{
    echo "Child $1 result = $2"
}

function check_pid
{
    # check if running
    kill -s 0 $1
    if [ $? == 0 ]; then
        child_pids=("${child_pids[@]}" "$1")
    else
        wait $1
        ret=$?
        child_done $1 $ret
    fi
}

# check by copying pids, clearing list and then checking each, check_pid
# will add back to the list if it is still running
function check_done
{
    to_check=("${child_pids[@]}")
    child_pids=()

    for ((i=0;$i<${#to_check};i++)); do
        check_pid ${to_check[$i]}
    done
}

function run_command
{
    "$@" &
    pid=$!
    # check this pid now (this will add to the child_pids list if still running)
    check_pid $pid
}

# run check on all pids anytime some child exits
trap 'check_done' CHLD

# test
for ((tl=0;tl<10;tl++)); do
    run_command bash -c "echo FAIL; sleep 1; exit 1;"
    run_command bash -c "echo OKAY;"
done

# wait for all children to be done
wait

Note that this isn't what I ultimately want, but would be groundwork to getting what I want.



Followup: I've implemented a system to do this in Python, so anybody using Python for scripting can have the above functionality. Refer to shelljob.

Answered by Jay Hacker

GNU Parallel is awesomesauce:

$ parallel -j2 < commands.txt
$ echo $?

It will set the exit status to the number of commands that failed. If you have more than 253 commands, check out --joblog. If you don't know all the commands up front, check out --bg.
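For the --joblog case, a minimal sketch of counting failures from the log file could look like the following. It assumes the job log is the tab-separated table that GNU Parallel writes, with one header row and the exit value in the seventh column (Exitval). The names count_failed_jobs and jobs.log are illustrative, not part of GNU Parallel.

```shell
#!/usr/bin/env bash
# Count failed jobs recorded in a GNU Parallel job log.
# Assumption: the log is tab-separated, has one header line, and stores
# the exit value in column 7 ("Exitval").
# count_failed_jobs is a hypothetical helper name.
count_failed_jobs() {
    local joblog=$1
    # Skip the header (NR > 1) and count rows with a non-zero exit value.
    awk -F'\t' 'NR > 1 && $7 != 0 { n++ } END { print n + 0 }' "$joblog"
}

# Typical use (requires GNU Parallel to be installed):
#   parallel -j2 --joblog jobs.log < commands.txt
#   echo "failed: $(count_failed_jobs jobs.log)"
```

This only looks at the exit value column, so it is not limited by the range of a single exit status.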

Answered by linuts

Can I persuade you to use make? This has the advantage that you can tell it how many commands to run in parallel (modify the -j number).

echo -e ".PHONY: c1 c2 c3 c4\nall: c1 c2 c3 c4\nc1:\n\tsleep 2; echo c1\nc2:\n\tsleep 2; echo c2\nc3:\n\tsleep 2; echo c3\nc4:\n\tsleep 2; echo c4" | make -f - -j2

Stick it in a Makefile and it will be much more readable:

.PHONY: c1 c2 c3 c4
all: c1 c2 c3 c4
c1:
        sleep 2; echo c1
c2:
        sleep 2; echo c2
c3:
        sleep 2; echo c3
c4:
        sleep 2; echo c4

Beware, those are not spaces at the beginning of the lines, they're a TAB, so a cut and paste won't work here.

Put an "@" in front of each command if you don't want the command echoed, e.g.:

        @sleep 2; echo c1

This would stop on the first command that failed. If you need a count of the failures, you'd need to engineer that in the makefile somehow. Perhaps something like:

command || echo F >> failed

Then check the length of failed.
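The marker-file idea is not specific to make. Here is a minimal bash sketch of it, assuming a marker file named failed as above; mark_failure, count_failures and FAILED_FILE are hypothetical names introduced only for illustration.

```shell
#!/usr/bin/env bash
# Each command appends one "F" line to a marker file when it fails;
# the number of lines in that file is then the failure count.
# FAILED_FILE, mark_failure and count_failures are illustrative names.
FAILED_FILE=${FAILED_FILE:-failed}

mark_failure() {
    # Run the given command; on a non-zero exit, record one marker line.
    "$@" || echo F >> "$FAILED_FILE"
}

count_failures() {
    # A missing marker file means nothing has failed so far.
    if [ -f "$FAILED_FILE" ]; then
        wc -l < "$FAILED_FILE" | tr -d ' '
    else
        echo 0
    fi
}
```

In a makefile recipe, the "command || echo F >> failed" line plays the role of mark_failure. Remove the marker file before each run so that old failures are not counted again.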

Answered by qbert220

The problem you have is that you cannot wait for one of multiple background processes to complete. If you observe job status (using jobs) then finished background jobs are removed from the job list. You need another mechanism to determine whether a background job has finished.

The following example starts two background processes (sleeps). It then loops, using ps to see if they are still running. If one is not, it uses wait to gather the exit code and starts a new background process.

#!/bin/bash

sleep 3 &
pid1=$!
sleep 6 &
pid2=$!

while ( true ) do
    running1=`ps -p $pid1 --no-headers | wc -l`
    if [ $running1 == 0 ]
    then
        wait $pid1
        echo process 1 finished with exit code $?
        sleep 3 &
        pid1=$!
    else
        echo process 1 running
    fi

    running2=`ps -p $pid2 --no-headers | wc -l`
    if [ $running2 == 0 ]
    then
        wait $pid2
        echo process 2 finished with exit code $?
        sleep 6 &
        pid2=$!
    else
        echo process 2 running
    fi
    sleep 1
done
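The polling loop above hard-codes two slots and restarts the same sleeps forever. As a rough sketch, the same idea can be generalised to a queue of commands with a failure count. This version checks liveness with kill -0 instead of ps and takes each command as a string run through bash -c. The names run_queue and failed_count and the 0.1 second poll interval are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Polling job pool: keep at most max_jobs commands running and count
# how many exit with a non-zero status.
# run_queue and failed_count are illustrative names.
run_queue() {
    local max_jobs=$1; shift
    local -a pids=() still=()
    local pid
    failed_count=0
    while (( $# > 0 || ${#pids[@]} > 0 )); do
        # Fill free slots with the next queued commands.
        while (( $# > 0 && ${#pids[@]} < max_jobs )); do
            bash -c "$1" &
            pids+=("$!")
            shift
        done
        # Reap children that have finished; keep the ones still running.
        still=()
        for pid in "${pids[@]}"; do
            if kill -0 "$pid" 2>/dev/null; then
                still+=("$pid")
            else
                wait "$pid" || failed_count=$((failed_count + 1))
            fi
        done
        pids=("${still[@]}")
        # Sleep only when there is nothing to start right now.
        if (( ${#pids[@]} >= max_jobs || ($# == 0 && ${#pids[@]} > 0) )); then
            sleep 0.1
        fi
    done
}
```

For example, run_queue 2 'sleep 3' 'sleep 6; exit 1' 'sleep 1' runs two commands at a time and leaves the number of failures in failed_count.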

Edit: Using SIGCHLD (without polling):

#!/bin/bash

set -bm
trap 'ChildFinished' SIGCHLD

function ChildFinished() {
    running1=`ps -p $pid1 --no-headers | wc -l`
    if [ $running1 == 0 ]
    then
        wait $pid1
        echo process 1 finished with exit code $?
        sleep 3 &
        pid1=$!
    else
        echo process 1 running
    fi

    running2=`ps -p $pid2 --no-headers | wc -l`
    if [ $running2 == 0 ]
    then
        wait $pid2
        echo process 2 finished with exit code $?
        sleep 6 &
        pid2=$!
    else
        echo process 2 running
    fi
    sleep 1
}

sleep 3 &
pid1=$!
sleep 6 &
pid2=$!

sleep 1000d
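Newer versions of bash (4.3 and later) add wait -n, which returns as soon as any one child exits and reports that child's exit status. This avoids both the polling and the SIGCHLD handler. The following is only a sketch under that version assumption. It also assumes the calling shell has no other background children, since wait -n would pick those up too. run_pool and pool_failed are illustrative names.

```shell
#!/usr/bin/env bash
# Job pool built on `wait -n` (assumes bash 4.3 or newer).
# Keeps at most max_jobs commands running and counts non-zero exits.
# run_pool and pool_failed are illustrative names.
run_pool() {
    local max_jobs=$1; shift
    local running=0 cmd
    pool_failed=0
    for cmd in "$@"; do
        if (( running >= max_jobs )); then
            # Pool is full: block until any one child finishes.
            wait -n || pool_failed=$((pool_failed + 1))
            running=$((running - 1))
        fi
        bash -c "$cmd" &
        running=$((running + 1))
    done
    # Drain the children that are still running.
    while (( running > 0 )); do
        wait -n || pool_failed=$((pool_failed + 1))
        running=$((running - 1))
    done
}
```

For example, run_pool 2 'sleep 3' 'sleep 6; exit 1' 'sleep 1' starts the third command as soon as either of the first two exits.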

Answered by Ozair Kafray

I think the following example answers some of your questions. I am looking into the rest of the question.

(cat list1 list2 list3 | sort | uniq > list123) &
(cat list4 list5 list6 | sort | uniq > list456) &

From:

Running parallel processes in subshells
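The two subshells above are started in the background but never waited for, so their exit codes are lost. A minimal sketch of collecting them is to remember each PID from $! and pass it to wait. run_pair, status_a and status_b are hypothetical names, and the commands are passed as strings purely for illustration.

```shell
#!/usr/bin/env bash
# Run two commands in background subshells and collect both exit codes.
# run_pair, status_a and status_b are illustrative names.
run_pair() {
    ( bash -c "$1" ) &
    local pid_a=$!
    ( bash -c "$2" ) &
    local pid_b=$!
    # `wait PID` returns the exit status of that particular child.
    status_a=0
    wait "$pid_a" || status_a=$?
    status_b=0
    wait "$pid_b" || status_b=$?
}

# Example with the pipelines from the answer (the list files must exist):
#   run_pair 'cat list1 list2 list3 | sort | uniq > list123' \
#            'cat list4 list5 list6 | sort | uniq > list456'
#   echo "first: $status_a, second: $status_b"
```

Note that the status of a pipeline is that of its last command unless pipefail is set.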

Answered by dkr

There is another package for Debian systems named xjobs.

You might want to check it out:

http://packages.debian.org/wheezy/xjobs

Answered by crizCraig

If you cannot install parallel for some reason, this will work in plain shell or bash:

# String to detect failure in subprocess
FAIL_STR=failed_cmd

result=$(
    (false || echo ${FAIL_STR}1) &
    (true  || echo ${FAIL_STR}2) &
    (false || echo ${FAIL_STR}3)
)
wait

if [[ ${result} == *"$FAIL_STR"* ]]; then
    failure=`echo ${result} | grep -E -o "$FAIL_STR[^[:space:]]+"`
    echo The following commands failed:
    echo "${failure}"
    echo See above output of these commands for details.
    exit 1
fi

Here true and false are placeholders for your commands. You can also echo $? along with the FAIL_STR to get the command status.
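One way to include the status, sketched here with a hypothetical helper report_failure and an assumed "label:code" output format, is to echo $? right after the failing command:

```shell
#!/usr/bin/env bash
# String to detect failure in subprocess (same marker as above)
FAIL_STR=failed_cmd

# Run a command; on failure print "<FAIL_STR><label>:<exit code>".
# report_failure and the label:code format are illustrative choices.
report_failure() {
    local label=$1; shift
    "$@" || echo "${FAIL_STR}${label}:$?"
}

# The command substitution collects the output of all three subshells.
# The order of the lines is not guaranteed.
result=$(
    (report_failure 1 false) &
    (report_failure 2 true) &
    (report_failure 3 bash -c 'exit 7')
    wait
)

# Print one line per failed command, if any.
if [[ ${result} == *"$FAIL_STR"* ]]; then
    echo "The following commands failed (label:exit code):"
    echo "${result}" | grep -E -o "$FAIL_STR[^[:space:]]+"
fi
```

Here $? on the right-hand side of || still holds the exit status of the command that just failed, which is why it can be echoed directly.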
