Linux Bash 脚本并行处理有限数量的命令

Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/19543139/

Date: 2020-08-07 01:10:21  Source: igfitidea

Bash script processing limited number of commands in parallel

linux bash shell

Asked by AL-Kateb

I have a bash script that looks like this:


#!/bin/bash
wget LINK1 >/dev/null 2>&1
wget LINK2 >/dev/null 2>&1
wget LINK3 >/dev/null 2>&1
wget LINK4 >/dev/null 2>&1
# ..
# ..
wget LINK4000 >/dev/null 2>&1

But processing each line serially, waiting for one command to finish before moving on to the next, is very time consuming. I want to process, for instance, 20 lines at once, and when they're finished, process the next 20 lines.


I thought of using wget LINK1 >/dev/null 2>&1 & to send each command to the background and carry on, but there are 4000 lines here, which means I will run into performance issues, not to mention the limit on how many processes I should start at the same time, so this is not a good idea.


One solution I'm thinking of right now is to check whether any of the commands is still running; for instance, after every 20 lines I could add this loop:


while [ "$(ps -ef | grep KEYWORD | grep -v grep | wc -l)" -gt 0 ]; do
    sleep 1
done

Of course, in this case I would need to append & to the end of each line! But I feel this is not the right way to do it.


So how do I actually group each set of 20 lines together and wait for them to finish before moving on to the next 20? The script is generated dynamically, so I can do whatever math I want on it while it's being generated, but it does NOT have to use wget; that was just an example, so any wget-specific solution is not going to do me any good.


Accepted answer by devnull

Use the wait built-in:


process1 &
process2 &
process3 &
process4 &
wait
process5 &
process6 &
process7 &
process8 &
wait

For the above example, the four processes process1 through process4 would be started in the background, and the shell would wait until those are completed before starting the next set.


From the GNU manual:


wait [jobspec or pid ...]

Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for. If a job spec is given, all processes in the job are waited for. If no arguments are given, all currently active child processes are waited for, and the return status is zero. If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.

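Applied to the question's 4000-line script, the batching can be automated instead of written out by hand. A minimal sketch follows; the task function, the loop bound, and the LINK names are placeholders standing in for the real wget calls:

```shell
#!/bin/bash
# Batch-runner sketch: start jobs in the background and call `wait`
# after every 20th one, so at most 20 run at a time.
task() { sleep 0.01; }      # placeholder for: wget "$1" >/dev/null 2>&1

batch=20
count=0
for i in $(seq 1 100); do   # stands in for the 4000 generated lines
    task "LINK$i" &
    if (( ++count % batch == 0 )); then
        wait                # block until the current batch of 20 finishes
    fi
done
wait                        # catch the final, possibly partial batch
```

The downside of this fixed-batch scheme is that each batch runs only as fast as its slowest job; the whole pool drains before the next 20 start.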

Answer by choroba

See GNU parallel. Its syntax is similar to that of xargs, but it runs the commands in parallel.

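For the question's case, a hypothetical invocation might look like this (it assumes GNU parallel is installed and that urls.txt holds one URL per line, neither of which is given in the original question):

```shell
# Run at most 20 wget jobs at a time; parallel appends each input
# line from urls.txt as an argument to the command.
cat urls.txt | parallel -j 20 wget -q
```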

Answer by Binpix

You can run 20 processes and use the command:


wait

Your script will wait and continue when all your background jobs are finished.

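A refinement on this, assuming bash 4.3 or newer: wait -n returns as soon as any single background job exits, so the pool can be topped up one job at a time instead of draining a whole batch. A sketch (sleep stands in for the real wget calls):

```shell
#!/bin/bash
# Rolling-pool sketch (requires bash >= 4.3 for `wait -n`).
# As soon as any job exits, a new one is started, keeping the
# number of running jobs at the maximum.
max=20
running=0
for i in $(seq 1 100); do      # stands in for the 4000 lines
    sleep 0.01 &               # placeholder for: wget "LINK$i" >/dev/null 2>&1
    if (( ++running >= max )); then
        wait -n                # returns when any one job exits
        (( running-- ))
    fi
done
wait                           # drain the remaining jobs
```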

Answer by Vader B

In fact, xargs can run commands in parallel for you. There is a special -P max_procs command-line option for that. See man xargs.

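A sketch of this for the question's scenario; echo stands in for wget so the example runs offline, and urls.txt is an assumed input file:

```shell
# xargs -P sketch: run up to 2 commands at a time, one argument each
# (-n 1). `echo fetched` is a stand-in for `wget -q`.
printf '%s\n' LINK1 LINK2 LINK3 LINK4 |
    xargs -n 1 -P 2 echo fetched
# For the real case, something like: xargs -n 1 -P 20 wget -q < urls.txt
```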