bash: Running programs in parallel using xargs
Original question: http://stackoverflow.com/questions/28357997/
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Asked by Olivier
I currently have the following script.
#!/bin/bash
# script.sh
for i in {0..99}; do
script-to-run.sh input/ output/ $i
done
I wish to run it in parallel using xargs. I have tried
script.sh | xargs -P8
But the above only executes one job at a time. No luck with -n8 either. Adding & at the end of the line executed inside the script's for loop would try to run the script 100 times at once. How do I execute the loop only 8 at a time, up to 100 in total?
Answered by Etan Reisner
From the xargs man page:
This manual page documents the GNU version of xargs. xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (default is /bin/echo) one or more times with any initial-arguments followed by items read from standard input. Blank lines on the standard input are ignored.
This means that, in your example, xargs waits for and collects all of the output from your script and then runs echo <that output>. That is neither very useful nor what you wanted.
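The default behaviour is easy to see in isolation. In this small sketch, three lines of made-up input stand in for whatever script.sh prints (an assumption for illustration); with no command given, xargs hands all the items to a single echo, and -P8 changes nothing because there is only one command to run.

```shell
#!/bin/sh
# Three input lines stand in for the output of script.sh (hypothetical data).
# With no command given, xargs runs the default echo once with every item.
collected=$(printf 'a\nb\nc\n' | xargs -P8)
echo "$collected"   # prints: a b c
```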
The -n argument sets how many items from the input are used with each command that gets run (by itself, it says nothing about parallelism).
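As a quick illustration of -n on its own, the sketch below feeds five made-up items to xargs with -n 2. The items and the use of echo as the command are assumptions chosen only to make the grouping visible.

```shell
#!/bin/sh
# Five hypothetical input items, at most two per echo invocation.
batched=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo)
echo "$batched"
# prints:
# 1 2
# 3 4
# 5
```

The commands still run one after another here; the grouping only controls how many arguments each invocation receives.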
To do what you want with xargs you would need to do something more like this (untested):
printf %s\\n {0..99} | xargs -n 1 -P 8 script-to-run.sh input/ output/
Which breaks down like this:
- printf %s\\n {0..99} - Print one number per line from 0 to 99.
- Run xargs
  - taking at most one argument per run command line
  - and running up to eight processes at a time
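The breakdown above can be tried without the real script. The following is a minimal sketch, not the original setup: the inline sh -c worker and the temporary directory are hypothetical stand-ins for script-to-run.sh and output/, and seq 0 99 replaces the brace expansion so the example also runs under plain sh. Note that -P is supported by GNU and BSD xargs but is not part of POSIX.

```shell
#!/bin/sh
# Hypothetical stand-in for script-to-run.sh: each job only drops a marker file.
outdir=$(mktemp -d)

# seq prints 0..99 one per line; xargs passes one item per invocation (-n 1)
# and keeps up to 8 invocations running at once (-P 8).
# Inside sh -c: $1 = input dir, $2 = output dir, $3 = the item appended by xargs.
seq 0 99 | xargs -n 1 -P 8 sh -c 'touch "$2/$3.done"' _ input/ "$outdir"

# Every item should have produced exactly one marker file.
completed=$(ls "$outdir" | wc -l)
echo "completed jobs: $completed"
```

xargs returns only after all jobs have finished, so the count is taken once the whole batch is done.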
Answered by Ole Tange
With GNU Parallel you would do:
parallel script-to-run.sh input/ output/ {} ::: {0..99}
Add -P8 if you do not want to run one job per CPU core.
Unlike xargs, it will do The Right Thing even if the input contains spaces, ', or " (not the case here, though). It also makes sure the output from different jobs is not mixed together, so if you use the output you are guaranteed not to get half a line from two different jobs.
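To see why input containing spaces matters for xargs, here is a small sketch with one made-up input line. By default xargs splits it into two items; one common workaround, shown here as an assumption rather than as part of the answer, is to convert newlines to NUL bytes and use -0, which GNU and BSD xargs support although POSIX does not require it.

```shell
#!/bin/sh
# A single input line that contains a space (hypothetical data).
# Default xargs splits on blanks, so echo is invoked twice.
split_count=$(printf 'a b\n' | xargs -n 1 echo | wc -l)

# Converting newlines to NUL bytes and using -0 keeps the line as one item.
whole_count=$(printf 'a b\n' | tr '\n' '\0' | xargs -0 -n 1 echo | wc -l)

echo "default splitting: $split_count invocations; with -0: $whole_count"
```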
GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.
If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU.
GNU Parallel instead spawns a new process when one finishes, keeping the CPUs active and thus saving time.
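The difference between the two strategies can be estimated with a small simulation. This is only a sketch under stated assumptions: the 32 jobs are given hypothetical durations of 1 to 32 time units, "static" assigns the jobs to the 4 workers in fixed blocks of 8, and "dynamic" hands each job to whichever worker is free first. It does not invoke GNU Parallel or measure real processes.

```shell
#!/bin/sh
# Hypothetical job durations: job i takes i time units (1..32).
makespans=$(seq 1 32 | awk -v workers=4 '
{ d[NR] = $1 }
END {
    per = NR / workers
    for (w = 0; w < workers; w++) { fixed[w] = 0; greedy[w] = 0 }
    for (i = 1; i <= NR; i++) {
        # Static: job i belongs to a fixed worker, in blocks of "per" jobs.
        fixed[int((i - 1) / per)] += d[i]
        # Dynamic: job i goes to the worker with the least work so far,
        # which is the one that becomes free first.
        best = 0
        for (w = 1; w < workers; w++)
            if (greedy[w] < greedy[best]) best = w
        greedy[best] += d[i]
    }
    smax = 0; dmax = 0
    for (w = 0; w < workers; w++) {
        if (fixed[w] > smax) smax = fixed[w]
        if (greedy[w] > dmax) dmax = greedy[w]
    }
    print smax, dmax
}')
static_time=${makespans% *}
dynamic_time=${makespans#* }
echo "static blocks finish after $static_time units, dynamic after $dynamic_time"
```

With these made-up durations the static split is dominated by the worker that received the eight longest jobs, while the dynamic assignment spreads the work more evenly.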
Installation
If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:
$ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \
fetch -o - http://pi.dk/3 ) > install.sh
$ sha1sum install.sh | grep 3374ec53bacb199b245af2dda86df6c9
12345678 3374ec53 bacb199b 245af2dd a86df6c9
$ md5sum install.sh | grep 029a9ac06e8b5bc6052eac57b2c3c9ca
029a9ac0 6e8b5bc6 052eac57 b2c3c9ca
$ sha512sum install.sh | grep f517006d9897747bed8a4694b1acba1b
40f53af6 9e20dae5 713ba06c f517006d 9897747b ed8a4694 b1acba1b 1464beb4
60055629 3f2356f3 3e9c4e3c 76e3f3af a9db4b32 bd33322b 975696fc e6b23cfb
$ bash install.sh
For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README
Learn more
See more examples: http://www.gnu.org/software/parallel/man.html
Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html
Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel