Original source: http://stackoverflow.com/questions/18384505/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How do I use parallel programming/multi threading in my bash script?
Asked by Komal Rathi
This is my script:
#!/bin/bash
#script to loop through directories to merge fastq files
sourcedir=/path/to/source
destdir=/path/to/dest
for f in $sourcedir/*
do
    fbase=$(basename "$f")
    echo "Inside $fbase"
    zcat $f/*R1*.fastq.gz | gzip > $destdir/"$fbase"_R1.fastq.gz
    zcat $f/*R2*.fastq.gz | gzip > $destdir/"$fbase"_R2.fastq.gz
done
There are about 30 sub-directories in the directory 'source'. Each sub-directory contains several R1.fastq.gz and R2.fastq.gz files that I want to merge into a single R1.fastq.gz file and a single R2.fastq.gz file, then save the merged files to the destination directory. My code works fine, but I need to speed it up because of the amount of data. Is there any way to implement multi-threaded programming in my script? How can I run it so that multiple jobs run in parallel? I am new to bash scripting, so any help would be appreciated.
Accepted answer by Zero Piraeus
The simplest way is to execute the commands in the background, by adding & to the end of the command:
#!/bin/bash
#script to loop through directories to merge fastq files
sourcedir=/path/to/source
destdir=/path/to/dest
for f in $sourcedir/*
do
    fbase=$(basename "$f")
    echo "Inside $fbase"
    zcat $f/*R1*.fastq.gz | gzip > $destdir/"$fbase"_R1.fastq.gz &
    zcat $f/*R2*.fastq.gz | gzip > $destdir/"$fbase"_R2.fastq.gz &
done
From the bash manual:
If a command is terminated by the control operator '&', the shell executes the command asynchronously in a subshell. This is known as executing the command in the background. The shell does not wait for the command to finish, and the return status is 0 (true). When job control is not active (see Job Control), the standard input for asynchronous commands, in the absence of any explicit redirections, is redirected from /dev/null.
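Because the shell does not wait for background commands, a loop over about 30 sub-directories starts roughly 60 pipelines at once and the script can reach its end while they are still running. A minimal sketch of one way to cap that, assuming plain batching is acceptable: start a fixed number of jobs, call wait, then start the next batch. The function name merge_all and the max_jobs parameter are hypothetical and not part of the original answer.

```shell
#!/bin/bash
# Sketch only: merge each sub-directory's R1/R2 files in the background,
# running at most max_jobs pipelines per batch.
# merge_all and max_jobs are hypothetical names used for illustration.

merge_all() {
    local sourcedir="$1"
    local destdir="$2"
    local max_jobs="${3:-4}"   # assumed default of 4 jobs per batch
    local running=0
    local f fbase pair

    for f in "$sourcedir"/*/; do
        f=${f%/}                       # drop the trailing slash left by the glob
        [ -d "$f" ] || continue        # skip if the glob matched nothing
        fbase=$(basename "$f")
        for pair in R1 R2; do
            # Same pipeline as in the answer, started in the background.
            zcat "$f"/*"$pair"*.fastq.gz | gzip > "$destdir/${fbase}_${pair}.fastq.gz" &
            running=$((running + 1))
            if [ "$running" -ge "$max_jobs" ]; then
                wait                   # block until the whole batch has finished
                running=0
            fi
        done
    done
    wait                               # do not return before the last batch is done
}
```

It could be called as merge_all /path/to/source /path/to/dest 8. Batching is the simplest limit but not the tightest: a batch only ends when its slowest job does, so some cores may sit idle towards the end of each batch.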
Answer by tejas
I am not sure, but you can try using & at the end of the command, like this:
zcat $f/*R1*.fastq.gz | gzip > $destdir/"$fbase"_R1.fastq.gz &
zcat $f/*R2*.fastq.gz | gzip > $destdir/"$fbase"_R2.fastq.gz &
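One thing to keep in mind with &: the shell reports status 0 as soon as the job is started, so a failing job goes unnoticed unless its real exit status is collected later. A rough sketch of how that might be done, assuming each job can be passed as a command string: record $! after every launch, then call wait on each PID. The helper name run_and_check is hypothetical.

```shell
#!/bin/bash
# Sketch only: run each argument as a background job, then use `wait PID`
# to recover every job's real exit status.
# run_and_check is a hypothetical helper name used for illustration.

run_and_check() {
    local pids=""
    local failed=0
    local cmd pid

    for cmd in "$@"; do
        sh -c "$cmd" &            # each argument runs as its own background job
        pids="$pids $!"           # $! is the PID of the most recent background job
    done

    for pid in $pids; do
        if ! wait "$pid"; then    # wait PID returns that job's exit status
            echo "job $pid failed" >&2
            failed=$((failed + 1))
        fi
    done

    return "$failed"              # number of failed jobs, 0 if all succeeded
}
```

For a pipeline such as zcat ... | gzip, the status seen by wait is that of the last command in the pipeline, so a failure in zcat alone would not be reported by this sketch.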