bash: read a list of files on Unix and run a command
Original question: http://stackoverflow.com/questions/18028643/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Read list of files on unix and run command
Asked by user2647734
I am pretty new at shell scripting and I have been struggling all day to figure out how to perform a "for" command. Essentially, what I am trying to do is the following:
I have a list.txt file with a bunch of names:
name1
name2
name3
For every name in the list, there are two different files, each with a different ending to the name. For example:
name1_R1
name1_R2
The program I am trying to run is called sickle. Basically, it takes two files (that correspond to each other) and runs an analysis on them, hence requiring me to have this naming scheme. The sickle command is as follows:
sickle pe -f input_file1.fastq -r input_file2.fastq -t sanger \
If someone could help me out, at least just by telling me how to get unix to read the list of files and treat each line independently I think I could go from there. I tried a few things, but none of them worked.
Answered by Jonathan Leffler
There are a couple of ways to do it. Since the names are 'one per line' in the data file, we can assume there are no newlines in the file names.
for loop
for file in $(<list.txt)
do
    sickle pe -f "${file}_file1.fastq" -r "${file}_file2.fastq" -t sanger
done
while loop with read
while read file
do
    sickle pe -f "${file}_file1.fastq" -r "${file}_file2.fastq" -t sanger
done < list.txt
The for loop only works if there are no blanks in the names (nor other white-space characters such as tabs). The while loop is clean as long as you don't have newlines in the names, though using while read -r file would give you even better protection against the unexpected. The double quotes around the file name in the for loop are decorative (but harmless) because the file names cannot contain blanks, but those in the while loop prevent file names containing blanks from being split when they should not be split. It's often a good idea to quote variables every time you use them, though it strictly only matters when the variable might contain blanks but you don't want the value split up.
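To make the difference concrete, here is a minimal sketch that feeds both loops a name containing a blank. The temporary directory and the sample name 'my sample' are made up for the demonstration; nothing here calls sickle.

```shell
#!/usr/bin/env bash
# Demo: how each loop treats a name that contains a blank.
# The scratch directory and the name 'my sample' are illustrative only.
workdir=$(mktemp -d)
printf '%s\n' 'name1' 'my sample' > "$workdir/list.txt"

# for loop: the unquoted $(<file) is word-split, so 'my sample' becomes two items.
for_count=0
for file in $(<"$workdir/list.txt"); do
    for_count=$((for_count + 1))
done

# while loop: read -r consumes one whole line per iteration.
while_count=0
while IFS= read -r file; do
    while_count=$((while_count + 1))
done < "$workdir/list.txt"

echo "for loop saw $for_count items; while loop saw $while_count lines"
rm -rf "$workdir"
```

With two lines in the list, the for loop iterates three times and the while loop twice, which is why the while form is the safer default.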
I've had to guess what names should be passed to the sickle command since your question is not clear about it. I'm 99% sure I've guessed wrong, but it matches the different suffixes in your sample command assuming the base name of file is input. I've omitted the trailing backslash; it is the 'escape' character and it is not clear what you really want there.
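If the files actually follow the name1_R1 / name1_R2 pattern shown in the question, the loop might look like the sketch below. It is a dry run: echo prints each command instead of executing sickle. The .fastq extension is an assumption borrowed from the sample command, and whatever followed the trailing backslash in the question is left out because it is unknown.

```shell
#!/usr/bin/env bash
# Dry run: print the sickle command for each name instead of executing it.
# Assumption: the paired files are called <name>_R1.fastq and <name>_R2.fastq.
workdir=$(mktemp -d)
printf '%s\n' name1 name2 name3 > "$workdir/list.txt"

planned=$(
    while IFS= read -r name; do
        # Remove the leading 'echo' once the printed commands look right.
        echo sickle pe -f "${name}_R1.fastq" -r "${name}_R2.fastq" -t sanger
    done < "$workdir/list.txt"
)
printf '%s\n' "$planned"
rm -rf "$workdir"
```

Printing the commands first is a cheap way to confirm the file names before any data is processed.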
Answered by Todd A. Jacobs
Use a Bash For-Loop
Bash has a very reasonable for-loop as one of its looping constructs. You can replace the echo command below with whatever custom command you want. For example:
for file in name1 name2 name3; do
    echo "${file}_R1" "${file}_R2"
done
The idea is that the loop assigns each filename to the file variable, then you append the _R1 and _R2 suffixes to it. Note that quoting may be important, and does no harm if it isn't needed, so you ought to use it as a defensive programming measure.
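The same for-loop can take its names from list.txt rather than a hard-coded list. One way, sketched here under the assumption that bash 4 or newer is available, is to load the lines into an array with the mapfile builtin; the array names names and pairs are illustrative.

```shell
#!/usr/bin/env bash
# Load one name per line into an array (mapfile needs bash 4 or newer).
workdir=$(mktemp -d)
printf '%s\n' name1 name2 name3 > "$workdir/list.txt"

mapfile -t names < "$workdir/list.txt"   # -t strips the trailing newlines

pairs=()
for file in "${names[@]}"; do            # quoted expansion keeps each line whole
    pairs+=("${file}_R1 ${file}_R2")
    echo "${file}_R1" "${file}_R2"
done
rm -rf "$workdir"
```

Because "${names[@]}" is quoted, a name containing blanks would still arrive in the loop as a single item.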
Use xargs for Argument Lists
If you want to read from a file instead of using the for-loop directly, you can use Bash's read builtin, but xargs is often more portable across shells. For example, the following uses flags available in the version of xargs from GNU findutils to read in arguments from a file and then append a suffix to each of them:
$ xargs --arg-file=list.txt --max-args=1 -I{} /bin/echo "{}_R1" "{}_R2"
name1_R1 name1_R2
name2_R1 name2_R2
name3_R1 name3_R2
Again, you can replace "echo" with the command line of your choice.
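Where the GNU long options are not available, a similar result should be reachable with only the POSIX -I option, reading the list on standard input. This is a sketch rather than a drop-in replacement; the scratch directory and the variable xargs_out are illustrative.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf '%s\n' name1 name2 name3 > "$workdir/list.txt"

# -I{} runs the command once per input line, replacing {} with that line.
xargs_out=$(xargs -I{} echo "{}_R1" "{}_R2" < "$workdir/list.txt")
printf '%s\n' "$xargs_out"
rm -rf "$workdir"
```

Keep in mind that xargs still treats quotes and backslashes in its input specially, so names containing those characters need extra care.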
Answered by nneonneo
Use a while loop with read:
while read fn; do
    <command> "${fn}_R1" "${fn}_R2"
done < list.txt
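A slightly more defensive variant of this loop is sketched below. It assumes the list might contain empty lines or lack a newline after the last name; the sample input and the seen array are made up for the demonstration, and echo stands in for the real command.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Sample input: a blank line in the middle and no newline after the last name.
printf 'name1\n\nname2\nname3' > "$workdir/list.txt"

seen=()
# IFS= keeps leading/trailing blanks; -r keeps backslashes literal;
# the || test still processes a final line that lacks a trailing newline.
while IFS= read -r fn || [ -n "$fn" ]; do
    [ -z "$fn" ] && continue      # skip empty lines
    seen+=("${fn}_R1 ${fn}_R2")
    echo "${fn}_R1" "${fn}_R2"
done < "$workdir/list.txt"
rm -rf "$workdir"
```

Without the `|| [ -n "$fn" ]` test, a last line with no trailing newline would be read but never processed.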