Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow. Original question: http://stackoverflow.com/questions/11560544/

Time: 2020-09-09 22:27:27  Source: igfitidea

How to process every other line in bash

bash awk

Asked by Perlnika

I would like to print the odd lines (1, 3, 5, 7, ...) unchanged, but process the even lines (2, 4, 6, 8, ...) with a pipeline beginning with grep. I would like to write everything to a new file (odd lines unchanged, new values for the even lines).

I know how to print every other line in awk:

awk ' NR % 2 == 1 { print; } NR % 2 ==0 {print; }' file.fasta

However, for the even lines I don't want to use {print; }; I want to use my grep pipeline instead.

Any advice would be appreciated. Thanks a lot.

Accepted answer by Shawn Chin

If you're planning to do a simple grep, you can do away with the additional step and do the filtering within awk itself, e.g.:

awk 'NR % 2 {print} !(NR % 2) && /pattern/ {print}' file.fasta
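
To see what that one-liner does, here is a minimal sketch run against a made-up file. The file name sample.txt, its contents, and the function name filter_even are assumptions for illustration only; the awk program is the one shown above.

```shell
# Hypothetical sample file: odd lines are headers, even lines are data.
printf 'h1\nfoo pattern\nh2\nno match\nh3\npattern again\n' > sample.txt

# Odd lines pass through untouched; even lines are kept only if they match.
filter_even() {
    awk 'NR % 2 {print} !(NR % 2) && /pattern/ {print}' "$1"
}

filter_even sample.txt
# prints:
# h1
# foo pattern
# h2
# h3
# pattern again
```

Line 4 ("no match") is an even line that does not match /pattern/, so it is the only line dropped.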

However, if you intend to do a lot more then, as chepner already pointed out, you can indeed pipe from inside awk. For example:

awk 'NR % 2 {print} !(NR % 2) {print | "grep pattern | rev" }' file.fasta

That opens a pipe to the command "grep pattern | rev" (note the surrounding quotes) and redirects the print output to it. Do note that the output in this case may not be what you expect: you will end up with all odd lines being output first, followed by the output of the piped command (which consumes the even lines).
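
One way to convince yourself that a single command consumes all the even lines is to swap in "wc -l" as the piped command. This is only a sketch: sample.txt, its contents, and the function name pipe_even are made up for the example. Where the count lands relative to the odd lines depends on buffering, so the sketch makes no claim about ordering.

```shell
# Hypothetical sample file with three odd and three even lines.
printf 'h1\na\nh2\nb\nh3\nc\n' > sample.txt

# All even lines are written to the same "wc -l" process, which only
# reports once its input is closed (here, when awk exits).
pipe_even() {
    awk 'NR % 2 {print} !(NR % 2) {print | "wc -l"}' "$1"
}

pipe_even sample.txt
```

The output contains h1, h2, h3 and a single count of 3 (possibly padded with spaces, depending on the wc implementation), not one count per even line.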



(In response to your comments) to count the number of chars in each even line, try:

awk 'NR % 2 {print} !(NR % 2) {print length($0)}' file.fasta

Answer by chepner

You can pipe directly from inside awk:

awk 'BEGIN{ cmd = "grep -io 7[actgn]7 | wc -l" } NR % 2 { print } NR % 2 == 0 { print | cmd; close(cmd) }' file.fasta

Be aware, however, that this will not preserve the order of your input file.

(The selected answer is better for the task at hand, but I'll leave this answer here as an example of piping the print statement to an external command.)

Answer by Paused until further notice.

In order to have your pipeline output appear in order with your AWK output, you need to close the pipeline at each iteration. This is, of course, very inefficient.
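
A minimal sketch of closing the pipeline at each iteration might look like the following. The sample file, the function name count_in_order, and passing the command through -v cmd are assumptions made for the example. The fflush() call is also an addition here: when awk's output is not a terminal it is usually buffered, so without flushing the odd lines could still appear after the pipeline output.

```shell
# Hypothetical sample input: headers on odd lines, sequences on even lines.
printf '>seq1\nACGTxxNN\n>seq2\nacg--t\n' > sample.fasta

count_in_order() {
    awk -v cmd="grep -o '[actgnACTGN]' | wc -l" '
        NR % 2      { print; fflush() }          # odd line: print and flush right away
        NR % 2 == 0 { print | cmd; close(cmd) }  # even line: run the pipeline and wait for it
    ' "$1"
}

count_in_order sample.fasta
```

On this input the headers alternate with the counts 6 and 4 (some wc implementations pad the number with spaces). A new grep and wc process is started for every even line, which is where the inefficiency comes from.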

You apparently don't want to count characters that are not in the specified list, so length($0) won't work. This will work and should be a lot faster than the pipeline method:

awk 'NR % 2 { print } NR % 2 == 0 {n = split($0, a, /[^actgnACTGN]/); print length($0) - n + 1}' file.fasta

It works by splitting the line using the characters you don't want as delimiters, subtracting the count of the resulting substrings from the length of the line, and adding 1. In essence, it subtracts the number of unwanted characters from the length of the line, leaving the number of wanted characters as the result.
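
The arithmetic can be checked on a small made-up input. In the sketch below, sample.fasta and the function names are assumptions; count_split is the approach from this answer, and count_gsub is a different technique (counting matches via the return value of gsub) included only as a cross-check.

```shell
# Hypothetical sample input: headers on odd lines, sequences on even lines.
printf '>seq1\nACGTxxNN\n>seq2\nacg--t\n' > sample.fasta

# "ACGTxxNN" has length 8 and splits into "ACGT", "", "NN" (n = 3),
# so 8 - 3 + 1 = 6 wanted characters.
count_split() {
    awk 'NR % 2 { print }
         NR % 2 == 0 { n = split($0, a, /[^actgnACTGN]/); print length($0) - n + 1 }' "$1"
}

# Cross-check: gsub returns the number of matches; "&" puts each match back unchanged.
count_gsub() {
    awk 'NR % 2 { print }
         NR % 2 == 0 { print gsub(/[actgnACTGN]/, "&") }' "$1"
}

count_split sample.fasta
count_gsub sample.fasta
# each call prints:
# >seq1
# 6
# >seq2
# 4
```

One caveat worth keeping in mind: for an empty even line, split returns 0, so the formula gives 1 instead of 0, whereas the gsub variant reports 0.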