Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/6775904/

Time: 2020-08-05 05:10:50  Source: igfitidea

grepping using the "|" alternative operator

Tags: regex, linux, grep

Asked by MattLBeck

The following is a sample of a large file named AT5G60410.gff:

Chr5    TAIR10  gene    24294890    24301147    .   +   .   ID=AT5G60410;Note=protein_coding_gene;Name=AT5G60410
Chr5    TAIR10  mRNA    24294890    24301147    .   +   .   ID=AT5G60410.1;Parent=AT5G60410;Name=AT5G60410.1;Index=1
Chr5    TAIR10  protein 24295226    24300671    .   +   .   ID=AT5G60410.1-Protein;Name=AT5G60410.1;Derives_from=AT5G60410.1
Chr5    TAIR10  exon    24294890    24295035    .   +   .   Parent=AT5G60410.1
Chr5    TAIR10  five_prime_UTR  24294890    24295035    .   +   .   Parent=AT5G60410.1
Chr5    TAIR10  exon    24295134    24295249    .   +   .   Parent=AT5G60410.1
Chr5    TAIR10  five_prime_UTR  24295134    24295225    .   +   .   Parent=AT5G60410.1
Chr5    TAIR10  CDS 24295226    24295249    .   +   0   Parent=AT5G60410.1,AT5G60410.1-Protein;
Chr5    TAIR10  exon    24295518    24295598    .   +   .   Parent=AT5G60410.1

I am having some trouble extracting specific lines from this file using grep. I wanted to extract all lines whose type, given in the third column, is "gene" or "exon". I was surprised when this did not work:

grep 'gene|exon' AT5G60410.gff

No results are returned. Where have I gone wrong?

Accepted answer by Jeff Foster

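You need to escape the | with a backslash; without it, grep's default (basic) regular expressions treat | as a literal character. The following should do the job: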

grep "gene\|exon" AT5G60410.gff

Answer by a'r

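By default, grep uses basic regular expressions, in which characters such as |, +, ?, ( and ) are treated as ordinary characters unless they are escaped with a backslash (the \| form is a GNU extension). So you could use the following: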

grep 'gene\|exon' AT5G60410.gff

However, you can switch grep to extended regular expressions, where an unescaped | means alternation, which is what you were expecting (egrep is the older spelling of grep -E):

egrep 'gene|exon' AT5G60410.gff
grep -E 'gene|exon' AT5G60410.gff
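
A minimal sketch comparing the three forms discussed above. The file name sample.gff and its three lines are made up for illustration and are not taken from the original file; the \| form assumes GNU grep, since it is an extension to basic regular expressions.

```shell
# Hypothetical sample: three GFF-like, tab-separated lines.
tmpdir=$(mktemp -d)
printf 'Chr5\tTAIR10\tgene\t1\t10\n' >  "$tmpdir/sample.gff"
printf 'Chr5\tTAIR10\tmRNA\t1\t10\n' >> "$tmpdir/sample.gff"
printf 'Chr5\tTAIR10\texon\t1\t5\n'  >> "$tmpdir/sample.gff"

# Basic regex, unescaped: | is a literal character, so no line matches.
# grep exits non-zero when nothing matches, hence the "|| true".
literal_count=$(grep -c 'gene|exon' "$tmpdir/sample.gff" || true)

# Basic regex with \| (GNU extension), and extended regex with a plain |.
bre_count=$(grep -c 'gene\|exon' "$tmpdir/sample.gff")
ere_count=$(grep -E -c 'gene|exon' "$tmpdir/sample.gff")

echo "literal=$literal_count bre=$bre_count ere=$ere_count"
```

With GNU grep this should report zero matches for the unescaped basic-regex form and two matches for each of the other two.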

Answer by Nathan Fellman

This is a different way of grepping for a few choices:

grep -e gene -e exon AT5G60410.gff

The -e switch specifies a pattern; when it is given more than once, grep prints the lines that match any of the patterns.
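
Because grep matches a pattern anywhere on the line, any of these commands can also pick up lines that mention "gene" or "exon" outside the third column. The sketch below contrasts that with a field test in awk. The sample file and its attribute values (ID=g1, Note=coding_gene) are hypothetical, and the awk approach is an alternative technique, not something proposed in the answers.

```shell
# Hypothetical sample: the mRNA line mentions "gene" only in its attributes.
tmpdir=$(mktemp -d)
printf 'Chr5\tTAIR10\tgene\t1\t10\tID=g1\n'            >  "$tmpdir/sample.gff"
printf 'Chr5\tTAIR10\tmRNA\t1\t10\tNote=coding_gene\n' >> "$tmpdir/sample.gff"
printf 'Chr5\tTAIR10\texon\t1\t5\tParent=g1\n'         >> "$tmpdir/sample.gff"

# Multiple -e patterns: a line matches if any pattern occurs anywhere in it,
# so the mRNA line is counted as well.
anywhere=$(grep -c -e gene -e exon "$tmpdir/sample.gff")

# Field test: keep only lines whose third tab-separated column is gene or exon.
awk -F '\t' '$3 == "gene" || $3 == "exon"' "$tmpdir/sample.gff" > "$tmpdir/third_col.gff"
third_col=$(grep -c '' "$tmpdir/third_col.gff")   # count the kept lines

echo "anywhere=$anywhere third_col=$third_col"
```

On this sample the grep form counts all three lines, while the awk form keeps only the gene and exon records.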

Answer by ennuikiller

This will work:

grep "gene\|exon" AT5G60410.gff

Answer by entpnerd

I found this question while googling for a particular problem I was having that involved piping a command into a grep command that used the alternation operator in a regex, so I thought I would contribute my more specialized answer.

The error I faced turned out to be caused by the shell's pipe operator (|) earlier in the command line, not by the alternation operator in the grep regex (which is also written |). The answer for me was to properly escape or quote special shell characters such as & before assuming that the problem lay in my grep regex.

For example, the command I executed on my local machine was:

get http://localhost/foobar-& | grep "fizz\|buzz"

This command resulted in the following error:

-bash: syntax error near unexpected token `|'

This error was corrected by changing my command to:

get "http://localhost/foobar-&" | grep "fizz\|buzz"

By enclosing the URL in double quotes, so that the shell no longer treats the & character as special, I was able to resolve my issue. The problem had nothing to do with the alternation operator at all.
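
The difference can be reproduced without the get command. In the sketch below, echo stands in for get, and the URL in the quoted case is changed to contain "fizz" so that grep has something to match; both are assumptions made for illustration. An unquoted & is a shell control operator that ends the command and runs it in the background, so the | that follows has no command on its left.

```shell
# Unquoted: the shell parses "echo ... &" as a complete background command,
# then finds a stray |, and refuses to run the line (syntax error).
unquoted_status=0
sh -c 'echo http://localhost/foobar-& | grep "fizz\|buzz"' 2>/dev/null || unquoted_status=$?

# Quoted: & is ordinary text passed to echo, and the pipeline parses normally.
quoted_out=$(echo "http://localhost/fizz-&" | grep "fizz\|buzz")

echo "unquoted exit status: $unquoted_status"
echo "quoted output: $quoted_out"
```

The unquoted command should fail with a non-zero exit status before grep ever runs, while the quoted one passes the full URL through the pipe.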
