bash: find the lines with a pattern repeated at least twice?
Original question: http://stackoverflow.com/questions/19319514/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Find the lines with a pattern repeated at least twice?
Asked by sam
I have a big file with lines like the ones below:
abc|Abc_12 cdf_rhtdm cdf|Cdf22 abc|Abc_100 ijm|smthr12
ddf|rtg_2 qwe_werth ddf|Cs2 abc|Abc_f0 ijm|styhr12 abc|Abc_33 ddf|Cs2 ddf|rtg_2
ddd_hm ddf|Cs2 ght|d_100 abc|Abc_55
cdf_rshtdm sdf|Cdf22 ght|d_100 ijm|smthr12
I want to create a new file containing only the lines in which a pattern like abc| occurs at least two times.
So, here the output will be:
abc|Abc_12 cdf_rhtdm cdf|Cdf22 abc|Abc_100 ijm|smthr12
ddf|rtg_2 qwe_werth ddf|Cs2 abc|Abc_f0 ijm|styhr12 abc|Abc_33 ddf|Cs2 ddf|rtg_2
Answered by anubhava
Using grep -P (PCRE):
grep -P '(abc\|.*?){2}' file
abc|Abc_12 cdf_rhtdm cdf|Cdf22 abc|Abc_100 ijm|smthr12
ddf|rtg_2 qwe_werth ddf|Cs2 abc|Abc_f0 ijm|styhr12 abc|Abc_33 ddf|Cs2 ddf|rtg_2
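A possible variant, not part of the original answer: grep -P depends on a grep built with PCRE support, so where that is unavailable the same idea can be sketched with an extended regex (-E) and a greedy .* instead of the lazy .*?. The file names sample.txt and filtered.txt are hypothetical, and the sample data is copied from the question. The redirect writes the matching lines to a new file, which is what the question asks for.

```shell
# Sample data copied from the question; sample.txt is a hypothetical file name.
cat > sample.txt <<'EOF'
abc|Abc_12 cdf_rhtdm cdf|Cdf22 abc|Abc_100 ijm|smthr12
ddf|rtg_2 qwe_werth ddf|Cs2 abc|Abc_f0 ijm|styhr12 abc|Abc_33 ddf|Cs2 ddf|rtg_2
ddd_hm ddf|Cs2 ght|d_100 abc|Abc_55
cdf_rshtdm sdf|Cdf22 ght|d_100 ijm|smthr12
EOF

# -E selects extended regular expressions, where \| is a literal pipe.
# The group "abc| followed by anything" has to match twice in a row,
# so only lines with at least two occurrences of abc| are kept.
grep -E '(abc\|.*){2}' sample.txt > filtered.txt

cat filtered.txt
```

Changing the 2 inside the braces should raise or lower the required number of occurrences.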
Answered by anubhava
One way is using grep with a basic regex:
grep '^.*\(abc|\).*\(abc|\).*$' your_file
abc|Abc_12 cdf_rhtdm cdf|Cdf22 abc|Abc_100 ijm|smthr12
ddf|rtg_2 qwe_werth ddf|Cs2 abc|Abc_f0 ijm|styhr12 abc|Abc_33 ddf|Cs2 ddf|rtg_2
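As a side note that is not from the original answer: grep matches anywhere in a line, and in a basic regex an unescaped | is a literal pipe, so the anchors, the groups, and the leading and trailing .* look optional here. The sketch below compares both forms on the sample data from the question; sample.txt, full.txt, and short.txt are hypothetical file names.

```shell
# Sample data copied from the question; sample.txt is a hypothetical file name.
cat > sample.txt <<'EOF'
abc|Abc_12 cdf_rhtdm cdf|Cdf22 abc|Abc_100 ijm|smthr12
ddf|rtg_2 qwe_werth ddf|Cs2 abc|Abc_f0 ijm|styhr12 abc|Abc_33 ddf|Cs2 ddf|rtg_2
ddd_hm ddf|Cs2 ght|d_100 abc|Abc_55
cdf_rshtdm sdf|Cdf22 ght|d_100 ijm|smthr12
EOF

# Form used in the answer: anchors, groups, and surrounding .* included.
grep '^.*\(abc|\).*\(abc|\).*$' sample.txt > full.txt

# Shorter form: two literal "abc|" with anything in between.
grep 'abc|.*abc|' sample.txt > short.txt

# cmp -s exits with status 0 only when both files are byte-for-byte identical.
cmp -s full.txt short.txt && echo "same result"
```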
Answered by gpmurthy
The following regex should yield the output you are looking for...
.*?(abc\|).*?(abc\|).*?
Answered by fedorqui 'SO stop harming'
With awk it can be done quite simply:
$ awk '{if (gsub(/abc\|/, "abc", $0) >= 2) print}' file
abcAbc_12 cdf_rhtdm cdf|Cdf22 abcAbc_100 ijm|smthr12
ddf|rtg_2 qwe_werth ddf|Cs2 abcAbc_f0 ijm|styhr12 abcAbc_33 ddf|Cs2 ddf|rtg_2
Explanation
From the AWK manual:
gsub(regexp, replacement, target)
The gsub function returns the number of substitutions made.
So we check its return value and, if it is 2 or more, we print the line.
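One thing to watch: because gsub rewrites the line it counts in, the command above prints the lines with each abc| changed to abc, as its output shows. A minimal sketch of a variant that keeps the lines intact, assuming a POSIX awk, uses "&" as the replacement so every match is replaced with itself. This is not part of the original answer, and sample.txt and filtered_awk.txt are hypothetical file names.

```shell
# Sample data copied from the question; sample.txt is a hypothetical file name.
cat > sample.txt <<'EOF'
abc|Abc_12 cdf_rhtdm cdf|Cdf22 abc|Abc_100 ijm|smthr12
ddf|rtg_2 qwe_werth ddf|Cs2 abc|Abc_f0 ijm|styhr12 abc|Abc_33 ddf|Cs2 ddf|rtg_2
ddd_hm ddf|Cs2 ght|d_100 abc|Abc_55
cdf_rshtdm sdf|Cdf22 ght|d_100 ijm|smthr12
EOF

# "&" in the replacement stands for the matched text, so the line content is
# unchanged while gsub still returns the number of "abc|" occurrences.
# With no target argument gsub works on $0, and a pattern without an action
# prints the line when the condition is true.
awk 'gsub(/abc\|/, "&") >= 2' sample.txt > filtered_awk.txt

cat filtered_awk.txt
```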