grep +A: print everything after match
Original question: http://stackoverflow.com/questions/18166552/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by B.Mr.W.
Hi, I have a file that contains a list of URLs, as shown below:
file1:
http://www.google.com
http://www.bing.com
http://www.yahoo.com
http://www.baidu.com
http://www.yandex.com
....
I want to get all the records after http://www.yahoo.com, so that the result looks like this:
file2:
http://www.baidu.com
http://www.yandex.com
....
I know that I can use grep to find the line number where yahoo.com appears:
$ grep -n 'http://www.yahoo.com' file1
3:http://www.yahoo.com
But I don't know how to get the rest of the file after line 3. I also know that grep has the -A flag, which prints the lines after a match. However, you need to specify how many lines you want after the match. I am wondering whether there is a way around that. Something like:
PSEUDO CODE:
$ grep -n 'http://www.yahoo.com' -A all file1 > file2
I know we could use the line number I got, together with wc -l, to compute the number of lines after yahoo.com, but that feels pretty lame.
I am looking forward to a handy and easy solution. Feel free to criticize me for overcomplicating the problem right from the start; awk and sed commands are also welcome!
Answered by Hai Vu
Awk
If you don't mind using awk:
awk '/yahoo/{y=1;next}y' data.txt
This script has two parts:
/yahoo/ { y = 1; next }
y
The first part states that if we encounter a line containing yahoo, we set the variable y=1 and then skip that line (the next command jumps to the next line, skipping any further processing of the current line). Without the next command, the yahoo line itself would be printed.
The second part is shorthand for:
y != 0 { print }
This means that, for each line, if the variable y is non-zero, we print that line. In awk, referring to a variable creates it, with a value of either zero or the empty string, depending on context. Before yahoo is encountered, y is 0, so the script prints nothing. After yahoo is encountered, y is 1, so every line after that is printed.
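To see the two-part script in action, here is a small sketch that recreates the sample list from the question in a temporary file and runs the awk command against it. The temporary file and the variable names sample and after are illustrative only and are not part of the original answer.

```shell
#!/bin/sh
# Recreate the sample URL list from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'http://www.google.com' \
  'http://www.bing.com' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yandex.com' > "$sample"

# Part 1: on the yahoo line, set the flag y and skip the line.
# Part 2: the bare pattern y prints every line while y is non-zero.
after=$(awk '/yahoo/ { y = 1; next } y' "$sample")
printf '%s\n' "$after"

rm -f "$sample"
```

With this input, only the baidu and yandex lines are printed.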
Sed
Or, using sed, the following will delete everything up to and including the line with yahoo:
sed '1,/yahoo/d' data.txt
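One caveat, shown here as a small sketch on made-up data: when the pattern already matches on line 1, the range 1,/yahoo/ does not end there, because sed only starts looking for the end of a range on the line after the one where the range began. The flag-based awk command does not have this limitation. The input and variable names below are hypothetical.

```shell
#!/bin/sh
# Hypothetical input in which the very first line already matches.
sample=$(mktemp)
printf '%s\n' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yandex.com' > "$sample"

# The range starts on line 1, and the search for /yahoo/ begins on line 2.
# No later line matches, so the range runs to the end and every line is deleted.
sed_out=$(sed '1,/yahoo/d' "$sample")

# The awk script tests every line, including line 1, so it still prints the rest.
awk_out=$(awk '/yahoo/ { y = 1; next } y' "$sample")

printf 'sed printed: [%s]\n' "$sed_out"
printf 'awk printed:\n%s\n' "$awk_out"

rm -f "$sample"
```

If the match can never be on the first line of your data, the sed one-liner is fine as it stands.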
Answered by zwol
This is much easier to do with sed than with grep. sed can apply any of its one-letter commands to an inclusive range of lines; the general syntax for this is
START , STOP COMMAND
except without any spaces. START and STOP can each be a number (meaning "line number N", starting from 1), a dollar sign (meaning "the end of the file"), or a regexp enclosed in slashes, meaning "the first line that matches this regexp". (The exact rules are slightly more complicated; the GNU sed manual has more detail.)
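As a rough illustration of those address forms, the sketch below applies the p command to a few ranges of the question's sample list. The temporary file and the variable names are illustrative only.

```shell
#!/bin/sh
# Recreate the sample URL list from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'http://www.google.com' \
  'http://www.bing.com' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yandex.com' > "$sample"

# Number to number: print lines 2 through 3.
by_number=$(sed -n '2,3p' "$sample")

# Number to dollar sign: print line 4 through the end of the file.
to_end=$(sed -n '4,$p' "$sample")

# Regexp to regexp: print from the first line matching bing
# through the next line matching baidu.
by_regexp=$(sed -n '/bing/,/baidu/p' "$sample")

printf '2,3p:\n%s\n\n' "$by_number"
printf '4,$p:\n%s\n\n' "$to_end"
printf '/bing/,/baidu/p:\n%s\n' "$by_regexp"

rm -f "$sample"
```

Both ends of each range are included in the output, which is what "inclusive" means here.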
So, you can do what you want like so:
sed -n -e '/http:\/\/www\.yahoo\.com/,$p' file1 > file2
The -n means "don't print anything unless specifically told to", and the -e directive means "from the first appearance of a line that matches the regexp /http:\/\/www\.yahoo\.com/ to the end of the file, print (p)."
This will include the line with http://www.yahoo.com/ on it in the output. If you want everything after that point but not that line itself, the easiest way to do that is to invert the operation:
sed -e '1,/http:\/\/www\.yahoo\.com/d' file1 > file2
which means "for line 1 through the first line matching the regexp /http:\/\/www\.yahoo\.com/, delete (d) the line" (and then, implicitly, print everything else; note that -n is not used this time).
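The escaped slashes make these commands hard to read. As an optional variation that is not part of the original answer, sed also accepts an address of the form \cREGEXPc, where c is a delimiter of your choice, so a URL can be written without escaping each slash. The sketch below uses | as the delimiter; the temporary file and variable names are illustrative only.

```shell
#!/bin/sh
# Recreate the sample URL list from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'http://www.google.com' \
  'http://www.bing.com' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yandex.com' > "$sample"

# Print from the matching line to the end of the file (match included).
with_match=$(sed -n -e '\|http://www\.yahoo\.com|,$p' "$sample")

# Delete line 1 through the matching line, printing the rest (match excluded).
without_match=$(sed -e '1,\|http://www\.yahoo\.com|d' "$sample")

printf 'including the match:\n%s\n\n' "$with_match"
printf 'excluding the match:\n%s\n' "$without_match"

rm -f "$sample"
```

The dots are still escaped because an unescaped dot in a regexp matches any character.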
Answered by Steven Penny
awk '/yahoo/ ? c++ : c' file1
Or golfed:
awk '/yahoo/?c++:c' file1
Result
http://www.baidu.com
http://www.yandex.com
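The answer gives no explanation, so here is one reading of it, offered as a sketch. The whole pattern is a conditional expression. On a line that matches yahoo it evaluates c++, which yields the old value of c (0, and therefore false, on the first match) and then increments it. On every other line it evaluates c, which is non-zero only after the first match. One consequence is that a second yahoo line is printed, whereas the flag-and-next awk script skips every matching line. The input below is made up to show that difference.

```shell
#!/bin/sh
# Hypothetical input in which the pattern occurs twice.
sample=$(mktemp)
printf '%s\n' \
  'http://www.google.com' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yahoo.com' \
  'http://www.yandex.com' > "$sample"

# Counter version: only the first matching line is suppressed.
counter_out=$(awk '/yahoo/ ? c++ : c' "$sample")

# Flag-and-next version: every matching line is suppressed.
flag_out=$(awk '/yahoo/ { y = 1; next } y' "$sample")

printf 'counter version:\n%s\n\n' "$counter_out"
printf 'flag version:\n%s\n' "$flag_out"

rm -f "$sample"
```

Which behavior is preferable depends on whether later occurrences of the pattern count as data or as further markers.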
Answered by tchrist
This is most easily done in Perl:
perl -ne 'print unless 1 .. m(http://www\.yahoo\.com)' file
In other words, print all lines that aren't between line 1 and the first occurrence of that pattern.
Answered by user1502952
Using a script:
# get the line number of the first line containing "yahoo"
index=$(grep -n "yahoo" filepath | head -n 1 | cut -d':' -f1)
# get the total number of lines in the file
totallines=$(wc -l < filepath)
# subtract the index from the total to get the number of lines after the match
result=$((totallines - index))
# print the matching line and every line after it
grep -A "$result" "yahoo" filepath
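Since the line number is already known at this point, a shorter variant is possible with tail, whose -n +K form starts output at line K. This is a sketch rather than part of the original answer: it recreates the question's sample list in a temporary file, and unlike the grep -A approach it prints only the lines after the match. The variable names sample, index and rest are illustrative.

```shell
#!/bin/sh
# Recreate the sample URL list from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'http://www.google.com' \
  'http://www.bing.com' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yandex.com' > "$sample"

# Line number of the first line containing "yahoo".
index=$(grep -n 'yahoo' "$sample" | head -n 1 | cut -d: -f1)

# Start printing at the line directly after the match.
rest=$(tail -n +"$((index + 1))" "$sample")
printf '%s\n' "$rest"

rm -f "$sample"
```

A real script would want to check that index is non-empty before using it, since grep finds nothing when the pattern is absent.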