bash grep +A: print everything after match

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/18166552/

Time: 2020-09-10 00:02:42  Source: igfitidea

grep +A: print everything after match

bash sed awk grep

Asked by B.Mr.W.

Hi, I have a file that contains a list of URLs, like the one below:

file1:

http://www.google.com
http://www.bing.com
http://www.yahoo.com
http://www.baidu.com
http://www.yandex.com
....

I want to get all the records after http://www.yahoo.com, so the result looks like this:

file2:

http://www.baidu.com
http://www.yandex.com
....

I know that I can use grep to find the line number of the line containing yahoo.com:

$ grep -n 'http://www.yahoo.com' file1
3:http://www.yahoo.com

But I don't know how to get the rest of the file after line 3. I also know that grep has a -A flag that prints the lines after a match. However, you need to specify how many lines you want after the match. I am wondering whether there is a way around that. Something like:

PSEUDO CODE:
$ grep -n 'http://www.yahoo.com' -A all file1 > file2 

I know we could combine the line number I got with wc -l to compute the number of lines after yahoo.com, but that feels pretty lame.

Looking forward to a handy and easy solution. Feel free to criticize me for overcomplicating the problem from the start; awk and sed commands are also welcome!

Answered by Hai Vu

Awk

If you don't mind using awk:

awk '/yahoo/{y=1;next}y' data.txt

This script has two parts:

/yahoo/ { y = 1; next }
y

The first part states that if we encounter a line containing yahoo, we set the variable y=1 and then skip that line (the next command jumps to the next input line, skipping any further processing of the current one). Without the next command, the yahoo line itself would be printed.

The second part is shorthand for:

y != 0 { print }

This means that, for each line, if the variable y is non-zero, we print that line. In awk, a variable is created the first time you refer to it, with a value of zero or the empty string depending on context. Before yahoo is encountered, y is 0, so the script prints nothing. After yahoo is encountered, y is 1, so every following line is printed.
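To make the two parts concrete, here is a small sketch that runs the long form of the script on a sample file. The temporary directory and the variable names after and from are only for illustration; the data mirrors file1 from the question.

```shell
# Build a small sample file (hypothetical data mirroring file1 in the question).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'http://www.google.com' \
  'http://www.bing.com' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yandex.com' > "$tmpdir/file1"

# Long form of the one-liner: set a flag on the matching line and skip it,
# then print every line for which the flag is non-zero.
after=$(awk '/yahoo/ { y = 1; next } y != 0 { print }' "$tmpdir/file1")
printf '%s\n' "$after"

# Without "next", the matching line itself is printed as well.
from=$(awk '/yahoo/ { y = 1 } y != 0 { print }' "$tmpdir/file1")
printf '%s\n' "$from"

rm -r "$tmpdir"
```

The first command prints only the baidu and yandex lines; the second also prints the yahoo line.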

Sed

Or, using sed, the following will delete everything up to and including the line with yahoo:

sed '1,/yahoo/d' data.txt 
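One caveat worth checking, sketched here under the assumption that the pattern might sit on the very first line: sed only starts looking for the end of a range on the line after the one where the range began, so 1,/yahoo/ never sees a match on line 1. The file name first and the variable names are hypothetical. The GNU sed manual documents a 0,/regexp/ address form for this case, but that is an extension and is not used here.

```shell
tmpdir=$(mktemp -d)
# Hypothetical file where the pattern is on the very first line.
printf '%s\n' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yandex.com' > "$tmpdir/first"

# The range starts on line 1, and the search for /yahoo/ as the end of the
# range only begins on line 2, so the range runs to the end of the file
# and every line is deleted.
sed_out=$(sed '1,/yahoo/d' "$tmpdir/first")
printf 'sed printed: [%s]\n' "$sed_out"

# The awk flag approach tests every line, including the first.
awk_out=$(awk '/yahoo/{y=1;next}y' "$tmpdir/first")
printf '%s\n' "$awk_out"

rm -r "$tmpdir"
```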

Answered by zwol

This is much easier to do with sed than with grep. sed can apply any of its one-letter commands to an inclusive range of lines; the general syntax for this is

START , STOP COMMAND

except without any spaces. START and STOP can each be a number (meaning "line number N", counting from 1), a dollar sign (meaning "the last line of the file"), or a regexp enclosed in slashes (meaning "the first line that matches this regexp"). (The exact rules are slightly more complicated; the GNU sed manual has more detail.)
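As a rough illustration of the three kinds of address, the sketch below applies the p and d commands to a five-line file. The file name nums and the variable names are made up for the example.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' one two three four five > "$tmpdir/nums"

# Number to number: print lines 2 through 4.
by_number=$(sed -n '2,4p' "$tmpdir/nums")
printf '%s\n' "$by_number"

# Regexp to end of file: print from the first line matching /three/ onward.
to_end=$(sed -n '/three/,$p' "$tmpdir/nums")
printf '%s\n' "$to_end"

# Number to regexp: delete line 1 through the first line matching /three/,
# and let sed print whatever is left.
deleted=$(sed '1,/three/d' "$tmpdir/nums")
printf '%s\n' "$deleted"

rm -r "$tmpdir"
```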

So, you can do what you want like so:

sed -n -e '/http:\/\/www\.yahoo\.com/,$p' file1 > file2

The -n means "don't print anything unless specifically told to", and the -e directive means "from the first appearance of a line that matches the regexp /http:\/\/www\.yahoo\.com/ to the end of the file, print."

This will include the line with http://www.yahoo.com on it in the output. If you want everything after that point but not that line itself, the easiest way to do that is to invert the operation:

sed -e '1,/http:\/\/www\.yahoo\.com/d' file1 > file2

which means "for line 1 through the first line matching the regexp /http:\/\/www\.yahoo\.com/, delete the line" (and then, implicitly, print everything else; note that -n is not used this time).
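The difference between the two commands can be seen side by side in this sketch, which recreates file1 from the question in a temporary directory. The variable names with_match and without_match are only illustrative.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' \
  'http://www.google.com' \
  'http://www.bing.com' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yandex.com' > "$tmpdir/file1"

# Print from the matching line to the end of the file (match included).
with_match=$(sed -n -e '/http:\/\/www\.yahoo\.com/,$p' "$tmpdir/file1")
printf '%s\n' "$with_match"

# Delete from line 1 through the matching line (match excluded from output).
without_match=$(sed -e '1,/http:\/\/www\.yahoo\.com/d' "$tmpdir/file1")
printf '%s\n' "$without_match"

rm -r "$tmpdir"
```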

Answered by Steven Penny

awk '/yahoo/ ? c++ : c' file1

Or golfed

awk '/yahoo/?c++:c' file1

Result

http://www.baidu.com
http://www.yandex.com
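The answer gives no explanation, so the following is one reading of it rather than the author's own words. The whole ternary is used as an awk pattern: on a line matching yahoo its value is c++, which yields the old value of c (0 on the first match, so that line is not printed) and then increments c; on any other line its value is c, so lines are printed once c is non-zero. One consequence, shown below on a hypothetical file with two yahoo lines, is that later yahoo lines are printed, unlike with the flag-and-next script.

```shell
tmpdir=$(mktemp -d)
# Hypothetical data with a second line that also contains "yahoo".
printf '%s\n' \
  'http://www.google.com' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yahoo.com/mail' \
  'http://www.yandex.com' > "$tmpdir/file1"

# Counter version: only the first matching line is suppressed.
counter_out=$(awk '/yahoo/ ? c++ : c' "$tmpdir/file1")
printf '%s\n' "$counter_out"

# Flag-and-next version: every matching line is skipped.
flag_out=$(awk '/yahoo/{y=1;next}y' "$tmpdir/file1")
printf '%s\n' "$flag_out"

rm -r "$tmpdir"
```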

Answered by tchrist

This is most easily done in Perl:

perl -ne 'print unless 1 .. m(http://www\.yahoo\.com)' file

In other words, print all lines that aren't between line 1 and the first occurrence of that pattern.

Answered by user1502952

Using a script:

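# get the line number of the first line containing "yahoo"
index=$(grep -n "yahoo" filepath | head -n 1 | cut -d':' -f1)
# get the total number of lines in the file
# (reading from stdin makes wc print only the count, without the file name)
totallines=$(wc -l < filepath)
# number of lines that follow the matching line
result=$((totallines - index))
# print the matching line and everything after it
grep -A "$result" "yahoo" filepath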
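
If the line number is already known, a possibly simpler route (a different technique from the grep -A script above, offered only as a sketch) is tail -n +K, which starts output at line K. The temporary file and the variable names index and rest are assumptions for the example, and there is no guard here for the case where the pattern is not found.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' \
  'http://www.google.com' \
  'http://www.bing.com' \
  'http://www.yahoo.com' \
  'http://www.baidu.com' \
  'http://www.yandex.com' > "$tmpdir/file1"

# Line number of the first line containing the pattern.
index=$(grep -n 'http://www.yahoo.com' "$tmpdir/file1" | head -n 1 | cut -d: -f1)

# tail -n +K starts at line K, so K = index + 1 skips the matching line itself.
rest=$(tail -n "+$((index + 1))" "$tmpdir/file1")
printf '%s\n' "$rest"

rm -r "$tmpdir"
```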