bash sed/awk - print text between patterns spanned across multiple lines

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/13023595/

Time: 2020-09-09 22:51:51  Source: igfitidea

bash sed awk

Asked by Amarnath Revanna

I am new to scripting and was trying to learn how to extract any text that exists between two different patterns. However, I am still not able to figure out how to extract text between two patterns in the following scenario:

If I have my input file reading:

Hi I would like
to print text
between these 
patterns

and my expected output is like:

I would like
to print text
between these 

i.e. my first search pattern is "Hi"; I want to skip the pattern itself but print everything that follows it on the same line. My second search pattern is "patterns", and I would like to completely avoid printing that line or any lines beyond it.

I tried the following:

sed -n '/Hi/,/patterns/p' test.txt 

[output]

Hi I would like
to print text
between these 
patterns 

Next, I tried:

awk ' /'"Hi"'/ {flag=1;next} /'"pattern"'/{flag=0} flag { print }' test.txt

[output]

to print text
between these

Can someone help me work out how to achieve this? Thanks in advance.

Accepted answer by paxdiablo

You have the right idea, a mini state machine in awk, but you need some slight mods as per the following transcript:

pax> echo 'Hi I would like
to print text
between these 
patterns ' | awk '
    /patterns/ { echo = 0 }
    /Hi /      { gsub("^.*Hi ", "", $0); echo = 1 }
               { if (echo == 1) { print } }'

Or, in compressed form:

awk '/patterns/{e=0}/Hi /{gsub("^.*Hi ","",$0);e=1}{if(e==1){print}}'

The output of that is:

I would like
to print text
between these

as requested.

The way this works is as follows. The echo variable is initially 0, meaning that no echoing will take place.

Each line is checked in turn. If it contains patterns, echoing is disabled.

If it contains Hi followed by a space, echoing is turned on and gsub is used to modify the line to get rid of everything up to and including the Hi and the space after it.

Then, regardless, the line (possibly modified) is echoed when the echo flag is on.

Now, there are going to be edge cases such as:

  • lines containing two occurrences of Hi; or
  • lines containing something before the patterns.

You haven't specified how they should be handled so I didn't bother, but the basic concept should be the same.
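
One possible way to handle both edge cases is sketched below. The chosen behaviour is an assumption, not something the question specifies: only the text up to the first "Hi " is dropped, any text that precedes "patterns" on the closing line is kept, and processing stops at the first closing line. The function name extract_between is hypothetical.

```shell
# Hypothetical helper: reads stdin, prints the text between "Hi " and "patterns".
extract_between() {
    awk '
        # Not echoing yet: look for the first "Hi " and keep what follows it.
        !echo {
            i = index($0, "Hi ")
            if (i == 0) next          # no start marker on this line, skip it
            $0 = substr($0, i + 3)    # drop everything through the first "Hi "
            echo = 1
        }
        # Echoing: stop at the first "patterns", keeping any text before it.
        {
            j = index($0, "patterns")
            if (j > 0) {
                if (j > 1) print substr($0, 1, j - 1)
                exit                  # assumption: nothing after this is wanted
            }
            print
        }
    '
}

# Sample input from the question, without its trailing spaces.
printf '%s\n' 'Hi I would like' 'to print text' 'between these' 'patterns' | extract_between
```

Using index() and substr() instead of a greedy regular expression is what lets a second Hi on the opening line survive.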

Answer by Guru

Updated the solution to remove the line "patterns":

$ sed -n '/^Hi/,/patterns/{s/^Hi //;/^patterns/d;p;}' file
I would like
to print text
between these
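
As a reading aid, here is a sketch of the range-based sed approach written one command per line with comments. It feeds the sample input through a pipe instead of a file; the variable name result is illustrative, and the trailing spaces of the original sample are left out.

```shell
result=$(printf '%s\n' 'Hi I would like' 'to print text' 'between these' 'patterns' |
sed -n '
    # Act only on the range from a line starting with "Hi"
    # through the next line containing "patterns".
    /^Hi/,/patterns/ {
        # Strip the leading "Hi " from the opening line.
        s/^Hi //
        # Drop the closing line when it starts with "patterns".
        /^patterns/d
        # -n turns off automatic printing, so print explicitly.
        p
    }
')

printf '%s\n' "$result"
```

Note that the range opens only when Hi is at the start of a line, and the closing line is dropped only when it starts with patterns.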

Answer by potong

This might work for you (GNU sed):

sed '/Hi /!d;s//\n/;s/.*\n//;ta;:a;s/patterns.*$//;tb;$!{n;ba};:b;/^$/d' file
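
A sketch of the same GNU sed script, spelled out one command per line with comments, may make the control flow easier to follow. It assumes GNU sed (for \n in the replacement and for indented labels and comments); the wrapper function name between_sed is hypothetical, and it reads standard input rather than a file.

```shell
# Hypothetical wrapper around the GNU sed script; reads stdin.
between_sed() {
    sed '
        # Delete every line until one containing "Hi " appears.
        /Hi /!d
        # The empty regex reuses /Hi /: swap the first "Hi " for a newline,
        # then delete everything up to and including that newline.
        s//\n/
        s/.*\n//
        # "ta" clears the substitution flag, so the "tb" below reacts
        # only to the "patterns" substitution.
        ta
        :a
        # Cut the line at "patterns"; if that worked, jump to label b.
        s/patterns.*$//
        tb
        # Otherwise, unless this is the last line, print it,
        # fetch the next line, and loop.
        $!{
            n
            ba
        }
        :b
        # If the cut left nothing, suppress the empty line.
        /^$/d
    '
}

printf '%s\n' 'Hi I would like' 'to print text' 'between these' 'patterns' | between_sed
```

Unlike the range-based version, this one keeps any text that appears before patterns on the closing line.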

Answer by Ed Morton

Just set a flag (f) when you find+replace Hi at the start of a line, clear it when you find patterns, then invoke the default print when the flag is set:

$ awk 'sub(/^Hi /,""){f=1} /patterns/{f=0} f'  file
I would like
to print text
between these
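
To try the flag-based awk approach without creating a file by hand, a minimal sketch follows. It pipes the sample input from the question into the command; the variable names input and result are illustrative only, and the trailing spaces of the original sample are left out.

```shell
# Sample input from the question (variable name is illustrative).
input='Hi I would like
to print text
between these
patterns'

# sub() returns the number of substitutions it made, so the first rule is
# true only on a line that starts with "Hi "; that line is edited and f is set.
# A line matching "patterns" clears f, and the bare pattern "f" prints the
# current line whenever f is non-zero.
result=$(printf '%s\n' "$input" | awk 'sub(/^Hi /,""){f=1} /patterns/{f=0} f')

printf '%s\n' "$result"
```

Because the flag is cleared before the final `f` test runs, the closing line is never printed.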