bash awk: search on multiple fields of a multi-line record file

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3438762/

Time: 2020-09-17 22:27:28  Source: igfitidea

Tags: bash, shell, awk

Asked by adaptive

I have a file with records that are of the form:

SMS-MT-FSM-DEL-REP
country: IN
1280363645.979354_PFS_1_1887728354

SMS-MT-FSM-DEL-REP
country: IN
1280363645.729309_PFS_1_1084296392

SMS-MO-FSM
country: IR
1280105721.484103_PFM_1_1187616097

SMS-MO-FSM
country: MO
1280105721.461090_PFM_1_882824215

This lends itself to parsing via awk using something like: awk 'BEGIN { FS="\n"; RS="" } /country:.*MO/ {print $0}'
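
To make the effect of these two settings concrete, here is a minimal sketch that prints the field count and first field of each record. The file name records.txt is an assumption made for illustration; it simply holds the sample records shown above.

```shell
# Write the sample records from the question to a scratch file
# (records.txt is a hypothetical name, not part of the original question).
cat > records.txt <<'EOF'
SMS-MT-FSM-DEL-REP
country: IN
1280363645.979354_PFS_1_1887728354

SMS-MT-FSM-DEL-REP
country: IN
1280363645.729309_PFS_1_1084296392

SMS-MO-FSM
country: IR
1280105721.484103_PFM_1_1187616097

SMS-MO-FSM
country: MO
1280105721.461090_PFM_1_882824215
EOF

# RS="" selects paragraph mode: blank lines separate records.
# FS="\n" makes every line of a record its own field.
fields=$(awk 'BEGIN { FS = "\n"; RS = "" } { print NR ": NF=" NF ", $1=" $1 }' records.txt)
printf '%s\n' "$fields"
```

Each of the four records should report NF=3, with $1 holding the record's first line.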

My question is: how do I use awk to search the records on two separate fields? For example, I only want to print records whose country is MO and whose first line is SMS-MO-FSM.

Accepted answer by ghostdog74

If you have set FS="\n" and RS="", then the first field $1 is the record's first line (SMS-MO-FSM in your example) and $2 is the country line. Therefore your awk code is:

awk 'BEGIN{FS="\n"; RS=""} $2~/country.*MO/ && $1~/SMS-MO-FSM/ ' file
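
As a rough check of the two-field condition, the sketch below runs it against a cut-down copy of the sample data. The file name records.txt and the variable name matched are assumptions for illustration only.

```shell
# Cut-down copy of the question's sample (records.txt is a hypothetical name).
cat > records.txt <<'EOF'
SMS-MT-FSM-DEL-REP
country: IN
1280363645.979354_PFS_1_1887728354

SMS-MO-FSM
country: IR
1280105721.484103_PFM_1_1187616097

SMS-MO-FSM
country: MO
1280105721.461090_PFM_1_882824215
EOF

# $1 is the record's first line, $2 its country line.
# A pattern without an action prints every record that matches.
matched=$(awk 'BEGIN { FS = "\n"; RS = "" } $2 ~ /country.*MO/ && $1 ~ /SMS-MO-FSM/' records.txt)
printf '%s\n' "$matched"
```

Only the last record should be printed: the second one has the right first line but a country of IR, so the $2 test rejects it.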

Answer by schot

(I post this as a separate answer instead of a comment reply for better formatting)

Concerning your second remark about printing a record on a single line: as long as you don't modify the record, OFS has no effect, because print writes $0 exactly as it was read and only appends ORS after it. Only when you assign to one of the fields does awk rebuild $0 as $1 OFS $2 OFS ... $NF. You can force this reconstruction like this:

BEGIN {
    FS  = "\n"
    RS  = ""
    OFS = ";"     # Or another delimiter that does not appear in your data
    ORS = "\n"
}
$2 ~ /^[ \t]*country:[ \t]*MO[ \t]*$/ && $1 ~ /^[ \t]*SMS-MO-FSM[ \t]*$/ {
    $1 = $1 ""    # This forces the reconstruction
    print
}
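
For a quick trial, the same logic can be run inline from the shell with the field references $1 and $2 written out in full. This is only a sketch: the file name records.txt and the variable name joined are assumptions for illustration.

```shell
# Two records from the question's sample (records.txt is a hypothetical name).
cat > records.txt <<'EOF'
SMS-MO-FSM
country: IR
1280105721.484103_PFM_1_1187616097

SMS-MO-FSM
country: MO
1280105721.461090_PFM_1_882824215
EOF

joined=$(awk '
BEGIN { FS = "\n"; RS = ""; OFS = ";"; ORS = "\n" }
# Both regexes are anchored, so each field must match as a whole line.
$2 ~ /^[ \t]*country:[ \t]*MO[ \t]*$/ && $1 ~ /^[ \t]*SMS-MO-FSM[ \t]*$/ {
    $1 = $1 ""   # assigning to a field makes awk rebuild $0 using OFS
    print
}' records.txt)
printf '%s\n' "$joined"
# prints: SMS-MO-FSM;country: MO;1280105721.461090_PFM_1_882824215
```

Without the assignment to $1, the matching record would still be printed on three lines, since $0 would keep its original newlines.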