Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/14742451/

Date: 2020-09-18 04:30:16  Source: igfitidea

Can I use a regex as the record separator in awk on Linux?

Tags: linux, bash, awk

Asked by user2024264

I have a test file like this:

fdsf fdsf fdsfds fdsf
fdsfdsfsdf fdsfsf
fsdfsdf var12=1343243432

fdsf fdsf fdsfds fdsf
fdsfsdfdsfsdf
fsdfsdf var12=13432434432

fdsf fdsf fdsfds fdsf
fsdfsdf fdsfsf var12=13443432432

Now I want to use var12=\d+ as the record separator. Is this possible in awk?

Answered by Steve

Yes; however, you should use [0-9] instead of \d:

awk '1' RS="var12=[0-9]+" file

IIRC, multi-character record separators are an extension that POSIX awk does not guarantee; GNU awk supports them.

Results:

fdsf fdsf fdsfds fdsf
fdsfdsfsdf fdsfsf
fsdfsdf 


fdsf fdsf fdsfds fdsf
fdsfsdfdsfsdf
fsdfsdf 


fdsf fdsf fdsfds fdsf
fsdfsdf fdsfsf 
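
To make the record boundaries easier to see, here is a minimal sketch that numbers each record and brackets its contents. The sample input is a shortened, made-up stand-in for the test file, and the variable name `numbered` is only for illustration. It assumes an awk that accepts a regex RS, such as gawk.

```shell
# Hypothetical sample input (no trailing newline after the last separator).
# Each record is printed as "N: [text]"; gsub flattens embedded newlines for display.
numbered=$(printf 'a b\nc var12=111\n\nd e\nf var12=2222' |
  awk -v RS='var12=[0-9]+' '{ gsub(/\n/, " "); printf "%d: [%s]\n", NR, $0 }')

printf '%s\n' "$numbered"
```

The separator text itself is consumed, so each record ends just before its var12=... token, and the blank lines that follow a separator become the start of the next record.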

Please post your desired output if you need further assistance.

Answered by Johnsyweb

Assuming GNU awk (a.k.a. gawk) on Linux, yes.

RS

This is awk's input record separator. Its default value is a string containing a single newline character, which means that an input record consists of a single line of text. It can also be the null string, in which case records are separated by runs of blank lines. If it is a regexp, records are separated by matches of the regexp in the input text.

Source: 7.5.1 Built-in Variables That Control awk, The GNU Awk User's Guide.
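
The three RS modes described in the quoted passage can be compared side by side. This is a rough sketch on made-up sample data; the variable names (`data`, `lines`, `paragraphs`, `chunks`) are assumptions for illustration, and the regex case relies on an awk that supports a multi-character RS, such as gawk.

```shell
# Hypothetical sample: two paragraphs separated by a blank line.
data='a b
c var12=111

d e
f var12=2222
'

# Default RS (a single newline): every line, including the blank one, is a record.
lines=$(printf '%s' "$data" | awk 'END { print NR }')

# RS set to the null string: records are separated by runs of blank lines.
paragraphs=$(printf '%s' "$data" | awk -v RS='' 'END { print NR }')

# RS set to a regexp: records are separated by matches of var12=<digits>.
# Only records with at least one field are counted, so a trailing
# newline-only remainder is ignored.
chunks=$(printf '%s' "$data" | awk -v RS='var12=[0-9]+' 'NF { n++ } END { print n + 0 }')

echo "lines=$lines paragraphs=$paragraphs chunks=$chunks"   # lines=5 paragraphs=2 chunks=2
```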

As @steve says, \d is not in the list of Regular Expression Operators or gawk-Specific Regexp Operators, so you need to use a bracket expression such as [0-9] or [[:digit:]] in place of your \d.
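
As a sketch of the bracket-expression form, the following uses [[:digit:]] in RS. It also reads gawk's RT variable, which holds the text that matched RS for the current record, in case the value after var12= is what you are really after. That goal is only a guess, the sample input is made up, and the names `input`, `values` and `v` are illustrative. Since RT is gawk-specific, the example is skipped when gawk is not installed.

```shell
# Hypothetical sample input (no trailing newline after the last separator).
input='a b
c var12=111

d e
f var12=2222'

if command -v gawk >/dev/null 2>&1; then
  # For each record ended by a separator match, copy RT, strip the
  # "var12=" prefix from the copy, and print the remaining digits.
  values=$(printf '%s' "$input" |
    gawk -v RS='var12=[[:digit:]]+' 'RT != "" { v = RT; sub(/^var12=/, "", v); print v }')
  printf '%s\n' "$values"
fi
```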

However, it's not clear from your question what your intention is here. I've answered your question, but I doubt I've solved your underlying problem. See also "What is the XY problem?"
