Original source: http://stackoverflow.com/questions/14742451/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Can I use a regex in the record separator in awk on Linux?
Asked by user2024264
I have a test file like this:
fdsf fdsf fdsfds fdsf
fdsfdsfsdf fdsfsf
fsdfsdf var12=1343243432
fdsf fdsf fdsfds fdsf
fdsfsdfdsfsdf
fsdfsdf var12=13432434432
fdsf fdsf fdsfds fdsf
fsdfsdf fdsfsf var12=13443432432
Now I want to use var12=\d+ as the record separator. Is this possible in awk?
Answered by Steve
Yes, but you should use [0-9] instead of \d:
awk '1' RS="var12=[0-9]+" file
IIRC, only GNU awk is guaranteed to support multi-character record separators; POSIX leaves the behavior unspecified when RS is longer than one character.
Results:
fdsf fdsf fdsfds fdsf
fdsfdsfsdf fdsfsf
fsdfsdf
fdsf fdsf fdsfds fdsf
fdsfsdfdsfsdf
fsdfsdf
fdsf fdsf fdsfds fdsf
fsdfsdf fdsfsf
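To make the record boundaries easier to see, here is a small sketch that recreates the test file from the question and prints each record on one line with its record number. The temporary file, the `records` variable, and the `NF` filter (which skips any whitespace-only record left after the last separator) are my additions, not part of the original answer. It prefers gawk and falls back to the system awk, where regex support in RS is an assumption.

```shell
# Prefer gawk; fall back to the system awk (regex RS support is then an assumption).
AWK=$(command -v gawk || command -v awk)

# Recreate the test file from the question in a temporary location.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
fdsf fdsf fdsfds fdsf
fdsfdsfsdf fdsfsf
fsdfsdf var12=1343243432
fdsf fdsf fdsfds fdsf
fdsfsdfdsfsdf
fsdfsdf var12=13432434432
fdsf fdsf fdsfds fdsf
fsdfsdf fdsfsf var12=13443432432
EOF

# Split on the regex separator. NF skips records that contain only whitespace.
# $1 = $1 rebuilds the record with single spaces, so each record prints on one line.
records=$("$AWK" 'NF { $1 = $1; print NR ": " $0 }' RS='var12=[0-9]+' "$tmp")
printf '%s\n' "$records"
rm -f "$tmp"
```

Each numbered line corresponds to one record, that is, the text between two matches of the separator.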
Please post your desired output if you need further assistance.
Answered by Johnsyweb
Assuming GNU awk (a.k.a. gawk) on Linux, yes.
RS: This is awk's input record separator. Its default value is a string containing a single newline character, which means that an input record consists of a single line of text. It can also be the null string, in which case records are separated by runs of blank lines. If it is a regexp, records are separated by matches of the regexp in the input text.
Source: 7.5.1 Built-in Variables That Control awk, The GNU Awk User's Guide.
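The three modes described in that passage can be compared with a quick sketch like the one below. The sample inputs and variable names are made up for illustration. The regexp case assumes an awk that accepts a regex in RS (gawk does), and its input deliberately has no trailing newline so that the record count does not depend on how the final newline is handled.

```shell
# Prefer gawk; fall back to the system awk (regex RS support is then an assumption).
AWK=$(command -v gawk || command -v awk)

# Default RS (a single newline): every line, including the empty one, is a record.
by_line=$(printf 'a\nb\n\nc\n' | "$AWK" 'END { print NR }')

# RS set to the null string: records are separated by runs of blank lines.
by_paragraph=$(printf 'a\nb\n\n\nc\n' | "$AWK" -v RS= 'END { print NR }')

# RS set to a regexp: records are separated by each match of the regexp.
by_regexp=$(printf 'a1b22c333d' | "$AWK" -v RS='[0-9]+' 'END { print NR }')

printf 'newline RS:  %s records\n' "$by_line"
printf 'null RS:     %s records\n' "$by_paragraph"
printf 'regexp RS:   %s records\n' "$by_regexp"
```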
As @steve says, \d is not in the list of Regular Expression Operators or gawk-Specific Regexp Operators, so you need to use a bracket expression such as [0-9] or [[:digit:]] in place of your \d.
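As a rough check of that point, the sketch below counts the records produced by each spelling of the separator. The input string and the `count_records` helper are hypothetical, invented for this illustration. The two bracket expressions should give the same count, while the `\d` version is expected to behave differently because awk does not treat `\d` as a digit class; what exactly it does with `\d` may vary between awk implementations, so no result is claimed for it here.

```shell
# Prefer gawk; fall back to the system awk (regex RS support is then an assumption).
AWK=$(command -v gawk || command -v awk)

# Hypothetical one-line input with two separators and no trailing newline.
input='x var12=12 y var12=345 z'

# Count the records for a given RS regex. Warnings about unknown
# escape sequences (gawk prints one for \d) are discarded.
count_records() {
    printf '%s' "$input" | "$AWK" -v RS="$1" 'END { print NR }' 2>/dev/null
}

range=$(count_records 'var12=[0-9]+')
class=$(count_records 'var12=[[:digit:]]+')
# -v processes backslash escapes, so \\ passes a single backslash to the regex.
perl_style=$(count_records 'var12=\\d+')

printf '[0-9]:       %s records\n' "$range"
printf '[[:digit:]]: %s records\n' "$class"
printf '\\d:          %s records\n' "$perl_style"
```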
However, it's not clear from your question what your intention is. I've answered your question, but I doubt I've solved your underlying problem. See also What is the XY problem?

