Original question: http://stackoverflow.com/questions/19154996/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not to this page): Stack Overflow
awk - split only by first occurrence
Asked by Udy
I have a line like:
one:two:three:four:five:six seven:eight
and I want to use awk to get $1 to be one and $2 to be two:three:four:five:six seven:eight
I know I can get it by running sed first, that is, changing the first occurrence of : with sed and then running awk with the new delimiter.
However, replacing the delimiter with a new one would not help me, since I cannot guarantee that the new delimiter does not already appear somewhere in the text.
I want to know if there is an option to get awk to behave this way.
So something like:
awk -F: '{print $1,$2}'
will print:
one two:three:four:five:six seven:eight
I will also want to do some manipulation on $1 and $2, so I don't want to just substitute the first occurrence of :.
Answered by Adrian
Without any substitutions:
echo "one:two:three:four:five" | awk -F: '{ st = index($0,":"); print $1 " " substr($0,st+1) }'
The index function finds the first occurrence of ":" in the whole string, so in this case the variable st is set to 4. The substr function then grabs the rest of the string starting from position st+1; if no length is supplied, it runs to the end of the string. The output is:
one two:three:four:five
If you want to do further processing, you can always store the string in a variable:
rem = substr($0,st+1)
Note: this was tested on Solaris AWK, but I can't see any reason why it shouldn't work on other flavours.
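As a sketch of the further processing mentioned above, the two pieces can be held in variables and handled separately. The names head and rest, and the particular manipulations (uppercasing the first piece, counting the colons left in the second), are illustrative choices and not part of the original answer.

```shell
line='one:two:three:four:five:six seven:eight'

result=$(printf '%s\n' "$line" | awk '{
    st = index($0, ":")            # position of the first ":" (0 if none)
    head = substr($0, 1, st - 1)   # text before the first ":"
    rest = substr($0, st + 1)      # everything after it, later colons intact
    n = gsub(/:/, ":", rest)       # count the colons left in rest without changing it
    print toupper(head) " | " rest " | " n
}')

printf '%s\n' "$result"
```

Nothing here depends on the field separator, so spaces or other punctuation in the line do not affect where the split happens.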
Answered by Jotne
Something like this?
echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1'
one two:three:four:five:six
This replaces the first : with a space. You can then get the parts into $1 and $2:
echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' | awk '{print $1,$2}'
one two:three:four:five:six
Or in the same awk, so even with substitution you get $1 and $2 the way you like:
echo "one:two:three:four:five:six" | awk '{sub(/:/," ");$1=$1;print $1,$2}'
one two:three:four:five:six
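One thing to watch with the space-based variants: once the line is split on whitespace, any space already in the text also becomes a field boundary. A small check, assuming the sample line from the question (which contains a space), shows the tail landing in $3 rather than $2:

```shell
line='one:two:three:four:five:six seven:eight'

fields=$(printf '%s\n' "$line" | awk '{
    sub(/:/, " ")                  # changing $0 makes awk re-split it on whitespace
    print NF; print $2; print $3   # field count, then the second and third fields
}')

printf '%s\n' "$fields"
```

This is why a separator that cannot collide with the text matters for this approach.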
EDIT: Using a different separator, you can get the first part, one, as field $1 and the rest in $2, like this:
echo "one:two:three:four:five:six seven:eight" | awk -F\| '{sub(/:/,"|");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
Unique separator:
echo "one:two:three:four:five:six seven:eight" | awk -F"#;#." '{sub(/:/,"#;#.");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
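Since the asker cannot guarantee that a replacement separator is absent from the text, a guard can be placed in front of the substitution. The sketch below reuses the #;#. separator from this answer, writes the dot as [.] so it is matched literally, and simply reports lines that already contain the separator. The second input line and the choice to skip such lines are assumptions made for illustration.

```shell
# Two sample lines: the second one already contains the separator text.
out=$(printf '%s\n' 'one:two:three:four:five:six seven:eight' 'a#;#.b:c' |
awk -F'#;#[.]' '
    # Guard: if the separator text is already in the line, report it and skip.
    /#;#[.]/ { print "separator already present: " $0; next }
    # Otherwise replace only the first ":"; changing $0 re-splits it on FS.
    { sub(/:/, "#;#."); print "$1=" $1; print "$2=" $2 }
')

printf '%s\n' "$out"
```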
Answered by Chris Seymour
The closest you can get is with GNU awk's FPAT:
$ awk '{print $1}' FPAT='(^[^:]+)|(:.*)' file
one
$ awk '{print $2}' FPAT='(^[^:]+)|(:.*)' file
:two:three:four:five:six seven:eight
However, $2 will include the leading delimiter; you can use substr to fix that:
$ awk '{print substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
two:three:four:five:six seven:eight
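So putting it all together:
$ awk '{print $1, substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight
Storing the result of substr back in $2 allows further processing on $2 without the leading delimiter:
$ awk '{$2=substr($2,2); print $1,$2}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight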
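A solution that should work with mawk 1.3.3:
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1}' FS='\0'
one
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $2}' FS='\0'
two:three:four five:six:seven
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1,$2}' FS='\0'
one two:three:four five:six:seven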
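One edge case the index/substr approach leaves open is a line with no : at all: index returns 0, so the first piece comes out empty and the second holds the whole line. A hedged sketch of one way to handle it follows. The function name split_first is hypothetical, and keeping the whole line as the first piece is an assumed policy that none of the answers specify.

```shell
split_first() {
    # Prints the part before the first ":" and the part after it on separate lines.
    awk '{
        n = index($0, ":")
        if (n == 0) { head = $0; rest = "" }   # no ":" found, keep the line whole
        else { head = substr($0, 1, n - 1); rest = substr($0, n + 1) }
        print head
        print rest
    }'
}

with_colon=$(printf '%s\n' 'one:two:three:four five:six:seven' | split_first)
without_colon=$(printf '%s\n' 'no delimiter here' | split_first)

printf '%s\n' "$with_colon" "$without_colon"
```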

