How to split a delimited string into an array in awk?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/8009664/
Asked by Mohamed Saligh
How do I split a string that contains pipe symbols (|)? I want to split it into an array.
I tried
echo "12:23:11" | awk '{split($0,a,":"); print a[3] a[2] a[1]}'
This works fine. If my string is like "12|23|11", how do I split it into an array?
Answered by Calin Paul Alexandru
Have you tried:
echo "12|23|11" | awk '{split($0,a,"|"); print a[3],a[2],a[1]}'
Answered by fedorqui 'SO stop harming'
To split a string into an array in awk, we use the function split():
awk '{split($0, a, ":")}'
#           ^^  ^  ^^^
#            |  |   |
#       string  |   delimiter
#               |
#               array to store the pieces
If no separator is given, it uses FS, which defaults to a single space:
awk '{split($0, a); print a[2]}' <<< "a:b c:d e"
c:d
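The default behaviour is worth seeing next to an explicit separator. The sketch below is only an illustration: `split_default` and `split_every_space` are hypothetical helper names, and the input string is made up. With no third argument, runs of blanks count as one separator and leading blanks are ignored. With the regexp `[ ]`, every single space delimits a piece, so empty pieces appear.

```shell
# split_default and split_every_space are hypothetical helper names.
split_default() {
  # No third argument: split() falls back to FS (a single space by default),
  # so runs of blanks act as one separator and leading blanks are ignored.
  awk -v s="$1" 'BEGIN { n = split(s, a); print n }'
}

split_every_space() {
  # "[ ]" is a regexp that matches exactly one space, so each space delimits.
  awk -v s="$1" 'BEGIN { n = split(s, a, "[ ]"); print n }'
}

split_default ' a  b'       # prints 2  (pieces: "a", "b")
split_every_space ' a  b'   # prints 4  (pieces: "", "a", "", "b")
```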
We can give a separator, for example ":":
awk '{split($0, a, ":"); print a[2]}' <<< "a:b c:d e"
b c
This is equivalent to setting it through FS:
awk -F: '{split($0, a); print a[2]}' <<< "a:b c:d e"
b c
In gawk you can also provide the separator as a regexp:
awk '{split($0, a, ":*"); print a[2]}' <<< "a:::b c::d e" #note multiple :
b c
You can even see what the delimiter was at every step by using its fourth parameter:
awk '{split($0, a, ":*", sep); print a[2]; print sep[1]}' <<< "a:::b c::d e"
b c
:::
Let's quote the man page of GNU awk:
split(string, array [, fieldsep [, seps ] ])
Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension, with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space, then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n], where n is the return value of split() (i.e., the number of elements in array).
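Two points from the quoted text can be sketched together: the return value of split() gives the number of pieces, and a separator longer than one character is treated as a regexp. This is a minimal illustration, not part of the original answer. `reverse_join` is a hypothetical helper name and the two-character delimiter `||` is an assumption chosen for the example. It avoids the gawk-only fourth argument, so it should run on other awk implementations too.

```shell
# reverse_join is a hypothetical helper; "||" is an assumed delimiter.
reverse_join() {
  awk -v s="$1" 'BEGIN {
    # "[|][|]" is a regexp matching the literal two-character delimiter "||".
    # Bracket expressions avoid relying on backslash escapes inside the string.
    n = split(s, parts, "[|][|]")
    out = ""
    for (i = n; i >= 1; i--)   # n is the number of pieces split() created
      out = out parts[i]
    print out
  }'
}

reverse_join '12||23||11'   # prints 112312
```

Because the loop runs from n down to 1, the same function works for any number of pieces rather than a fixed three.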
Answered by Dimitre Radoulov
Please be more specific! What do you mean by "it doesn't work"? Post the exact output (or error message), your OS and awk version:
% awk -F\| '{
  for (i = 0; ++i <= NF;)
    print i, $i
  }' <<<'12|23|11'
1 12
2 23
3 11
Or, using split:
% awk '{
  n = split($0, t, "|")
  for (i = 0; ++i <= n;)
    print i, t[i]
  }' <<<'12|23|11'
1 12
2 23
3 11
Edit: on Solaris you'll need to use the POSIX awk (/usr/xpg4/bin/awk) in order to process 4000 fields correctly.
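split() is not limited to the whole record. As a hedged sketch, suppose each input line holds an id followed by a pipe-delimited list. That layout and the helper name `summarize_lists` are assumptions made for illustration only. Splitting just the second field leaves the normal whitespace field splitting untouched:

```shell
# summarize_lists is a hypothetical helper; the "id list" layout is assumed.
summarize_lists() {
  awk '{
    # Split only the second whitespace-separated field, not the whole record.
    # split() clears the array first, so pieces from earlier lines do not linger.
    n = split($2, parts, "|")
    printf "%s: %d items, last is %s\n", $1, n, parts[n]
  }'
}

printf '%s\n' 'x 12|23|11' 'y 7|8' | summarize_lists
# x: 3 items, last is 11
# y: 2 items, last is 8
```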
Answered by TrueY
I do not like the echo "..." | awk ... solution, as it makes unnecessary fork and exec system calls.
I prefer Dimitre's solution with a little twist:
awk -F\| '{print $3 $2 $1}' <<<'12|23|11'
Or a slightly shorter version:
awk -F\| '$0=$3 $2 $1' <<<'12|23|11'
In this case the assignment puts the output record together and evaluates to a true condition, so the record gets printed.
In this specific case the stdin redirection can be avoided by setting an awk internal variable:
awk -v T='12|23|11' 'BEGIN{split(T,a,"|");print a[3] a[2] a[1]}'
I used ksh for quite a while, but in bash this can be handled with internal string manipulation. In the first case the original string is split at the internal separator. In the second case it is assumed that the string always contains digit pairs separated by a one-character separator.
T='12|23|11';echo -n ${T##*|};T=${T%|*};echo ${T#*|}${T%|*}
T='12|23|11';echo ${T:6}${T:3:2}${T:0:2}
The result in all cases is
112312
Answered by Sven
echo "12|23|11" | awk 'BEGIN {FS="|";} { print $1, $2, $3 }'
Answered by Schildmeijer
echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
Answered by codaddict
echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
should work.
Answered by duedl0r
Joke? :)
How about echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
This is my output:
p2> echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
112312
so I guess it's working after all..
Answered by Qorbani
I know this is kind of an old question, but I thought someone might like my trick, especially since this solution is not limited to a specific number of items.
# Convert to an array
_ITEMS=($(echo "12|23|11" | tr '|' '\n'))
# Output array items
for _ITEM in "${_ITEMS[@]}"; do
  echo "Item: ${_ITEM}"
done
The output will be:
Item: 12
Item: 23
Item: 11
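A variation on the same idea, offered only as a sketch: the shell can split on IFS directly, without tr or a command substitution. `split_items` is a hypothetical helper name. The version below sticks to POSIX sh features (positional parameters instead of a bash array) and turns off pathname expansion, so an item such as `*` is not expanded into file names.

```shell
# split_items is a hypothetical helper name.
split_items() {
  (                      # subshell: the IFS and set -f changes do not leak out
    IFS='|'
    set -f               # no pathname expansion on the unquoted expansion below
    set -- $1            # the shell splits $1 on IFS into positional parameters
    for item in "$@"; do
      printf 'Item: %s\n' "$item"
    done
  )
}

split_items '12|23|11'
# Item: 12
# Item: 23
# Item: 11
```

Items that contain spaces stay intact here, because only `|` is in IFS while the string is being split.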