
Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/8009664/

Time: 2020-09-09 01:15:34  Source: igfitidea

How to split a delimited string into an array in awk?

Tags: string, unix, awk, split

Asked by Mohamed Saligh

How do I split a string that contains pipe symbols (|)? I want to split it into an array.

I tried

echo "12:23:11" | awk '{split($0,a,":"); print a[3] a[2] a[1]}'

This works fine. If my string is like "12|23|11", how do I split it into an array?

Answered by Calin Paul Alexandru

Have you tried:

echo "12|23|11" | awk '{split($0,a,"|"); print a[3],a[2],a[1]}'

Answered by fedorqui 'SO stop harming'

To split a string into an array in awk, we use the function split():

awk '{split($0, a, ":")}'
#           ^^  ^  ^^^
#            |  |   |
#       string  |   delimiter
#               |
#               array to store the pieces


If no separator is given, it uses FS, which defaults to a space:

$ awk '{split($0, a); print a[2]}' <<< "a:b c:d e"
c:d
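As a small illustration of the default behaviour, the sketch below assumes a POSIX awk and uses made-up names (count, first). With the default FS, runs of blanks act as a single separator and leading or trailing blanks are ignored, so the padded string yields two pieces.

```shell
# Default FS (a single space) splits on runs of whitespace and
# ignores leading/trailing whitespace.
# "count" and "first" are illustrative names, not from the original answer.
count=$(awk 'BEGIN { print split("  a   b  ", parts) }')
first=$(awk 'BEGIN { split("  a   b  ", parts); print parts[1] }')
echo "pieces: $count, first piece: $first"
```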

We can give a separator, for example a colon (:):

$ awk '{split($0, a, ":"); print a[2]}' <<< "a:b c:d e"
b c

Which is equivalent to setting it through FS:

$ awk -F: '{split($0, a); print a[2]}' <<< "a:b c:d e"
b c

In gawk you can also provide the separator as a regexp:

$ awk '{split($0, a, ":*"); print a[2]}' <<< "a:::b c::d e" #note multiple :
b c

You can even see what the delimiter was at every step by using its fourth parameter:

$ awk '{split($0, a, ":*", sep); print a[2]; print sep[1]}' <<< "a:::b c::d e"
b c
:::
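The fourth parameter is a gawk extension, so other awk implementations may reject it. A rough portable alternative, offered only as a sketch, is to walk the string with match() and collect each separator from RSTART and RLENGTH. The variable names (seps, out) are illustrative, and the regexp is :+ rather than :* so that match() never succeeds on an empty string.

```shell
# Collect the separators between pieces without gawk's fourth split() argument.
# Assumes a POSIX awk; "seps" and "out" are hypothetical names for this sketch.
seps=$(awk -v s='a:::b c::d e' 'BEGIN {
    out = ""
    while (match(s, /:+/)) {                    # next run of colons
        sep = substr(s, RSTART, RLENGTH)        # the separator itself
        out = (out == "" ? sep : out " " sep)   # join separators with a space
        s = substr(s, RSTART + RLENGTH)         # continue after the separator
    }
    print out
}')
echo "$seps"
```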

Let's quote the man page of GNU awk:

split(string, array [, fieldsep [, seps ] ])

Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension, with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space, then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n], where n is the return value of split() (i.e., the number of elements in array).
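Because split() returns the number of elements it created, a loop can handle any number of pieces instead of hard-coding a[3] a[2] a[1]. The following is a minimal sketch assuming a POSIX awk; the names input and reversed are illustrative and do not come from the answers.

```shell
# Reverse the pieces of a pipe-delimited string, however many there are.
# "input" and "reversed" are hypothetical names used only for this sketch.
input='12|23|11'
reversed=$(awk -v s="$input" 'BEGIN {
    n = split(s, parts, "|")     # n is the number of pieces
    out = ""
    for (i = n; i >= 1; i--)     # walk the array from last to first
        out = out parts[i]
    print out
}')
echo "$reversed"
```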

Answered by Dimitre Radoulov

Please be more specific! What do you mean by "it doesn't work"? Post the exact output (or error message), your OS and awk version:

% awk -F\| '{
  for (i = 0; ++i <= NF;)
    print i, $i
  }' <<<'12|23|11'
1 12
2 23
3 11

Or, using split:

% awk '{
  n = split($0, t, "|")
  for (i = 0; ++i <= n;)
    print i, t[i]
  }' <<<'12|23|11'
1 12
2 23
3 11

Edit: on Solaris you'll need to use the POSIX awk (/usr/xpg4/bin/awk) in order to process 4000 fields correctly.

Answered by TrueY

I do not like the echo "..." | awk ... solution, as it makes unnecessary fork and exec system calls.

I prefer Dimitre's solution with a little twist:

awk -F\| '{print $3 $2 $1}' <<<'12|23|11'

Or a slightly shorter version:

awk -F\| '$0=$3 $2 $1' <<<'12|23|11'

In this case the assignment to $0 builds the output record, and the assigned value also serves as the pattern. Since it is a true condition (non-empty), the record gets printed.
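One consequence of using the assignment as a pattern is that a record whose rearranged value is empty is silently skipped. The sketch below, assuming a POSIX awk, feeds two records to both forms and counts the lines each prints; count_lines, short_form and explicit_form are hypothetical names introduced for the illustration.

```shell
# Compare the pattern-only form with an explicit print when a record is empty.
# count_lines, short_form and explicit_form are illustrative names only.
count_lines() { awk 'END { print NR }'; }

# Second record "||" has three empty fields, so $3 $2 $1 is the empty string.
short_form=$(printf '%s\n' '12|23|11' '||' | awk -F'|' '$0 = $3 $2 $1' | count_lines)
explicit_form=$(printf '%s\n' '12|23|11' '||' | awk -F'|' '{ print $3 $2 $1 }' | count_lines)

echo "pattern form printed $short_form line(s); explicit print printed $explicit_form"
```

The explicit print form costs a few more characters but does not depend on the value being true.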

In this specific case the stdin redirection can be avoided by setting an awk internal variable:

awk -v T='12|23|11' 'BEGIN{split(T,a,"|");print a[3] a[2] a[1]}'

I used ksh for quite a while, but in bash this can be handled with built-in string manipulation. In the first case the original string is split at the internal separator. In the second case it is assumed that the string always contains digit pairs separated by a one-character separator.

T='12|23|11';echo -n ${T##*|};T=${T%|*};echo ${T#*|}${T%|*}
T='12|23|11';echo ${T:6}${T:3:2}${T:0:2}

The result in all cases is

112312
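A related shell-only approach, which is not part of the original answer and is shown here only as a sketch, lets the shell's own word splitting do the work by temporarily setting IFS to the separator. It assumes a POSIX sh and does not rely on fixed field widths; old_ifs and result are illustrative names.

```shell
# Split on "|" using the shell's word splitting instead of awk.
# old_ifs and result are hypothetical names for this sketch.
T='12|23|11'
old_ifs=$IFS
IFS='|'
set -f          # disable globbing so pieces like "*" are not expanded
set -- $T       # unquoted on purpose: the pieces become $1, $2, $3
set +f
IFS=$old_ifs
result="$3$2$1"
echo "$result"
```

Note that this overwrites the script's positional parameters, so it fits best inside a function or a short standalone script.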

Answered by Sven

Actually, awk has a feature called the 'Input Field Separator Variable'. This is how to use it. It's not really an array, but it uses the internal $ variables. For splitting a simple string it is easier.

echo "12|23|11" | awk 'BEGIN {FS="|";} { print $1, $2, $3 }'

Answered by Schildmeijer

echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'

Answered by codaddict

echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'

should work.

Answered by duedl0r

Joke? :)

How about echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'

This is my output:

p2> echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
112312

so I guess it's working after all..

Answered by Qorbani

I know this is kind of an old question, but I thought someone might like my trick, especially since this solution is not limited to a specific number of items.

# Convert to an array
_ITEMS=($(echo "12|23|11" | tr '|' '\n'))

# Output array items
for _ITEM in "${_ITEMS[@]}"; do
  echo "Item: ${_ITEM}"
done

The output will be:

Item: 12
Item: 23
Item: 11