Linux Bash: How to tokenize a string variable?

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/5382712/

Time: 2020-08-04 00:33:05  Source: igfitidea

Bash: How to tokenize a string variable?

linux bash

Asked by Jake Wilson

If I have a string variable whose value is "john is 17 years old", how do I tokenize it using spaces as the delimiter? Would I use awk?

Accepted answer by John Kugelman

Use the shell's automatic tokenization of unquoted variables:

$ string="john is 17 years old"
$ for word in $string; do echo "$word"; done
john
is
17
years
old
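
One caveat with this approach: an unquoted expansion is also subject to filename globbing, so a token such as * would be replaced by the names of matching files. A minimal sketch, assuming the string may contain glob characters, turns globbing off with set -f inside a command substitution so the setting does not leak into the rest of the script (the variable name words is just for illustration):

```bash
string="john is * years old"
# The command substitution runs in a subshell, so 'set -f' (disable globbing)
# only applies while the loop runs.
words=$(
  set -f
  for word in $string; do
    echo "$word"
  done
)
echo "$words"
```

With globbing disabled, the * comes through as a literal token instead of a file list.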

If you want to change the delimiter you can set the $IFS variable, which stands for internal field separator. The default value of $IFS is " \t\n" (space, tab, newline).

$ string="john_is_17_years_old"
$ (IFS='_'; for word in $string; do echo "$word"; done)
john
is
17
years
old

(Note that in this second example I added parentheses around the second line. This creates a sub-shell so that the change to $IFS doesn't persist. You generally don't want to permanently change $IFS as it can wreak havoc on unsuspecting shell commands.)
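
Another way to keep the change local, sketched here as an alternative to the sub-shell rather than as part of the answer above, is to prefix a single read command with the IFS assignment. The assignment then applies to that one command only. Note that read -a and the <<< here-string are bash features, and the array name words is just for illustration:

```bash
string="john_is_17_years_old"
# IFS='_' is in effect only for this read command.
# -r keeps backslashes literal, -a stores the fields in an array.
IFS='_' read -r -a words <<< "$string"
for word in "${words[@]}"; do
  echo "$word"
done
```

Unlike the sub-shell version, the tokens remain available in the current shell afterwards.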

Answer by harshit

You can try something like this:

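#!/bin/bash
n=0
str=
a=/home/file.txt
# The unquoted $(...) is split on whitespace, giving one token per iteration
for i in $(tr ' ' '\n' < "$a"); do
   str="${str:+$str,}$i"   # comma-separated list of the tokens so far
   n=$((n + 1))
   var="var$n"
   echo "$var is ... $i"
done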

Answer by Diego Torres Milano

$ string="john is 17 years old"
$ tokens=( $string )
$ echo ${tokens[*]}
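
Once the tokens are in an array, individual words can be picked out by index. A short sketch of that, using the same string as above (bash arrays are zero-based):

```bash
string="john is 17 years old"
tokens=( $string )
echo "${#tokens[@]}"   # number of tokens
echo "${tokens[0]}"    # first token
echo "${tokens[2]}"    # third token
# Quoting "${tokens[@]}" expands to one word per array element
for t in "${tokens[@]}"; do
  printf '[%s]\n' "$t"
done
```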

For other delimiters, like ';':

$ string="john;is;17;years;old"
$ IFS=';' tokens=( $string )
$ echo ${tokens[*]}
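
Be aware that a line made up only of assignments, like the IFS=';' tokens=( ... ) line above, changes IFS for the rest of the current shell. A minimal sketch that saves and restores the separator, assuming IFS was set to begin with (old_ifs is a made-up name):

```bash
string="john;is;17;years;old"
old_ifs=$IFS            # remember the current separator
IFS=';'
tokens=( $string )      # split on ';'
IFS=$old_ifs            # restore it so later commands behave as usual
echo "${#tokens[@]}"
echo "${tokens[2]}"
```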

Answer by kurumi

$ string="john is 17 years old"
$ set -- $string
$ echo $1
john
$ echo $2
is
$ echo $3
17
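
With the tokens stored as positional parameters, $# gives the count and "$@" lets you loop over all of them. A small sketch, keeping in mind that set -- overwrites any arguments the script or function was called with:

```bash
string="john is 17 years old"
set -- $string          # replaces the current positional parameters
echo "$#"               # number of tokens
for word in "$@"; do
  echo "$word"
done
```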

Answer by Mila Nautikus

With extended regular expressions in sed (note that \W and \n in the replacement are GNU extensions rather than POSIX):

$ str='a b     c d'
$ echo "$str" | sed -E 's/\W+/\n/g' | hexdump -C
00000000  61 0a 62 0a 63 0a 64 0a                           |a.b.c.d.|
00000008

This is like Python's re.split(r'\W+', str).

\W matches a non-word character,
including space, tab, newline, return [like the bash for tokenizer],
but also including symbols like quotes, brackets, signs, ...

... except the underscore sign _,
so snake_case is one word, but kebab-case is two words.

Leading and trailing spaces will each create an empty line.
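
If those empty lines are unwanted, one option is to filter them out afterwards. A sketch assuming GNU sed (for \W and \n in the replacement) and using grep -v '^$' to drop blank lines; the variable name tokens is just for illustration:

```bash
str='  a b     c d  '
# GNU sed assumed: \W and \n in the replacement are GNU extensions.
# grep -v '^$' removes the empty lines produced by the leading and trailing spaces.
tokens=$(echo "$str" | sed -E 's/\W+/\n/g' | grep -v '^$')
echo "$tokens"
```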
