How to perform a for loop on each character in a string in Bash?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/10551981/

Time: 2020-09-09 22:03:14  Source: igfitidea

bash for-loop

Asked by Village

I have a variable like this:

words="这是一条狗。"

I want to make a for loop on each of the characters, one at a time, e.g. first character="这", then character="是", character="一", etc.

The only way I know is to output each character to separate line in a file, then use while read line, but this seems very inefficient.

  • How can I process each character in a string through a for loop?

Accepted answer by Rony

With sed on the dash shell and LANG=en_US.UTF-8, I got the following working correctly:

$ echo "你好嗎 新年好。全型句號" | sed -e 's/\(.\)/\1\n/g'
你
好
嗎

新
年
好
。
全
型
句
號

and

$ echo "Hello world" | sed -e 's/\(.\)/\1\n/g'
H
e
l
l
o

w
o
r
l
d

Thus, output can be looped with while read ... ; do ... ; done
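As a rough sketch of that last step, the sed output can be fed to a while read loop through process substitution. This assumes GNU sed (which accepts \n in the replacement) and, for non-ASCII text, a UTF-8 locale; chars_via_sed is a hypothetical helper name, not part of the answer.

```bash
#!/usr/bin/env bash
# Sketch: loop over the one-character-per-line output of sed.
# Assumption: GNU sed, which understands \n in the replacement text.
chars_via_sed() {
    local line
    while IFS= read -r line; do
        # The newline that ends the input leaves one empty line; skip it.
        [ -n "$line" ] || continue
        printf '<%s>\n' "$line"
    done < <(printf '%s\n' "$1" | sed -e 's/\(.\)/\1\n/g')
}

chars_via_sed "Hi you"
```

IFS= and -r keep a lone space or backslash intact, so the space between the two words still shows up as its own item.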

Edited to translate the sample text into English:

"你好嗎 新年好。全型句號" is zh_TW.UTF-8 encoding for:
"你好嗎"     = How are you[ doing]
" "         = a normal space character
"新年好"     = Happy new year
"。全型句號" = a double-byte-sized full stop followed by its text description

Answer by chepner

You can use a C-style for loop:

foo=string
for (( i=0; i<${#foo}; i++ )); do
  echo "${foo:$i:1}"
done

${#foo} expands to the length of foo. ${foo:$i:1} expands to the substring of length 1 starting at position $i.
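A minimal sketch of the same two expansions, collecting the characters into an array instead of printing them. The function name split_chars and the global array chars are hypothetical names chosen for illustration.

```bash
#!/usr/bin/env bash
# Sketch: gather each character of a string into an array using
# ${#str} (length) and ${str:i:1} (one-character substring).
split_chars() {
    local str=$1 i
    chars=()                      # global result array (hypothetical name)
    for (( i = 0; i < ${#str}; i++ )); do
        chars+=("${str:i:1}")     # quoting keeps spaces as elements
    done
}

split_chars "a b"
printf '[%s]\n' "${chars[@]}"
```

Because the substring is quoted when appended, whitespace characters survive as array elements of their own.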

Answer by Tiago Peczenyj

${#var} returns the length of var

${var:pos:N} returns N characters from pos onwards

Examples:

$ words="abc"
$ echo ${words:0:1}
a
$ echo ${words:1:1}
b
$ echo ${words:2:1}
c

so it is easy to iterate.

Another way:

$ grep -o . <<< "abc"
a
b
c

or

$ grep -o . <<< "abc" | while read letter;  do echo "my letter is $letter" ; done 

my letter is a
my letter is b
my letter is c

Answer by Six

I'm surprised no one has mentioned the obvious bash solution utilizing only while and read.

while read -n1 character; do
    echo "$character"
done < <(echo -n "$words")

Note the use of echo -n to avoid the extraneous newline at the end. printf is another good option and may be more suitable for your particular needs. If you want to ignore whitespace, then replace "$words" with "${words// /}".
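A small sketch of those two variations together, assuming a sample value for words: printf '%s' stands in for echo -n, and "${words// /}" removes the spaces before the loop reads anything.

```bash
#!/usr/bin/env bash
words="a b c"   # sample value, assumed for illustration

# printf '%s' writes the string without a trailing newline, like echo -n.
# "${words// /}" deletes every space character from the expanded value.
count=0
while read -r -n1 character; do
    count=$((count + 1))
done < <(printf '%s' "${words// /}")

echo "$count non-space characters"
```

Note that the substitution only removes literal spaces; tabs and newlines would need their own patterns.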

Another option is fold. Please note however that it should never be fed into a for loop. Rather, use a while loop as follows:

while read char; do
    echo "$char"
done < <(fold -w1 <<<"$words")

The primary benefit of using the external fold command (of the coreutils package) would be brevity. You can feed its output to another command such as xargs (part of the findutils package) as follows:

fold -w1 <<<"$words" | xargs -I% -- echo %

You'll want to replace the echo command used in the example above with the command you'd like to run against each character. Note that xargs will discard whitespace by default. You can use -d '\n' to disable that behavior.



Internationalization

I just tested fold with some of the Asian characters and realized it doesn't have Unicode support. So while it is fine for ASCII needs, it won't work for everyone. In that case there are some alternatives.

I'd probably replace fold -w1 with an awk array:

awk 'BEGIN{FS=""} {for (i=1;i<=NF;i++) print $i}'

Or the grep command mentioned in another answer:

grep -o .
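To show how such a replacement might slot into the earlier while loop, here is a sketch that wraps the awk one-liner in a function. The name each_char_awk is hypothetical. Splitting on an empty FS is an awk extension (supported by gawk and mawk, among others) rather than POSIX behavior, and multibyte characters additionally need a UTF-8 locale and a locale-aware awk.

```bash
#!/usr/bin/env bash
# Sketch: emit one character per line with awk, then loop over the lines.
# Assumption: the installed awk treats FS="" as "split into characters".
each_char_awk() {
    awk 'BEGIN{FS=""} {for (i=1;i<=NF;i++) print $i}' <<< "$1"
}

while IFS= read -r char; do
    echo "char: $char"
done < <(each_char_awk "abc")
```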



Performance

FYI, I benchmarked the 3 aforementioned options. The first two were fast, nearly tying, with the fold loop slightly faster than the while loop. Unsurprisingly, xargs was the slowest... 75x slower.

Here is the (abbreviated) test code:

words=$(python -c 'from string import ascii_letters as l; print(l * 100)')

testrunner(){
    for test in test_while_loop test_fold_loop test_fold_xargs test_awk_loop test_grep_loop; do
        echo "$test"
        (time for (( i=1; i<$((${1:-100} + 1)); i++ )); do "$test"; done >/dev/null) 2>&1 | sed '/^$/d'
        echo
    done
}

testrunner 100

Here are the results:

test_while_loop
real    0m5.821s
user    0m5.322s
sys     0m0.526s

test_fold_loop
real    0m6.051s
user    0m5.260s
sys     0m0.822s

test_fold_xargs
real    7m13.444s
user    0m24.531s
sys     6m44.704s

test_awk_loop
real    0m6.507s
user    0m5.858s
sys     0m0.788s

test_grep_loop
real    0m6.179s
user    0m5.409s
sys     0m0.921s

Answer by Thunderbeef

I believe there is still no ideal solution that correctly preserves all whitespace characters and is fast enough, so I'll post my answer. Using ${foo:$i:1} works, but is very slow, which is especially noticeable with large strings, as I will show below.

My idea is an expansion of a method proposed by Six, which involves read -n1, with some changes to keep all characters and work correctly for any string:

while IFS='' read -r -d '' -n 1 char; do
        # do something with $char
done < <(printf %s "$string")

How it works:

  • IFS='' - Redefining the internal field separator as an empty string prevents stripping of spaces and tabs. Doing it on the same line as read means that it will not affect other shell commands.
  • -r - Means "raw", which prevents read from treating \ at the end of the line as a special line concatenation character.
  • -d '' - Passing an empty string as the delimiter prevents read from stripping newline characters. It actually means that the null byte is used as the delimiter. -d '' is equal to -d $'\0'.
  • -n 1 - Means that one character at a time will be read.
  • printf %s "$string" - Using printf instead of echo -n is safer, because echo treats -n and -e as options. If you pass "-e" as a string, echo will not print anything.
  • < <(...) - Passes the string to the loop using process substitution. If you use a here-string instead (done <<< "$string"), an extra newline character is appended at the end. Also, passing the string through a pipe (printf %s "$string" | while ...) would make the loop run in a subshell, which means all variable operations are local to the loop.
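The points above can be seen together in a short sketch that counts characters, whitespace included. The sample string and the counter names total and whitespace are assumptions made for illustration.

```bash
#!/usr/bin/env bash
string=$' a\tb\n'   # sample input, assumed: space, a, tab, b, newline

total=0
whitespace=0
while IFS='' read -r -d '' -n 1 char; do
    total=$((total + 1))
    case $char in
        [[:space:]]) whitespace=$((whitespace + 1)) ;;
    esac
done < <(printf %s "$string")

# Process substitution keeps the loop in the current shell,
# so the counters are still set after the loop ends.
echo "total=$total whitespace=$whitespace"
```

With a pipe in place of the process substitution, both counters would still read 0 after the loop, since the increments would happen in a subshell.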

Now, let's test the performance with a huge string. I used the following file as a source:
https://www.kernel.org/doc/Documentation/kbuild/makefiles.txt
The following script was called through the time command:

#!/bin/bash

# Saving contents of the file into a variable named `string'.
# This is for test purposes only. In real code, you should use
# `done < "filename"' construct if you wish to read from a file.
# Using `string="$(cat makefiles.txt)"' would strip trailing newlines.
IFS='' read -r -d '' string < makefiles.txt

while IFS='' read -r -d '' -n 1 char; do
        # remake the string by adding one character at a time
        new_string+="$char"
done < <(printf %s "$string")

# confirm that new string is identical to the original
diff -u makefiles.txt <(printf %s "$new_string")

And the result is:

$ time ./test.sh

real    0m1.161s
user    0m1.036s
sys     0m0.116s

As we can see, it is quite fast.
Next, I replaced the loop with one that uses parameter expansion:

for (( i=0 ; i<${#string}; i++ )); do
    new_string+="${string:$i:1}"
done

The output shows exactly how bad the performance loss is:

$ time ./test.sh

real    2m38.540s
user    2m34.916s
sys     0m3.576s

The exact numbers may vary on different systems, but the overall picture should be similar.

Answer by William Pursell

I've only tested this with ASCII strings, but you could do something like:

while test -n "$words"; do
   c=${words:0:1}     # Get the first character
   echo character is "'$c'"
   words=${words:1}   # trim the first character
done
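The loop above consumes $words as it runs, so the variable is empty afterwards. One possible variation, sketched here with the hypothetical function name print_chars, works on a local copy instead:

```bash
#!/usr/bin/env bash
# Sketch: the same "peel off the first character" loop, run on a local
# copy so the caller's variable is left intact.
print_chars() {
    local rest=$1 c
    while [ -n "$rest" ]; do
        c=${rest:0:1}      # first character
        printf "character is '%s'\n" "$c"
        rest=${rest:1}     # drop the first character
    done
}

words="abc"
print_chars "$words"
echo "words is still: $words"
```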

Answer by De Novo

The C-style loop in @chepner's answer is in the shell function update_terminal_cwd, and the grep -o . solution is clever, but I was surprised not to see a solution using seq. Here's mine:

read word
for i in $(seq 1 ${#word}); do
  echo "${word:i-1:1}"
done

Answer by sebix

It is also possible to split the string into a character array using fold and then iterate over this array:

for char in `echo "这是一条狗。" | fold -w1`; do
    echo $char
done
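If an actual bash array is wanted, a sketch along the following lines may help. It swaps in grep -o . as the splitter (not the fold used above) and relies on mapfile, which assumes bash 4 or newer. Reading into an array this way avoids the word splitting and filename globbing that an unquoted command substitution in a for loop goes through.

```bash
#!/usr/bin/env bash
# Sketch: read one character per line into a real array with mapfile.
# Assumptions: bash 4+ for mapfile; grep -o . as the splitter.
str="a*b"   # sample input, assumed; the * would glob if left unquoted
mapfile -t chars < <(grep -o . <<< "$str")

for char in "${chars[@]}"; do
    echo "$char"
done
```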

Answer by sebix

Another approach, if you don't care about whitespace being ignored:

for char in $(sed -E s/'(.)'/'\1 '/g <<<"$your_string"); do
    # Handle $char here
done

Answer by Javier Salas

Another way is:

Characters="TESTING"
index=1
while [ "$index" -le "${#Characters}" ]
do
    # cut -c N-N prints only the Nth character (1-based)
    echo "${Characters}" | cut -c"${index}-${index}"
    index=$((index + 1))
done