Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/26212889/

Date: 2020-09-18 11:28:56  Source: igfitidea

Bash - Counting substrings in a string

linux bash shell ubuntu

Asked by choroba

I am kinda struggling to write a command which counts how many times a sub-string appears in a string.

Instead of running the code below 10 times, I would rather first count how many times the sub-string appears and then adapt the "for" loop based on the result:

Here you can see the code:

CommandResult="Interface    Chipset     Driver     mon0    Unknown      iwlwifi - [phy0]wlan0       Unknown     iwlwifi - [phy0]"

for i in `seq 0 9`;
do
  InstanceID="mon"$i

  if echo "$CommandResult" | grep -q "$InstanceID"; then
    echo "found"
  fi
done

Any help would be appreciated!

Thanks,

Answered by Cyrus

Feel free to try this to get the number:

echo "$CommandResult" | tr " " "\n" | grep -c "$InstanceID"
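
The tr step matters because grep -c counts matching lines, not individual matches. A minimal sketch of the difference, assuming a made-up sample string (not the asker's real output) in which mon0 appears twice:

```bash
#!/usr/bin/env bash
# Hypothetical sample data for illustration: "mon0" occurs twice on one line.
sample="mon0 Unknown iwlwifi mon0 Unknown iwlwifi mon1"
InstanceID="mon0"

# Without tr the whole string is a single line, so grep -c reports 1.
lines_matching=$(printf '%s\n' "$sample" | grep -c "$InstanceID")

# With tr every word sits on its own line, so grep -c counts matching words.
words_matching=$(printf '%s\n' "$sample" | tr ' ' '\n' | grep -c "$InstanceID")

echo "without tr: $lines_matching, with tr: $words_matching"
# prints: without tr: 1, with tr: 2
```

Note that a word containing the sub-string more than once would still be counted only once with this approach.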

Answered by choroba

I'd use grep -o to extract the desired string from the output:

#!/bin/bash
CommandResult="Interface    Chipset     Driver     mon0    Unknown      iwlwifi - [phy0]wlan0       Unknown     iwlwifi - [phy0]
Interface    Chipset     Driver     mon12    Unknown      iwlwifi - [phy0]wlan0       Unknown     iwlwifi - [phy0]"

for InstanceId in $(grep -o 'mon[0-9]\+' <<< "$CommandResult") ; do
    echo "found $InstanceId $(grep -c "$InstanceId" <<< "$CommandResult") times"
done
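
Two things to keep in mind with the loop above: grep -c counts matching lines rather than matches, and an unanchored pattern such as mon1 would also match mon12. One possible variation, sketched here with a made-up sample string, is to let sort and uniq -c tally the exact strings that grep -o extracted:

```bash
#!/usr/bin/env bash
# Hypothetical sample data for illustration: mon0 appears twice, mon12 once.
CommandResult="mon0 Unknown iwlwifi mon12 Unknown iwlwifi mon0"

# grep -o prints every match on its own line; sort | uniq -c tallies them.
tally=$(grep -oE 'mon[0-9]+' <<< "$CommandResult" | sort | uniq -c)

# Each tally line is "<count> <name>"; read strips the leading padding.
while read -r count name; do
    echo "found $name $count times"
done <<< "$tally"
```

Each distinct interface name is reported once, with the number of times it was extracted.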

Answered by Cyclonecode

You could do it like this:

#!/bin/bash
CommandResult="Interface    Chipset     Driver     mon0    Unknown      iwlwifi - [phy0]wlan0       Unknown     iwlwifi - [phy0]"
InstanceId="mon0"
# grep -o prints each match on its own line; wc -l counts those lines
count=$(grep -o "$InstanceId" <<< "$CommandResult" | wc -l)
echo "$InstanceId encountered $count times"

The above would produce an output like this:

mon0 encountered 1 times

The above could easily be expanded to take a string as input:

#!/bin/bash
# the string to search is passed as the first argument
CommandResult="$1"
InstanceId="mon0"
count=$(grep -o "$InstanceId" <<< "$CommandResult" | wc -l)
echo "$InstanceId encountered $count times"

Then you could call it like this:

./script.sh "Interface chipset mon0 mon0 unknown .   test"

or perhaps send the output from another command as an argument:

./script.sh "$(cat file.txt)"

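Of course, xargs would also work, although it passes each word as a separate argument, so a script that only reads its first argument sees just the first word: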

cat script.txt | xargs ./script.sh
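
The same grep -o | wc -l idea can also be wrapped in a function so that both the string and the sub-string are parameters. This is only a sketch: the name count_occurrences is made up for illustration, and -F is added on the assumption that the sub-string should be matched literally rather than as a regular expression.

```bash
#!/usr/bin/env bash
# count_occurrences STRING SUBSTRING (hypothetical helper, not from the answer)
count_occurrences() {
    # -o: print each match on its own line
    # -F: treat the sub-string as a fixed string, not a regex
    # --: stop option parsing in case the sub-string starts with a dash
    grep -oF -- "$2" <<< "$1" | wc -l
}

count_occurrences "Interface chipset mon0 mon0 unknown .   test" "mon0"
```

For the sample string above the function reports 2. Some wc implementations pad the number with leading spaces, so compare it numerically rather than as text.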

Answered by TimoM

Wanted to share another solution. 3.5 years later, but better late than never, I guess. ;)

I was working on a script that had to do the same thing, that is, get the number of substrings in a string, and I was using count=$(echo "$string" | grep -o ... | wc -l) at first.
I got what I wanted, but when looping through ~1500 files with 0...8000 lines in each, the performance was just terrible: the script took about 49 minutes to complete.
So I went and searched for alternative approaches and eventually found this:

InstanceId="mon0";
tmp="${CommandResult//$InstanceId}"
count=$(((${#CommandResult} - ${#tmp}) / ${#InstanceId}))

Got the same result but waaaay quicker, in 8-9 minutes.

Explained:

tmp="${CommandResult//$InstanceId}"

This removes all occurrences of $InstanceId from $CommandResult and places the result in tmp.
Actually, we're using substring replacement here with the replacement string left out. The syntax for substring replacement is ${string//substring/replacement} (this replaces all occurrences of substring with replacement).

count=$(((${#CommandResult} - ${#tmp}) / ${#InstanceId}))

This gives us the number of occurrences of $InstanceId in $CommandResult.
${#string} gives the string length, so (${#CommandResult} - ${#tmp}) is the total length of all occurrences of $InstanceId (remember, we removed all occurrences of $InstanceId from $CommandResult and placed the result in $tmp).
Then we just divide that difference by the length of $InstanceId to get the number of $InstanceId occurrences.

For further info about substring replacement and string length, see e.g. https://www.tldp.org/LDP/abs/html/string-manipulation.html.
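
The calculation explained above could be wrapped in a small function, roughly as sketched below. The name count_substr, the empty-needle guard, and the quoting of the pattern are additions for illustration, not part of the original answer. Quoting the needle inside the expansion makes Bash match it literally instead of as a glob pattern.

```bash
#!/usr/bin/env bash
# count_substr HAYSTACK NEEDLE (hypothetical helper name)
count_substr() {
    local haystack=$1 needle=$2
    # An empty needle would cause a division by zero below.
    [ -n "$needle" ] || { echo 0; return; }
    # Remove every occurrence of the needle; the inner quotes keep
    # characters such as * or ? from being treated as glob patterns.
    local stripped=${haystack//"$needle"}
    # Removed characters divided by the needle length = number of occurrences.
    echo $(( (${#haystack} - ${#stripped}) / ${#needle} ))
}

count_substr "mon0 Unknown mon0 wlan0 mon1" "mon0"   # prints 2
count_substr "a*b a*b axb" "a*b"                     # prints 2
```

Occurrences are counted without overlap, so searching "aaa" for "aa" gives 1.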

Answered by jm666

You can use:

mon_num=$(airmon-ng | grep -Poc '\bmon\d+\b')
echo here are $mon_num mon interfaces

for m in $(airmon-ng | grep -Po '\bmon\d+\b')
do
        #do something
        echo "this is $m"
done

For airmon-ng output such as:

Interface   Chipset     Driver

mon0        Unknown     iwlwifi - [phy0]
mon1        Unknown     iwlwifi - [phy0]
mon2        Unknown     iwlwifi - [phy0]
wlan0       Unknown     iwlwifi - [phy0]

it will print:

here are 3 mon interfaces
this is mon0
this is mon1
this is mon2
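
If running the command twice is a concern, the matches could be collected once into an array and counted from there. The sketch below assumes Bash 4 or later (for mapfile) and uses a variable holding the sample output above as a stand-in for airmon-ng, so that it runs without the tool installed. The variable names are made up for illustration.

```bash
#!/usr/bin/env bash
# Stand-in for the airmon-ng output shown above.
sample_output='Interface   Chipset     Driver

mon0        Unknown     iwlwifi - [phy0]
mon1        Unknown     iwlwifi - [phy0]
mon2        Unknown     iwlwifi - [phy0]
wlan0       Unknown     iwlwifi - [phy0]'

# Collect every mon<N> word once; -w restricts matches to whole words.
mapfile -t mons < <(grep -owE 'mon[0-9]+' <<< "$sample_output")

# The array length is the count, and the same array drives the loop.
echo "here are ${#mons[@]} mon interfaces"
for m in "${mons[@]}"; do
    echo "this is $m"
done
```

With the sample data this reports 3 interfaces followed by mon0, mon1 and mon2.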