Bash script to extract all matches of a regex pattern

Question

Asked by Neeladri Vishweswaran

I found this but it assumes the words are space separated.

result="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"

for word in $result
do
    if echo $word | grep -qi '(ADDNAME\d\d.*HELLO)'
    then
        match="$match $word"
    fi
done

POST EDITED

Re-naming for clarity:

data="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"
for word in $data
do
    if echo $word | grep -qi '(ADDNAME\d\d.*HELLO)'
    then
        match="$match $word"
    fi
done
echo $match

Original left in place so that comments asking about result continue to make sense.

Answer 1

Answered by Paused until further notice.

Edit: answer to edited question:

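# Needs GNU grep: -P enables Perl-style regexes (for the non-greedy .*?),
# and -o prints each match on its own line.
# The command substitution is left unquoted so the loop runs once per match.
for string in $(echo "$result" | grep -Po 'ADDNAME[0-9]{2}.*?HELLO')
do
    match="${match:+$match }$string"
done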

Original answer:

If you're using Bash version 3.2 or higher, you can use its regex matching.

string="string to search 99 with 88 some 42 numbers"
pattern="[0-9]{2}"
for word in $string
do
    [[ $word =~ $pattern ]]
    if [[ ${BASH_REMATCH[0]} ]]
    then
        match="${match:+$match }${BASH_REMATCH[0]}"
    fi
done

The result will be "99 88 42".
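
The loop above still relies on word splitting, so it does not directly handle the unseparated data string from the question. As a rough sketch only: the same BASH_REMATCH mechanism can be applied repeatedly, trimming the string after each match. This assumes the text between ADDNAME and HELLO consists of lowercase letters, because Bash's extended regexes have no non-greedy operator. The variable names rest and matches are illustrative and not part of the original answer.

```shell
#!/usr/bin/env bash
data="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"
# POSIX ERE has no non-greedy quantifier, so [a-z]* stands in for .*?
# (assumption: only lowercase letters appear between ADDNAME and HELLO).
pattern='ADDNAME[0-9]{2}[a-z]*HELLO'

rest=$data
matches=()
while [[ $rest =~ $pattern ]]
do
    matches+=("${BASH_REMATCH[0]}")
    # Drop everything up to and including the match just found.
    rest=${rest#*"${BASH_REMATCH[0]}"}
done

# One match per line.
printf '%s\n' "${matches[@]}"
```

Collecting the matches in an array rather than a space-joined string keeps each match intact even if it were to contain whitespace.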

Answer 2

Answered by Daenyth

Use grep -o

-o, --only-matching show only the part of a line matching PATTERN
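
As a minimal sketch of how this might be applied to the data from the question: -o is combined here with -E (extended regexes), and the text between ADDNAME and HELLO is assumed to be lowercase letters so that greedy matching does not swallow both occurrences. The variable name matches is illustrative.

```shell
data="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"

# -o prints each matching part on its own line; -E enables {2} without backslashes.
matches=$(printf '%s\n' "$data" | grep -oE 'ADDNAME[0-9]{2}[a-z]*HELLO')

printf '%s\n' "$matches"
```

Note that -o is supported by GNU and BSD grep but is not required by POSIX.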

Answer 3

Answered by Jonathan Leffler

Not very elegant - and there are problems because of greedy matching - but this more or less works:

data="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"
for word in $data \
    "ADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg" \
    "ADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLO"
do
    echo $word
done |
sed -e '/ADDNAME[0-9][0-9][a-z]*HELLO/{
        s/\(ADDNAME[0-9][0-9][a-z]*HELLO\)/ \1 /g
        }' |
while read line
do
    set -- $line
    for arg in "$@"
    do echo $arg
    done
done |
grep "ADDNAME[0-9][0-9][a-z]*HELLO"

The first loop echoes three lines of data - you'd probably replace that with cat or I/O redirection. The sed script uses a modified regex to put spaces around the patterns. The last loop breaks up the 'space separated words' into one 'word' per line. The final grep selects the lines you want.

The regex is modified with [a-z]* in place of the original .* because the pattern matching is greedy. If the data between ADDNAME and HELLO is unconstrained, then you need to think about using non-greedy regexes, which are available in Perl and probably Python and other modern scripting languages:

#!/bin/perl -w
while (<>)
{
    while (/(ADDNAME\d\d.*?HELLO)/g)
    {
        print "$1\n";
    }
}

This is a good demonstration of using the right tool for the job.
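
To make the greedy-matching problem concrete, here is a small sketch that runs the unconstrained pattern and the constrained one over the same data. It uses grep -oE rather than the sed pipeline, and the variable names greedy and constrained are illustrative.

```shell
data="abcdefADDNAME25abcdefgHELLOabcdefgADDNAME25abcdefgHELLOabcdefg"

# Greedy .* runs from the first ADDNAME to the last HELLO: a single long match.
greedy=$(printf '%s\n' "$data" | grep -oE 'ADDNAME[0-9]{2}.*HELLO')

# [a-z]* cannot cross the uppercase HELLO, so each occurrence is matched separately.
constrained=$(printf '%s\n' "$data" | grep -oE 'ADDNAME[0-9]{2}[a-z]*HELLO')

printf 'greedy:\n%s\n' "$greedy"
printf 'constrained:\n%s\n' "$constrained"
```

The constrained form only works because the filler text here never contains uppercase letters; with arbitrary filler, a non-greedy regex such as the Perl one is the safer choice.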

