bash: How do I assign a regular expression match to a variable?

Question

Asked by samoz

I have a text file with various entries in it. Each entry ends with a line consisting entirely of asterisks.

I'd like to use shell commands to parse this file and assign each entry to a variable. How can I do this?

Here's an example input file:

***********
Field1
***********
Lorem ipsum
Data to match
***********
More data
Still more data
***********

Here is what my solution looks like so far:

#!/bin/bash
for error in `python example.py | sed -n '/.*/,/^\**$/p'`
do
    echo -e $error
    echo -e "\n"
done

However, this just assigns each word in the matched text to $error, rather than a whole block.

Answer 1

Answered by Cascabel

I'm surprised not to see a native bash solution here. Yes, bash has regular expressions. You can find plenty of documentation online, particularly if you include "bash_rematch" in your query, or just look at the man pages. Here's a silly example, taken from here and slightly modified, which prints the whole match and each of the captured matches for a regular expression.

if [[ $str =~ $regex ]]; then
    echo "$str matches"
    echo "matching substring: ${BASH_REMATCH[0]}"
    i=1
    n=${#BASH_REMATCH[*]}
    while [[ $i -lt $n ]]
    do
        echo "  capture[$i]: ${BASH_REMATCH[$i]}"
        let i++
    done
else
    echo "$str does not match"
fi

The important bit is that the extended test [[ ... ]], using its regex comparison operator =~, stores the entire match in ${BASH_REMATCH[0]} and the captured matches in ${BASH_REMATCH[i]}.
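To tie this back to the question's title, here is a minimal sketch that copies the match and its capture groups into ordinary variables. The sample string, the pattern, and the variable names whole, code, and message are made up for illustration and do not come from the answer.

```shell
#!/bin/bash
# Hypothetical sample input and pattern, chosen only for illustration.
str="error 42: disk full"
regex='^error ([0-9]+): (.*)$'

if [[ $str =~ $regex ]]; then
    whole=${BASH_REMATCH[0]}    # the entire matched text
    code=${BASH_REMATCH[1]}     # first capture group
    message=${BASH_REMATCH[2]}  # second capture group
    echo "code=$code message=$message"
fi
```

Keeping the pattern in a variable and leaving it unquoted on the right-hand side of =~ avoids bash treating it as a literal string.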

Answer 2

Answered by Jukka Matilainen

If you want to do it in Bash, you could do something like the following. It uses globbing instead of regexps. (The extglob shell option enables extended pattern matching, so that we can match a line consisting only of asterisks.)

#!/bin/bash
shopt -s extglob
entry=""
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line
do
    case $line in 
        +(\*))
            # do something with $entry here
            entry=""
            ;;
        *)
            entry="$entry$line
"
            ;;
    esac
done
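One way to fill in the "do something with $entry" step is to collect the entries in a bash array. The sketch below assumes that is what is wanted; split_entries and the entries array are hypothetical names, and the input is the sample file from the question.

```shell
#!/bin/bash
shopt -s extglob

# split_entries is a hypothetical helper: it reads stdin and appends each
# asterisk-delimited entry to the global array "entries".
entries=()
split_entries() {
    local line entry=""
    while IFS= read -r line; do
        case $line in
            +(\*))
                # Separator line: save the entry collected so far, if any.
                [[ -n $entry ]] && entries+=("$entry")
                entry=""
                ;;
            *)
                entry+="$line"$'\n'
                ;;
        esac
    done
    # Keep a final entry that is not followed by a separator line.
    [[ -n $entry ]] && entries+=("$entry")
    return 0
}

split_entries <<'EOF'
***********
Field1
***********
Lorem ipsum
Data to match
***********
More data
Still more data
***********
EOF

echo "found ${#entries[@]} entries"
printf 'second entry:\n%s' "${entries[1]}"
```

Each array element keeps its internal newlines, so a whole block can later be used as "${entries[i]}" without being split into words.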

Answer 3

Answered by Brad Gilbert

Try putting double quotes around the command.

#!/bin/bash
for error in "`python example.py | sed -n '/.*/,/^\**$/p'`"
do
    echo -e "$error"
    echo -e "\n"
done
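The effect of the quotes can be checked with a small experiment. In this sketch the function sample is a hypothetical stand-in for the `python example.py | sed ...` pipeline, and the counter names are made up.

```shell
#!/bin/bash
# Hypothetical stand-in for the output of the real pipeline.
sample() { printf 'Lorem ipsum\nData to match\n'; }

unquoted=0
for item in $(sample); do      # unquoted: split on spaces and newlines
    unquoted=$((unquoted + 1))
done

quoted=0
for item in "$(sample)"; do    # quoted: one item holding every line
    quoted=$((quoted + 1))
done

echo "unquoted iterations: $unquoted"
echo "quoted iterations: $quoted"
```

The unquoted loop runs once per word, which is the behaviour described in the question. The quoted loop runs exactly once with the entire output, so it keeps blocks together but does not by itself separate one entry from the next.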

Answer 4

Answered by William Pursell

Splitting records in (ba)sh is not so easy, but can be done using IFS to split on single characters (simply set IFS='*' before your for loop, but this generates multiple empty records and is problematic if any record contains a '*'). The obvious solution is to use perl or awk and use RS to split your records, since those tools provide better mechanisms for splitting records. A hybrid solution is to use perl to do the record splitting, and have perl call your bash function with the record you want. For example:

#!/bin/bash

foo() {
    echo record start:
    echo "$@"
    echo record end
}
export -f foo

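# Single-quote the perl script so bash does not expand $/ or $_ itself, and
# run foo through bash explicitly, since only bash imports exported functions.
perl -e '$/ = "********"; while (<>) { chomp; system("bash", "-c", q{foo "$1"}, "_", $_) }' << 'EOF'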
this is a 2-line
record
********
the 2nd record
is 3 lines
long
********
a 3rd * record
EOF

This gives the following output:

record start:
this is a 2-line
record

record end
record start:

the 2nd record
is 3 lines
long

record end
record start:

a 3rd * record

record end

Answer 5

Answered by ghostdog74

It depends on what you want to do with the variables:

awk '
f && /\*/{print "variable:"s;f=0}
/\*/{ f=1 ;s="";next}
f{
   s=s" "$0
}' file

Output:

# ./test.sh
variable: Field1
variable: Lorem ipsum Data to match
variable: More data Still more data

The above just prints them out. If you want, store them in an array for later use, e.g. array[++d]=s.
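If the records are needed in the shell rather than inside awk, one option is to let awk print one record per line and read those lines into a bash array. This is a sketch under that assumption, not part of the original answer: it needs bash 4 or later for mapfile, and the input variable and entries array are hypothetical names holding the question's sample data.

```shell
#!/bin/bash
# Hypothetical sample data mirroring the question's input file.
input='***********
Field1
***********
Lorem ipsum
Data to match
***********
More data
Still more data
***********'

# awk joins the lines of each block with spaces and prints one block per
# line; mapfile then stores each output line in one array element.
mapfile -t entries < <(
    printf '%s\n' "$input" | awk '
        f && /\*/ { print s; f = 0 }
        /\*/      { f = 1; s = ""; next }
        f         { s = (s == "" ? $0 : s " " $0) }
    '
)

for entry in "${entries[@]}"; do
    echo "variable: $entry"
done
```

Because the lines of a block are joined with spaces, the original line breaks inside each entry are lost; that matches the output shown in the answer.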
