Original question: http://stackoverflow.com/questions/1247069/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not to this page): Stack Overflow
How can I assign the match of my regular expression to a variable?
Asked by samoz
I have a text file with various entries in it. Each entry ends with a line consisting entirely of asterisks.
I'd like to use shell commands to parse this file and assign each entry to a variable. How can I do this?
Here's an example input file:
***********
Field1
***********
Lorem ipsum
Data to match
***********
More data
Still more data
***********
Here is what my solution looks like so far:
#!/bin/bash
for error in `python example.py | sed -n '/.*/,/^\**$/p'`
do
    echo -e $error
    echo -e "\n"
done
However, this just assigns each word in the matched text to $error, rather than a whole block.
Answered by Cascabel
I'm surprised not to see a native bash solution here. Yes, bash has regular expressions. You can find plenty of random documentation online, particularly if you include "bash_rematch" in your query, or just look at the man page. Here's a silly example, taken from here and slightly modified, which prints the whole match and each of the captured groups for a regular expression.
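# $str and $regex are expected to be set before this point
if [[ $str =~ $regex ]]; then
    echo "$str matches"
    echo "matching substring: ${BASH_REMATCH[0]}"
    # index 0 is the whole match, so the capture groups start at 1
    i=1
    n=${#BASH_REMATCH[*]}
    while [[ $i -lt $n ]]
    do
        echo " capture[$i]: ${BASH_REMATCH[$i]}"
        let i++
    done
else
    echo "$str does not match"
fi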
The important bit is that the extended test [[ ... ]], using its regex comparison operator =~, stores the entire match in ${BASH_REMATCH[0]} and the captured groups in ${BASH_REMATCH[i]} for i >= 1.
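To tie this back to the question title, here is a minimal sketch of copying capture groups into ordinary variables. The sample string, the pattern, and the names `line`, `regex`, `key` and `value` are made up for illustration and do not come from the original answer.

```bash
#!/usr/bin/env bash
# Hypothetical sample data; not taken from the original question.
line="Field1: Lorem ipsum"

# Keep the pattern in a variable and leave it unquoted on the right of =~;
# a quoted pattern is matched as a literal string.
regex='^([A-Za-z0-9]+): (.*)$'

if [[ $line =~ $regex ]]; then
    key=${BASH_REMATCH[1]}     # first parenthesised group
    value=${BASH_REMATCH[2]}   # second parenthesised group
    echo "key=$key value=$value"
else
    echo "no match" >&2
fi
```

Copying the groups out straight away is a reasonable habit, because the next successful or failed `=~` test overwrites `BASH_REMATCH`.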
Answered by Jukka Matilainen
If you want to do it in Bash, you could do something like the following. It uses globbing instead of regexps. (The extglob shell option enables extended pattern matching, so that we can match a line consisting only of asterisks.)
#!/bin/bash
shopt -s extglob
entry=""
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line
do
    case $line in
        +(\*))
            # do something with $entry here
            entry=""
            ;;
        *)
            entry="$entry$line
"
            ;;
    esac
done
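One possible way to fill in the "do something with $entry" step is to collect every entry in an array. The sketch below is only an illustration: the function name `read_entries` and the array name `entries` are invented here, and the input is the sample file from the question, fed in through a here-document.

```bash
#!/usr/bin/env bash
shopt -s extglob   # must be enabled before the case pattern below is parsed

# Hypothetical helper: reads stdin and fills the global array "entries",
# one element per block of lines between asterisk-only separator lines.
read_entries() {
    entries=()
    local line entry="" nl=$'\n'
    while IFS= read -r line || [[ -n $line ]]; do
        case $line in
            +(\*))
                # Separator line: store the finished entry, skipping empty ones.
                [[ -n $entry ]] && entries+=("$entry")
                entry=""
                ;;
            *)
                # Join lines with a newline, without a trailing one.
                entry+="${entry:+$nl}$line"
                ;;
        esac
    done
    # Keep a final entry that is not followed by a separator line.
    [[ -n $entry ]] && entries+=("$entry")
    return 0
}

read_entries <<'EOF'
***********
Field1
***********
Lorem ipsum
Data to match
***********
More data
Still more data
***********
EOF

echo "number of entries: ${#entries[@]}"
printf 'entry: %s\n' "${entries[@]}"
```

Feeding the loop from a here-document or a redirection, rather than from a pipe, keeps it in the current shell, so the array is still set after the loop ends.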
Answered by Brad Gilbert
Try putting double quotes around the command.
#!/bin/bash
for error in "`python example.py | sed -n '/.*/,/^\**$/p'`"
do
    # quote $error too, otherwise its newlines are collapsed into spaces
    echo -e "$error"
    echo -e "\n"
done
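A quick way to see what the quotes change is to count loop iterations. The sketch below uses a made-up `produce` function as a stand-in for `python example.py | sed ...`, purely so that it runs anywhere. Note that the quoted form runs the loop body exactly once with all of the output, so it keeps the text together but does not by itself split it into entries.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the real pipeline; prints two lines of sample text.
produce() { printf 'Lorem ipsum\nData to match\n'; }

unquoted=0
for error in $(produce); do        # word splitting: one iteration per word
    unquoted=$((unquoted + 1))
done

quoted=0
for error in "$(produce)"; do      # no splitting: a single iteration
    quoted=$((quoted + 1))
    whole=$error
done

echo "unquoted iterations: $unquoted"
echo "quoted iterations:   $quoted"
printf '%s\n' "$whole"             # quoting the variable keeps the newlines
```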
Answered by William Pursell
Splitting records in (ba)sh is not so easy, but can be done using IFS to split on single characters (simply set IFS='*' before your for loop, but this generates multiple empty records and is problematic if any record contains a '*'). The obvious solution is to use perl or awk and use RS to split your records, since those tools provide better mechanisms for splitting records. A hybrid solution is to use perl to do the record splitting, and have perl call your bash function with the record you want. For example:
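#!/bin/bash
foo() {
    echo record start:
    echo "$@"
    echo record end
}
export -f foo
# \$_ is escaped so that perl, not bash, expands it. system() runs the
# command through /bin/sh, so the exported function is only found if sh is bash.
perl -e "$/='********'; while(<>){chomp;system( \"foo '\$_'\" )}" << 'EOF'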
this is a 2-line
record
********
the 2nd record
is 3 lines
long
********
a 3rd * record
EOF
This gives the following output:
record start:
this is a 2-line
record
record end
record start:
the 2nd record
is 3 lines
long
record end
record start:
a 3rd * record
record end
Answered by ghostdog74
Depending on what you want to do with the variables:
awk '
f && /\*/{print "variable:"s;f=0}
/\*/{ f=1 ;s="";next}
f{
    s=s" "$0
}' file
Output:
# ./test.sh
variable: Field1
variable: Lorem ipsum Data to match
variable: More data Still more data
The above just prints them out. If you want, store them in an array for later use, e.g. array[++d]=s
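If the entries are needed on the shell side rather than inside awk, one option is to read awk's output into a bash array, one line per entry. This is only a sketch: the temporary file and the names `file`, `entries` and `entry` are assumptions made to keep the example self-contained, and the awk patterns are the ones from the answer with the joining step adjusted to avoid a leading space.

```bash
#!/usr/bin/env bash
# Hypothetical sample file, created only so the sketch is self-contained.
file=$(mktemp)
cat > "$file" <<'EOF'
***********
Field1
***********
Lorem ipsum
Data to match
***********
More data
Still more data
***********
EOF

entries=()
while IFS= read -r entry; do
    entries+=("$entry")
done < <(awk '
    f && /\*/ { print s; f = 0 }                    # separator after an entry: emit it
    /\*/      { f = 1; s = ""; next }               # separator: start a new entry
    f         { s = (s == "" ? $0 : s " " $0) }     # join the lines with spaces
' "$file")
rm -f "$file"

echo "number of entries: ${#entries[@]}"
echo "second entry: ${entries[1]}"
```

As in the answer, any line containing an asterisk is treated as a separator, and the lines of an entry are flattened into one line; an entry that must keep its line breaks needs a different output delimiter.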

