Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/11416134/

Time: 2020-09-18 02:43:06  Source: igfitidea

Split 1 argument into 2 arguments using regexp in a bash script

Tags: regex, bash

Asked by synaptik

Here's my situation. Currently, I have a script that accepts two arguments: book name and chapter name. For example:

$ myscript book1 chap1

Now, for reasons that would take a long time to explain, I would prefer my script to be able to take a single argument of the following format: {book name}.{chapter name}. For example:

$ myscript book1.chap1

The difficulty for me is that I do not know how to take a string $1=abc.xyz and turn it into two separate variables, $var1=abc and $var2=xyz. How can I do this?

Answered by smocking

If there are just two parts, you can use bash parameter expansion:

arg=$1
beforedot=${arg%.*}
afterdot=${arg#*.}

It's faster than cut because it's a shell builtin. Note that this puts everything before the last dot into beforedot and everything after the first dot into afterdot, so the two values overlap if the argument contains more than one dot.
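
The difference between the shortest-match operators (% and #) and the longest-match operators (%% and ##) only shows up when the argument contains more than one dot. A minimal sketch, assuming a hypothetical three-part argument (the variable names are illustrative, not from the original answer):

```shell
arg=book1.part2.chap3     # hypothetical argument with two dots

before_last=${arg%.*}     # remove shortest suffix matching ".*"  -> book1.part2
before_first=${arg%%.*}   # remove longest suffix matching ".*"   -> book1
after_first=${arg#*.}     # remove shortest prefix matching "*."  -> part2.chap3
after_last=${arg##*.}     # remove longest prefix matching "*."   -> chap3

printf '%s\n' "$before_last" "$before_first" "$after_first" "$after_last"
```

Pairing %% with # splits on the first dot, and pairing % with ## splits on the last dot; either pairing avoids the overlap.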

EDIT:

There's also a substitution/reinterpretation construct if you want to split by an arbitrary number of tokens:

string=a.b.c.d.e
tokens=(${string//\./ })

You're replacing dots by spaces and then that gets interpreted as an array declaration+definition because of the parentheses around it.

However, I've found this to be less portable to bash's siblings and offspring. For example, it doesn't work in my favourite shell, zsh.

Arrays need to be dereferenced with braces and are indexed from 0:

echo "Third token: ${tokens[2]}"

You can loop through them as well by dereferencing the whole array with [@]:

for i in "${tokens[@]}"
do
    echo "$i"  # do stuff with each token
done

Answered by Paused until further notice.

For completeness and since you asked about a regex method:

pattern='^([^.]*)\.(.*)'
[[ $1 =~ $pattern ]]
book=${BASH_REMATCH[1]}
chapter=${BASH_REMATCH[2]}

The capture groups are elements in the BASH_REMATCH array. Element 0 contains the whole match.

This regex captures everything up to the first dot in the first group. Anything after the first dot, including subsequent dots, goes into the second group. The regex can easily be modified to break on the last dot instead if needed.
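
As a rough sketch of both variants, the two helper functions below (split_first and split_last are hypothetical names, not part of the original answer) wrap the regex match and return a non-zero status when the argument contains no dot. The last-dot pattern relies on the greedy (.*) consuming as much as possible before the final dot.

```shell
#!/bin/bash

# Hypothetical helper: split on the FIRST dot; sets the globals book and chapter.
split_first() {
    local re='^([^.]*)\.(.*)$'
    [[ $1 =~ $re ]] || return 1
    book=${BASH_REMATCH[1]}
    chapter=${BASH_REMATCH[2]}
}

# Hypothetical helper: split on the LAST dot; sets the globals book and chapter.
split_last() {
    local re='^(.*)\.([^.]*)$'
    [[ $1 =~ $re ]] || return 1
    book=${BASH_REMATCH[1]}
    chapter=${BASH_REMATCH[2]}
}

split_first a.b.c && printf 'first dot: %s | %s\n' "$book" "$chapter"
split_last a.b.c && printf 'last dot: %s | %s\n' "$book" "$chapter"

# An argument without a dot does not match, so the helper reports failure.
if ! split_first nodot; then
    echo "no dot found in argument" >&2
fi
```

Checking the return status of [[ ... =~ ... ]] matters because BASH_REMATCH is reset on every match attempt, so a failed match leaves no usable capture groups.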

Answered by Brian Agnew

If $arg contains book.chap

read BOOK CHAP<<<$(IFS="."; echo $arg)

will set the variables BOOK and CHAP accordingly. This uses the bash internal field separator (IFS) which controls how bash understands word boundaries. If (say) you have multiple separators in your original $arg then just specify further variables to contain the results.

From here:

$IFS defaults to whitespace (space, tab, and newline), but may be changed, for example, to parse a comma-separated data file
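
A variation on the same idea, sketched here as an alternative rather than as the answer's own method, is to set IFS only for the read command itself, which avoids the command substitution and leaves the global IFS untouched. The variable names and sample values are illustrative.

```shell
#!/bin/bash

arg=book1.chap1                       # hypothetical two-part argument
IFS=. read -r book chap <<< "$arg"    # IFS applies to this read only
printf 'book: %s\nchapter: %s\n' "$book" "$chap"

arg=lib.book1.chap1                   # hypothetical three-part argument
IFS=. read -r shelf book chap <<< "$arg"
printf 'shelf: %s\nbook: %s\nchapter: %s\n' "$shelf" "$book" "$chap"
```

If there are more fields than variables, the last variable receives the remainder of the line, dots included.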

Answered by Todd A. Jacobs

Pattern Substitution with Shell Parameter Expansion

There are a lot of ways to accomplish what you're trying to do. One of the ways not covered in other answers is pattern substitution.

If you know that the value will always split correctly on a period, you can apply pattern substitution to the value so that it will be easy to tokenize with IFS. For example:

set -- foo.bar
myvar="${1/./ }"
echo $myvar

This will yield foo bar.
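
To turn the substituted value into two separate variables, one option is to let the shell word-split it back into positional parameters. This is a minimal sketch, assuming the input contains a single dot and no whitespace; the names arg, book and chapter are illustrative.

```shell
#!/bin/bash

arg=foo.bar          # hypothetical input

set -f               # disable globbing so * or ? in the input stay literal
set -- ${arg/./ }    # unquoted on purpose: word splitting yields two words
set +f               # re-enable globbing

book=$1
chapter=$2
printf 'book: %s\nchapter: %s\n' "$book" "$chapter"
```

Note that set -- replaces the script's existing positional parameters, so any other arguments should be saved beforehand.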

Answered by Palladium

You can use parentheses to capture the two parts; afterwards, you can use backreferences to grab them again. The syntax differs between languages; check http://www.regular-expressions.info/brackets.html for a lesson on backreferences in general.

Answered by Thedward

#!/bin/bash

book=${1%.*}
chapter=${1#*.}

printf 'book: %s\nchapter: %s\n' "$book" "$chapter"