Using a regular expression in a bash script to split 1 argument into 2 arguments

Question

Asked by synaptik

Here's my situation. Currently, I have a script that accepts two arguments: book name and chapter name. For example:

$ myscript book1 chap1

Now, for reasons that would take a long time to explain, I would prefer my script to be able to take a single argument of the following format: {book name}.{chapter name}. For example:

$ myscript book1.chap1

The difficulty for me is that I do not know how to take a string $1=abc.xyz and turn it into two separate variables, $var1=abc and $var2=xyz. How can I do this?

Answer 1

Answered by smocking

If it's just two parts, you can use bash parameter expansion:

arg=$1
beforedot=${arg%.*}
afterdot=${arg#*.}

It's faster than cut because it's a shell builtin. Note that beforedot receives everything before the last dot and afterdot receives everything after the first dot; when the argument contains exactly one dot, those are simply the two halves.
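
To make the first-dot versus last-dot behaviour concrete, here is a small sketch with an input that contains two dots. The sample value book1.part2.chap3 and the variable names are made up for illustration. The single operators (% and #) remove the shortest match, while the doubled ones (%% and ##) remove the longest.

```shell
#!/bin/bash
# Hypothetical input with two dots, chosen only to show the difference.
arg='book1.part2.chap3'

before_last=${arg%.*}     # shortest suffix matching ".*" removed
before_first=${arg%%.*}   # longest suffix matching ".*" removed
after_first=${arg#*.}     # shortest prefix matching "*." removed
after_last=${arg##*.}     # longest prefix matching "*." removed

printf '%s\n' "$before_last" "$before_first" "$after_first" "$after_last"
# prints:
#   book1.part2
#   book1
#   part2.chap3
#   chap3
```

For the book.chapter case with a single dot, any consistent pair works; the choice only matters once extra dots can appear.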

EDIT:

There's also a substitution/reinterpretation construct if you want to split by an arbitrary number of tokens:

string=a.b.c.d.e
tokens=(${string//\./ })

You're replacing dots by spaces and then that gets interpreted as an array declaration+definition because of the parentheses around it.

However, I've found this to be less portable to bash's siblings and offspring. For example, it doesn't work in my favourite shell, zsh, which does not word-split unquoted expansions by default.

Arrays need to be dereferenced with braces and are indexed from 0:

echo "Third token: ${tokens[2]}"

You can loop through them as well by dereferencing the whole array with [@]:

for i in "${tokens[@]}"
do
    echo "$i"  # do stuff with each token
done
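
The array steps above can be put together as follows. Note that this sketch swaps the ${string//\./ } substitution for IFS=. read -ra, which is an alternative technique rather than the one shown in the answer; it keeps tokens that contain spaces or glob characters intact. It is still bash-specific, and the variable name token is only illustrative.

```shell
#!/bin/bash
string='a.b.c.d.e'

# Split on dots into an array; IFS is changed only for this read command.
IFS=. read -ra tokens <<< "$string"

echo "Number of tokens: ${#tokens[@]}"
echo "Third token: ${tokens[2]}"

# Quote the expansion so each element stays a single word.
for token in "${tokens[@]}"; do
    echo "token: $token"
done
```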

Answer 2

Answered by Paused until further notice.

For completeness and since you asked about a regex method:

pattern='^([^.]*)\.(.*)'
[[ $1 =~ $pattern ]]
book=${BASH_REMATCH[1]}
chapter=${BASH_REMATCH[2]}

The capture groups are elements in the BASH_REMATCH array. Element 0 contains the whole match.

This regex captures everything up to the first dot in the first group. Anything after the first dot, including subsequent dots, goes into the second group. The regex can easily be modified to break on the last dot if needed.
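
One possible way to break on the last dot instead is sketched below. The pattern is an assumption about what such a modification could look like, not something taken from the answer: a greedy (.*) for the first group and [^.]* for the second. The sample input is hypothetical, and the if guards against arguments that contain no dot at all.

```shell
#!/bin/bash
# Hypothetical input with two dots.
input='book1.part2.chap3'

# The greedy (.*) takes as much as it can, so the split lands on the last dot.
pattern='^(.*)\.([^.]*)$'

if [[ $input =~ $pattern ]]; then
    book=${BASH_REMATCH[1]}
    chapter=${BASH_REMATCH[2]}
    printf 'book: %s\nchapter: %s\n' "$book" "$chapter"
else
    echo "no dot found in: $input" >&2
fi
```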

Answer 3

Answered by Brian Agnew

If $arg contains book.chap, then

IFS=. read -r BOOK CHAP <<< "$arg"

will set the variables BOOK and CHAP accordingly. This uses the bash internal field separator (IFS), which controls how bash recognizes word boundaries. If (say) you have multiple separators in your original $arg, just specify further variables to contain the results.

From here:

$IFS defaults to whitespace (space, tab, and newline), but may be changed, for example, to parse a comma-separated data file.
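
As a rough illustration of the "multiple separators" case, the sketch below reads a hypothetical three-part argument. It sets IFS for the read command only, using a here-string, so the shell's global IFS is left untouched. The variable names are illustrative.

```shell
#!/bin/bash
# Hypothetical three-part argument.
arg='book1.part2.chap3'

# One variable per field.
IFS=. read -r book part chap <<< "$arg"
printf '%s | %s | %s\n' "$book" "$part" "$chap"

# With fewer variables than fields, the last variable takes the remainder.
IFS=. read -r first rest <<< "$arg"
printf '%s | %s\n' "$first" "$rest"
```

The second read shows a detail worth knowing: leftover fields are not discarded but land, dots included, in the final variable.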

Answer 4

Answered by Todd A. Jacobs

Pattern Substitution with Shell Parameter Expansion

There are a lot of ways to accomplish what you're trying to do. One of the ways not covered in other answers is pattern substitution.

If you know that the value will always split correctly on a period, you can apply pattern substitution to the value so that it will be easy to tokenize with IFS. For example:

set -- foo.bar
myvar="${1/./ }"
echo "$myvar"

This prints foo bar.
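
To finish the tokenizing step that the substitution prepares for, something like the following could be used. It is only a sketch: the names first and second are made up, and it assumes neither part contains whitespace, since the split relies on the default IFS.

```shell
#!/bin/bash
set -- foo.bar            # simulate calling the script with foo.bar

myvar="${1/./ }"          # replace the first dot with a space

# read splits on the default IFS (space, tab, newline).
read -r first second <<< "$myvar"
printf 'first: %s\nsecond: %s\n' "$first" "$second"
```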

Answer 5

Answered by Palladium

You can use parentheses to capture the two parts; afterwards, you can use backreferences to grab them again. The syntax differs between languages; check http://www.regular-expressions.info/brackets.html for a lesson on backreferences in general.

Answer 6

Answered by Thedward

#!/bin/bash

book=${1%.*}
chapter=${1#*.}

printf 'book: %s\nchapter: %s\n' "$book" "$chapter"
