How to match until the last occurrence of a character in bash shell

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/32084533/

Time: 2020-09-18 13:27:26  Source: igfitidea

regex bash shell grep

Asked by mo0206

I am using curl and cut on output like the one below.

var=$(curl https://avc.com/actuator/info | tr '"' '\n' | grep - | head -n1 | cut -d'-' -f -1, -3)

The variable var gets one of two kinds of values (one at a time).

HIX_MAIN-7ae526629f6939f717165c526dad3b7f0819d85b
HIX-R1-1-3b5126629f67892110165c524gbc5d5g1808c9b5

I am actually trying to get everything up to the last '-', i.e. HIX_MAIN or HIX-R1-1.

The command shown works fine to get HIX-R1-1.

But I figured this is the wrong approach when the value contains only one '-': it gives me the entire value (e.g. HIX_MAIN-7ae526629f6939f717165c526dad3b7f0819d85b).

How do I go about getting everything up to the last '-' into the variable var?

Answered by John1024

This removes everything from the last - to the end of the line:

sed 's/\(.*\)-.*/\1/'

As examples:

$ echo HIX_MAIN-7ae52 | sed 's/\(.*\)-.*/\1/'
HIX_MAIN
$ echo HIX-R1-1-3b5126629f67 | sed 's/\(.*\)-.*/\1/'
HIX-R1-1

How it works

The sed substitute command has the form s/old/new/ where old is a regular expression. In this case, the regex is \(.*\)-.*. This works because \(.*\)- is greedy: it will match everything up to the last -. Because of the escaped parens, \(...\), everything before the last - is saved in group 1, which we can refer to as \1. The final .* matches everything after the last -. Thus, as long as the line contains a -, this regex matches the whole line and the substitute command replaces the whole line with \1.
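As a minimal sketch of the greedy behaviour described above, the snippet below wraps the substitution in a helper and runs it on the two sample values from the question, plus a value with no dash. The helper name strip_last_field and the 'nodash' input are hypothetical, chosen only for illustration. When the line contains no -, the regex does not match and sed prints the line unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer): drop the last '-' and everything after it.
strip_last_field() {
  # \(.*\) is greedy, so it captures everything before the LAST '-'.
  printf '%s\n' "$1" | sed 's/\(.*\)-.*/\1/'
}

a=$(strip_last_field 'HIX_MAIN-7ae526629f6939f717165c526dad3b7f0819d85b')
b=$(strip_last_field 'HIX-R1-1-3b5126629f67892110165c524gbc5d5g1808c9b5')
c=$(strip_last_field 'nodash')   # no '-': the pattern does not match, line is unchanged

echo "$a"   # HIX_MAIN
echo "$b"   # HIX-R1-1
echo "$c"   # nodash
```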

Answered by Jeff Bowman

You can use bash string manipulation:

$ foo=a-b-c-def-ghi
$ echo "${foo%-*}"
a-b-c-def

The operators # and % are on either side of $ on a QWERTY keyboard, which helps to remember how they modify the variable:

  • #pattern trims off the shortest prefix matching "pattern".
  • ##pattern trims off the longest prefix matching "pattern".
  • %pattern trims off the shortest suffix matching "pattern".
  • %%pattern trims off the longest suffix matching "pattern".

where pattern follows the bash pattern matching rules, including ? (one character) and * (zero or more characters).

Here, we're trimming off the shortest suffix matching the pattern -*, so ${foo%-*} will get you what you want.
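To make the difference between the four operators concrete, here is a small sketch that applies each one to the same foo value used above. The variable names on the left are hypothetical, picked only to label each result. The patterns *- (for prefixes) and -* (for suffixes) are used so that each expansion cuts at a dash.

```shell
#!/usr/bin/env bash
foo=a-b-c-def-ghi

shortest_prefix=${foo#*-}    # removes "a-"            -> b-c-def-ghi
longest_prefix=${foo##*-}    # removes "a-b-c-def-"    -> ghi
shortest_suffix=${foo%-*}    # removes "-ghi"          -> a-b-c-def
longest_suffix=${foo%%-*}    # removes "-b-c-def-ghi"  -> a

printf '%s\n' "$shortest_prefix" "$longest_prefix" "$shortest_suffix" "$longest_suffix"
```

Only ${foo%-*} keeps everything up to the last dash, which is why it fits the question.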

Of course, there are many ways to do this using awk or sed, possibly reusing the sed command you're already running. Variable manipulation, however, can be done natively in bash without launching another process.

Answered by higuaro

You can reverse the string with rev, cut from the second field onward, and then rev again:

rev <<< "$VARIABLE" | cut -d"-" -f2- | rev

For HIX-R1-1----3b5126629f67892110165c524gbc5d5g1808c9b5, this prints:

HIX-R1-1---
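A short sketch of the same rev/cut/rev pipeline applied to the two sample values from the question may help show why it works: after reversing, the part following the last dash becomes field 1, so -f2- drops exactly that part. The variable names are hypothetical, and the sketch assumes rev (from util-linux on most Linux systems) is installed.

```shell
#!/usr/bin/env bash
# Sample values taken from the question.
v1='HIX_MAIN-7ae526629f6939f717165c526dad3b7f0819d85b'
v2='HIX-R1-1-3b5126629f67892110165c524gbc5d5g1808c9b5'

# Reverse, drop the first '-'-delimited field, reverse back.
r1=$(printf '%s\n' "$v1" | rev | cut -d'-' -f2- | rev)
r2=$(printf '%s\n' "$v2" | rev | cut -d'-' -f2- | rev)

echo "$r1"   # HIX_MAIN
echo "$r2"   # HIX-R1-1
```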

Answered by Jonathan Leffler

I think you should be using sed, at least after the tr:

var=$(curl https://avc.com/actuator/info | tr '"' '\n' | sed -n '/-/{s/-[^-]*$//;p;q}')

The -n means "don't print by default". The /-/ looks for a line containing a dash; it then executes s/-[^-]*$// to delete the last dash and everything after it, followed by p to print and q to quit (so it only prints the first such line).
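The pipeline can be tried without network access by feeding it a stand-in for the curl output. The JSON text below is purely hypothetical (the real response from the endpoint is not shown in the question), and the sketch assumes GNU sed, which accepts the closing } directly after q.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the curl output; the real payload is unknown.
fake_response='{"build":{"name":"app","version":"HIX-R1-1-3b5126629f67"}}'

# tr puts each quoted token on its own line; sed handles the first line
# containing a dash: strip the last '-' and what follows, print, then quit.
var=$(printf '%s\n' "$fake_response" | tr '"' '\n' | sed -n '/-/{s/-[^-]*$//;p;q}')

echo "$var"   # HIX-R1-1
```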



I'm assuming that the output from curl intrinsically contains multiple lines, some of them with unwanted double quotes in them, and that you need to match only the first line that contains a dash at all (which might very well not be the first line). Once you've whittled the input down to the sole interesting line, you could use pure shell techniques to get the desired result, but getting the sole interesting line is not as trivial as some of the answers seem to assume.
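For comparison, one possible pure-shell way to do both steps (pick the first line containing a dash, then trim it) is sketched below. The function name first_dash_prefix and the sample input lines are hypothetical. It is only an illustration of the "pure shell techniques" mentioned above, not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read lines from stdin, and for the first one that
# contains a '-', print everything up to (not including) its last '-'.
first_dash_prefix() {
  local line
  while IFS= read -r line; do
    case $line in
      *-*) printf '%s\n' "${line%-*}"; return 0 ;;
    esac
  done
  return 1   # no line contained a dash
}

# Made-up input: the first two lines have no dash and are skipped.
result=$(printf '%s\n' 'status' 'UP' 'HIX_MAIN-7ae52' 'HIX-R1-1-3b51' | first_dash_prefix)

echo "$result"   # HIX_MAIN
```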