Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me), citing the original StackOverflow address: http://stackoverflow.com/questions/15408327/

Date: 2020-09-18 04:51:19  Source: igfitidea

Bash Shell - Return substring after second occurrence of certain character

Tags: linux, bash, sed, substring

Asked by Leo

I need to return everything after a delimiter of my choosing, but I still don't fully know how to use sed. What I need to do is:

$ echo "ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,," \
  | sed <some regexp>

For this example, the result should be the substring consisting of everything after the second comma:

123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,

I can do this with cut like this: echo "ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,," | cut -d',' -f 3-

but I've been told cut is slower than sed...

Could some guru who has a few minutes to spare (and wants to... :) ) please advise me? Thanks! Leo

Accepted answer by Thor

In my experience, cut is always faster than sed.

To do what you want with sed, you could use a repeated group:

echo 'ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,' |
  sed -r 's/([^,]*,){2}//'

This removes the first two fields (provided the fields do not themselves contain commas) by matching a run of non-comma characters [^,]* followed by a comma, twice {2}, and replacing the match with nothing.

Output:

123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,
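
The same substitution can be wrapped in a small function that takes the occurrence count and the delimiter as arguments. This is only a sketch: the name after_nth is hypothetical, -E is used as the more portable spelling of GNU sed's -r, and the delimiter is assumed to be a single character with no special meaning to sed (so not /, ], ^ or a backslash).

```shell
# Hypothetical helper: print everything after the n-th occurrence of a delimiter.
# Usage: after_nth <n> <delimiter> <string>
# Assumes the delimiter is one character that is not special inside a sed
# bracket expression or as the s/// separator.
after_nth() {
  # Remove n leading "non-delimiters followed by a delimiter" groups.
  printf '%s\n' "$3" | sed -E "s/^([^$2]*$2){$1}//"
}

after_nth 2 , 'ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'
```

If the string contains fewer than n delimiters, the pattern does not match and the input is printed unchanged.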

Answer by chepner

You could also try doing the extraction in bash, without spawning an external process at all:

$ [[ 'ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,' =~ [^,]*,[^,]*,(.*) ]]
$ echo "${BASH_REMATCH[1]}"
123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,

or

$ shopt -s extglob  # the +(...) pattern requires extended globbing
$ FOO='ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'
$ echo "${FOO/+([^,]),+([^,]),}"

or

$ IFS=, read -r -a FOO <<< 'ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'
$ (IFS=,; echo "${FOO[*]:2}")  # rejoin with commas; trailing empty fields may not survive the split

(Assuming this is for a one-off match, not iterating over the contents of a file.)
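
A further variant, not part of the answer above, uses only the standard prefix-removal expansion, so it needs neither extglob nor arrays. It is a minimal sketch; the variable names s and rest are placeholders.

```shell
s='ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'

rest=${s#*,}      # drop the shortest prefix ending in a comma (the first field)
rest=${rest#*,}   # drop the next field the same way

printf '%s\n' "$rest"
```

Each expansion leaves the string unchanged when no comma is left, so an input with fewer than two commas is not truncated any further.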

Answer by Vijay Nirmal

This method finds the index of the second occurrence of a character and then uses bash substring expansion to get the required result:

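input="ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,"
# Byte offset of the second ',' plus 1; "$input" is quoted so repeated spaces do not shift the offset
index=$(($(echo "$input" | grep -aob ',' | grep -oE '[0-9]+' | awk 'NR==2') + 1))
result=${input:$index}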
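
The offset can also be derived without grep or awk, by comparing string lengths. This is a sketch under the assumption that the delimiter is a comma and occurs at least twice; the variable name tail is a placeholder.

```shell
input='ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'

tail=${input#*,*,}                  # everything after the second comma
index=$(( ${#input} - ${#tail} ))   # offset at which that tail starts

printf '%s\n' "$index" "$tail"
```

In bash, ${input:$index} then yields the same string as tail, so the index is only needed if the position itself is of interest.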