bash: How do I use sed to extract multiple text and number fields from a string?

Question

Asked by Mikey R

How can I extract three or more separate pieces of text from a line using sed?

I have the following line:

echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>'

So far I am able to extract 'DOB-029' by running:

sed -n 's/.*\(DOB-[0-9]*\).*/\1/p'

but I am not getting the other fields, such as the name or the post.

My expected output is: Mike DOB-029 Post-555

Edited

Say I have a list in a file, and I want to extract specific text/IDs from every line of the list and save the result to a .txt file.

Answer 1

Answered by ShellFish

sed 's/.*\[\(.*\).\(DOB-[0-9]*\).\(Post-[0-9]*\).*/\1 \2 \3/' should do the trick!

Parts between \( and \) are captured strings that can be referenced in the replacement as \i, where i is the index of the group.
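To make the group numbering concrete, here is a minimal sketch. The helper name reorder_fields is hypothetical and not part of the answer. It captures the same three fields but prints them in reverse order, which shows that \1, \2 and \3 are assigned by the position of each opening \( in the pattern, not by where they appear in the replacement.

```shell
#!/usr/bin/env bash
# reorder_fields is a hypothetical helper used only for illustration.
# Groups are numbered left to right by their opening \( in the pattern,
# so the replacement is free to emit them in any order.
reorder_fields() {
    sed 's/.*\[\(.*\)\/\(DOB-[0-9]*\)\/\(Post-[0-9]*\).*/\3 \2 \1/'
}

echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | reorder_fields
```

For the sample line this should print the post first, then the DOB, then the name.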

Script for custom use:


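#!/bin/bash

# Which fields to print: 1 = name, 2 = DOB, 3 = post (default: all three)
fields=${1:-123}
file='/path/to/input'

# \1 in each replacement is the text captured by \( ... \)
name=$(sed 's/.*\[\([^\/]*\)\/.*/\1/' "$file")
dob=$(sed 's/.*\(DOB-[0-9]*\).*/\1/' "$file")
post=$(sed 's/.*\(Post-[0-9]*\).*/\1/' "$file")

[[ $fields =~ .*1.* ]] && output=$name
[[ $fields =~ .*2.* ]] && output="$output $dob"
[[ $fields =~ .*3.* ]] && output="$output $post"

# Left unquoted on purpose: word splitting drops the leading space
# when field 1 is not selected
echo $output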

Set the file variable to the path of the file that contains the line you want to parse (I can add more functionality, such as supplying the file as an argument or reading from a larger file, if you like). Then run the script with an integer argument: if it contains 1 the name is displayed, 2 displays the DOB, and 3 displays the post information. You can combine the digits, e.g. 123 or 32 or whichever combination you like; the output order is always name, DOB, post, regardless of the order of the digits.
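For the case raised in the question's edit (a whole list in a file, with the result saved to a .txt file), something along the following lines might work. This is only a sketch under assumptions: the helper name extract_ids, the file names input.txt and ids.txt, and the second sample record are all made up for the demonstration, and every record is assumed to follow the same [name/DOB-n/Post-n/...] layout.

```shell
#!/usr/bin/env bash
# extract_ids is a hypothetical helper: it reads a list file (one record per
# line) and prints "name DOB post" for every line that matches the layout.
extract_ids() {
    # -n together with the p flag prints only the lines where the
    # substitution succeeded, so non-matching lines are skipped.
    sed -n 's/.*\[\(.*\)\/\(DOB-[0-9]*\)\/\(Post-[0-9]*\).*/\1 \2 \3/p' "$1"
}

# Demo with a throwaway input file; file names and the second record
# are placeholders invented for this example.
tmpdir=$(mktemp -d)
printf '%s\n' \
    '<MX><[Mike/DOB-029/Post-555/Male]><MX>' \
    'a line without any record' \
    '<MX><[Anna/DOB-113/Post-042/Female]><MX>' > "$tmpdir/input.txt"

# Save the extracted fields to a .txt file, then show the result.
extract_ids "$tmpdir/input.txt" > "$tmpdir/ids.txt"
cat "$tmpdir/ids.txt"
```

Using a single sed call with three groups keeps the fields of each record on one output line, whereas running three separate sed commands over a multi-line file would produce three separate columns of output.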

Stdin

If you want to read from stdin, use the following script:

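#!/usr/bin/env bash

# Read the whole of standard input
line=$(cat)

# Which fields to print: 1 = name, 2 = DOB, 3 = post (default: all three)
fields=${1:-123}

# \1 in each replacement is the text captured by \( ... \)
name=$(echo "$line" | sed 's/.*\[\([^\/]*\)\/.*/\1/')
dob=$(echo "$line" | sed 's/.*\(DOB-[0-9]*\).*/\1/')
post=$(echo "$line" | sed 's/.*\(Post-[0-9]*\).*/\1/')

[[ $fields =~ .*1.* ]] && output=$name
[[ $fields =~ .*2.* ]] && output="$output $dob"
[[ $fields =~ .*3.* ]] && output="$output $post"

# Left unquoted on purpose: word splitting drops the leading space
# when field 1 is not selected
echo $output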

Example usage:


$ chmod +x script.sh
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 123
Mike DOB-029 Post-555
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 12
Mike DOB-029
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 32
DOB-029 Post-555
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 
Mike DOB-029 Post-555

Answer 2

Answered by Arjun Mathew Dan

A solution with awk:

echo "<MX><[Mike/DOB-029/Post-555/Male]><MX>" | awk -F'[/[]' '{print $2, $3, $4}'

We set the field delimiter to either / or [ (-F'[/[]'), then print $2, $3 and $4, which are the 2nd, 3rd and 4th fields respectively.
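To check how awk splits the line with that separator, a small sketch like the one below lists every field with its index. The helper name show_fields is hypothetical; it is just a quick way to confirm that the name, DOB and post end up in fields 2 to 4, while field 1 holds the leading <MX>< and the last field holds the trailing text.

```shell
#!/usr/bin/env bash
# show_fields is a hypothetical helper that lists every field awk produces
# when the separator is the bracket expression [/[]
# (that is, either a slash or an opening square bracket).
show_fields() {
    awk -F'[/[]' '{ for (i = 1; i <= NF; i++) printf "%d: %s\n", i, $i }'
}

echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | show_fields
```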

With sed:

echo "<MX><[Mike/DOB-029/Post-555/Male]><MX>" | sed 's/\(^.*\[\)\(.*\)\(\/[^/]*$\)/\2/; s/\// /g'

Answer 3

Answered by Marc Bredt

Use bash's built-in parameter expansion:

line="<MX><[Mike/DOB-029/Post-555/Male]><MX>"
linel=${line/*\[/}; liner=${linel%\/*}; echo ${liner//\// }
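If the fields are needed in separate variables rather than as one string, the same idea can be extended with further expansions. The following is a sketch only: the function split_record and its variable names are hypothetical, and it assumes exactly one [...] group per line with at least three /-separated fields inside it.

```shell
#!/usr/bin/env bash
# split_record is a hypothetical helper that uses only parameter expansion.
split_record() {
    local inner name dob post rest
    inner=${1#*\[}        # drop everything up to and including the first "["
    inner=${inner%%\]*}   # drop everything from the first "]" onwards
    name=${inner%%/*}     # text before the first "/"
    rest=${inner#*/}      # remainder after the first "/"
    dob=${rest%%/*}
    rest=${rest#*/}
    post=${rest%%/*}
    echo "$name $dob $post"
}

split_record '<MX><[Mike/DOB-029/Post-555/Male]><MX>'
```

Here # and % strip the shortest matching prefix and suffix, while ## and %% strip the longest, so no external process such as sed or awk is started.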
