bash: How to find all words that appear between parentheses?

Question

Asked by Village

I have a file containing some words in parentheses. I'd like to compile a list of all of the unique words appearing there, e.g.:

This is some (text).
This (text) has some (words) in parenthesis.
Sometimes, there are numbers, such as (123) in parenthesis too.

This would be the resulting list:

text
words
123

How can I list all of the items appearing between parentheses?

Answer 1

Answered by Steve

You can use awk like this:

awk -F "[()]" '{ for (i=2; i<NF; i+=2) print $i }' file.txt

prints:

text
text
words
123
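
As an illustration of why the loop starts at field 2 and advances by 2, the sketch below prints every field awk sees for one sample line when the separator is [()]. The helper name show_fields is made up for this example.

```shell
# Hypothetical helper: number and print each field awk produces
# when both "(" and ")" act as field separators.
show_fields() {
    printf '%s\n' "$1" |
        awk -F '[()]' '{ for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i }'
}

line='This (text) has some (words) in parenthesis.'
show_fields "$line"
```

Text outside the parentheses lands in the odd-numbered fields and text inside them in the even-numbered ones, which is why the loop visits fields 2, 4, and so on. The i<NF bound also skips a trailing field left behind by an unclosed "(".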

You can use an array to print the unique values:

awk -F "[()]" '{ for (i=2; i<NF; i+=2) if (!seen[$i]++) print $i }' file.txt

prints:

text
words
123

HTH

Answer 2

Answered by glenn jackman

With GNU grep, you can use a perl-compatible regex with look-around assertions to exclude the parens:

grep -Po '(?<=\().*?(?=\))' file.txt | sort -u
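
To try this against the sample text from the question, a minimal sketch follows. It assumes a GNU grep built with PCRE support (the -P option); the function name paren_words is invented for illustration.

```shell
# Hypothetical wrapper around the command above; reads the named files,
# or standard input when no file is given. Requires GNU grep with -P.
paren_words() {
    grep -Po '(?<=\().*?(?=\))' "$@" | sort -u
}

paren_words <<'EOF'
This is some (text).
This (text) has some (words) in parenthesis.
Sometimes, there are numbers, such as (123) in parenthesis too.
EOF
```

The non-greedy .*? stops at the first closing paren, so a line with two parenthesized items yields two separate matches rather than one long match spanning both.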

Answer 3

Answered by mkb

grep -oE '\([[:alnum:]]*\)' file.txt | sed 's/[()]//g' | sort | uniq

-o prints only the matching text
-E means use extended regular expressions
\( means match a literal paren
[[:alnum:]] is the POSIX character class for letters and numbers.

The sed script strips out the parens. This was tested with GNU grep but BSD sed, so be wary.
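
One consequence of [[:alnum:]] is worth seeing in action: it matches only letters and digits, so a parenthesized item containing a space or punctuation is skipped entirely. The sketch below assumes the pattern is written with escaped literal parens, as the notes above describe; the function name alnum_items is made up for this example.

```shell
# Hypothetical helper: extract purely alphanumeric parenthesized items
# from standard input, strip the parens, and de-duplicate.
alnum_items() {
    grep -oE '\([[:alnum:]]*\)' | sed 's/[()]//g' | sort | uniq
}

# "(two words)" contains a space, so the pattern does not match it.
printf '%s\n' 'A (word), a (123), and (two words).' | alnum_items
```

If items with spaces should be kept, a bracket expression such as [^()]* in place of [[:alnum:]]* is one possible substitute.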

Answer 4

Answered by Mark O'Connor

To reproduce your list:

cat file.txt | sed 's/.*(\(.*\)).*/\1/'

To compile a list of unique words, you need to process the list further:

cat file.txt | sed 's/.*(\(.*\)).*/\1/' | sort | uniq
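
One caveat may be worth illustrating. Assuming the substitution is s/.*(\(.*\)).*/\1/, the leading .* is greedy, so only the last parenthesized item on each line survives, and lines without parentheses pass through unchanged. The sketch below is a variation, not the command above: it adds -n and the p flag so that non-matching lines are dropped. The function name last_paren_item is invented for this example.

```shell
# Hypothetical variation: print only the LAST parenthesized item per line,
# and print nothing for lines that contain no parentheses.
last_paren_item() {
    sed -n 's/.*(\(.*\)).*/\1/p' "$@"
}

printf '%s\n' 'This (text) has some (words).' 'No parens here.' | last_paren_item
```

For the sample file in the question this still reproduces the list, because the only line with two items repeats a word that already appears on the first line.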

Answer 5

Answered by Venkat Madhav

You can try this

sed -e 's/(/\n(/g' -e 's/)/\n/g' filename | awk -F'(' 'NF > 1 { print $2 }' | sort -u

Explanation:

The first sed expression moves each opening parenthesis, together with the word that follows it, onto a new line; the second replaces each ')' with a newline. So after running the command below

sed -e 's/(/\n(/g' -e 's/)/\n/g' filename

the output would look like this

This is some 
(text
.
This 
(text
 has some 
(words
 in parenthesis.
Sometimes, there are numbers, such as 
(123
 in parenthesis too.

Now pipe this output to the awk command below, which uses '(' as the field separator and prints the second field of each line that contains one:

awk -F'(' 'NF > 1 { print $2 }'

the output now will be

text
text
words
123

This output is then piped to sort -u, which removes the duplicates and leaves only the unique words. Hope this explanation helps.
