bash: how to find the unique words in a file on Linux

Question

Asked by jan345

I have a big file whose lines look like this: Text numbers etc. [Man-(some numbers)]. Many of these Man-somenumbers words are repeated across several lines, and I want to count only the unique Man- words. I can't run uniq on the whole file, because the text before the Man- word is different on every line. How can I count only the unique Man-somenumbers words in the file?

Answer 1

Answered by Wintermute

If I understand what you want to do correctly, then

grep -oE 'Man-[0-9]+' filename | sort | uniq -c

should do the trick. It works as follows: First

grep -oE 'Man-[0-9]+' filename

isolates all words from the file that match the regular expression Man-[0-9]+. That list is then piped through sort to get the sorted list that uniq requires, and then that sorted list is piped through uniq -c to count how often each unique Man- word appears.
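To make the pipeline concrete, here is a minimal sketch run against a small made-up file. The sample lines and the variable names (sample, counts, distinct) are assumptions for illustration only; the real file's layout is not shown in the question. The sketch also shows one possible variant, sort -u followed by wc -l, in case what is wanted is the number of distinct Man- words rather than how often each one occurs.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; the actual file contents are assumed, not known.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Text 123 etc. [Man-101]
Other text 456 [Man-202]
More text 789 [Man-101]
Yet another line [Man-303]
EOF

# Per-word counts: how many times each distinct Man-<digits> word occurs.
# grep -o prints only the matching part of each line, one match per line.
counts=$(grep -oE 'Man-[0-9]+' "$sample" | sort | uniq -c)
echo "$counts"

# Number of distinct words: sort -u drops duplicates, wc -l counts the rest.
# tr strips the padding that some wc implementations add around the number.
distinct=$(grep -oE 'Man-[0-9]+' "$sample" | sort -u | wc -l | tr -d '[:space:]')
echo "distinct: $distinct"

rm -f "$sample"
```

With this sample, Man-101 occurs twice and the other two words once each, so the second command reports 3 distinct words. Note that -o is an extension offered by GNU and BSD grep rather than part of POSIX.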
