bash: removing duplicates in grep output

Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/49313160/

Time: 2020-09-18 16:52:03  Source: igfitidea

Removing duplicates in grep output

bash grep

Asked by john mas

I have a results file whose lines follow this pattern:

path:pattern found

For example:

./user/home/file1:this is a game

In other words, when I searched for a string, I got the file and the line in which it was found.

The problem is that sometimes I have multiple matches in the same file, so I would like to remove the duplicate file entries (the matched text differs from line to line, so removing exact duplicate lines does not work).

Any help or ideas are appreciated :)

The end result should turn this:

/user/home/desktop/file1:this is a game
/user/home/desktop/file1:what kind of game
/user/home/desktop/file1:fast action game

into just the first result found for each file, without losing the rest of the data in the results file.

Update 1:

So the actual file looks like this:

/user/home/desktop/file1:this is a game
/user/home/desktop/file1:what kind of game
/user/home/desktop/file1:fast action game
/user/home/desktop/file2:a game
/user/home/desktop/file3:of game
/user/home/desktop/file4:fast game

I'm looking to get rid of the multiple matches from the same file, so the result should look like this:

/user/home/desktop/file1:this is a game
/user/home/desktop/file2:a game
/user/home/desktop/file3:of game
/user/home/desktop/file4:fast game

Answered by codeforester

You could use sort -u:

grep pattern files | sort -t: -u -k1,1
  • -t: - use : as the field delimiter
  • -k1,1 - sort on the first field only
  • -u - remove duplicates (based on the first field)

This retains just one line per file, removing any other lines for the same file.
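As a rough illustration, the command can be tried on the sample lines from the question without running grep at all. The helper name dedupe_by_path and the variable names are made up for this sketch. One caveat: GNU sort documents -u as printing the first line of each run of equal keys, while POSIX only promises that all but one line per key is suppressed, so which match survives may depend on the sort implementation. The output is also ordered by path rather than by the original grep order.

```shell
#!/usr/bin/env bash
# dedupe_by_path is a hypothetical helper name, not part of the original answer.
# It reads "path:match" lines on stdin and keeps one line per path.
dedupe_by_path() {
  # -t:    split fields on ':'
  # -k1,1  compare only the first field (the path)
  # -u     emit a single line for each distinct key
  sort -t: -u -k1,1
}

# Sample lines copied from the question.
sample='/user/home/desktop/file1:this is a game
/user/home/desktop/file1:what kind of game
/user/home/desktop/file1:fast action game
/user/home/desktop/file2:a game
/user/home/desktop/file3:of game
/user/home/desktop/file4:fast game'

deduped=$(printf '%s\n' "$sample" | dedupe_by_path)
printf '%s\n' "$deduped"
```

Note that this approach assumes the path itself contains no colon; a path such as C:/dir/file would be split at the wrong place.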

For the three-line example in your question, this is the output you get:

/user/home/desktop/file1:this is a game
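If it matters that the surviving line is always the first match and that the original order of the results is preserved, one alternative (not part of the answer above) is a common awk idiom that remembers which paths it has already seen. The function name first_match_per_file is hypothetical, and the sketch assumes that paths contain no colon.

```shell
#!/usr/bin/env bash
# first_match_per_file is a hypothetical helper name used only for this sketch.
first_match_per_file() {
  # seen[$1] is 0 the first time a path appears, so !seen[$1]++ is true and
  # the line is printed; for later lines with the same path the counter is
  # already positive, the expression is false, and the line is skipped.
  awk -F: '!seen[$1]++'
}

# Sample lines copied from the question.
results='/user/home/desktop/file1:this is a game
/user/home/desktop/file1:what kind of game
/user/home/desktop/file1:fast action game
/user/home/desktop/file2:a game
/user/home/desktop/file3:of game
/user/home/desktop/file4:fast game'

firsts=$(printf '%s\n' "$results" | first_match_per_file)
printf '%s\n' "$firsts"
```

In practice it would be used like the sort version, for example: grep pattern files | awk -F: '!seen[$1]++'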

If you want to keep multiple distinct matches within a file and remove only exact duplicate lines, then:

grep pattern files | sort -u
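A small sketch of the whole-line variant, using made-up input in which one line is repeated exactly. Setting LC_ALL=C is an assumption added here so that the comparison is byte-wise and the ordering does not depend on the current locale.

```shell
#!/usr/bin/env bash
# Hypothetical input adapted from the question: the first line appears twice.
lines='/user/home/desktop/file1:this is a game
/user/home/desktop/file1:what kind of game
/user/home/desktop/file1:this is a game
/user/home/desktop/file2:a game'

# Without -t and -k the whole line is the sort key, so only exact
# duplicate lines collapse; distinct matches in the same file are kept.
distinct=$(printf '%s\n' "$lines" | LC_ALL=C sort -u)
printf '%s\n' "$distinct"
```

Here both distinct matches from file1 remain, and only the repeated line is dropped.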