bash - how to delete duplicate lines in a text file in unix bash?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/18170647/
how to delete duplicate lines in a text file in unix bash?
Asked by t28292
I have a file.txt with multiple lines, and I would like to remove the duplicate lines without sorting the file. What command can I use in unix bash?
sample of file.txt:
orangejuice;orange;juice_apple
pineapplejuice;pineapple;juice_pineapple
orangejuice;orange;juice_apple
sample of output:
orangejuice;orange;juice_apple
pineapplejuice;pineapple;juice_pineapple
Answered by Steve
One way using awk:

awk '!a[$0]++' file.txt
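The a[$0]++ expression counts how many times each whole line ($0) has been seen, and the leading ! makes the pattern true only on the first occurrence, so only first occurrences are printed. As a hedged sketch (the output file name deduped.txt is illustrative, not part of the original answer), you could redirect the result so the original file is left untouched:

# print only the first occurrence of each line, writing the result to a new file
awk '!a[$0]++' file.txt > deduped.txt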
Answered by choroba
You can use Perl for this:
perl -ne 'print unless $seen{$_}++' file.txt

The -n switch makes Perl process the file line by line. Each line ($_) is stored as a key in a hash named "seen", but since ++ happens after returning the value, the line is printed the first time it is met.
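As a hedged aside (not part of the original answer): Perl's -i switch can be added to the same one-liner to overwrite file.txt in place, so be sure that is what you want before running it:

# -i edits file.txt in place; -n still loops over the input line by line
perl -i -ne 'print unless $seen{$_}++' file.txt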