How to delete duplicate lines in a text file in Unix bash?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/18170647/

Time: 2020-09-10 00:02:53  Source: igfitidea

bash

Asked by t28292

I have a file.txt with multiple lines, and I would like to remove duplicate lines without sorting the file. What command can I use in Unix bash?

Sample of file.txt:

orangejuice;orange;juice_apple
pineapplejuice;pineapple;juice_pineapple
orangejuice;orange;juice_apple

Sample of output:

orangejuice;orange;juice_apple
pineapplejuice;pineapple;juice_pineapple

Answered by Steve

One way using awk:

awk '!a[$0]++' file.txt
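
The awk one-liner is terse, so here is a minimal sketch of why it keeps the first copy of each line and preserves the original order. The function name dedupe_lines and the array name seen are illustrative choices, not part of the original answer; the sample input is taken from the question.

```shell
# Hypothetical helper name; wraps the one-liner so it can read files or stdin.
dedupe_lines() {
    # seen[$0] is unset (treated as 0, i.e. false) the first time a line
    # appears, so !seen[$0]++ is true and awk's default action (print) runs.
    # The post-increment then marks the line as seen, so later copies of the
    # same line evaluate to false and are skipped.
    awk '!seen[$0]++' "$@"
}

# Feed the sample from the question through the filter.
printf '%s\n' \
    'orangejuice;orange;juice_apple' \
    'pineapplejuice;pineapple;juice_pineapple' \
    'orangejuice;orange;juice_apple' | dedupe_lines
```

Because no sorting is involved, lines come out in the order of their first occurrence. The command only writes to standard output; file.txt itself is left unchanged.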

Answered by choroba

You can use Perl for this:

perl -ne 'print unless $seen{$_}++' file.txt

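The -n switch makes Perl process the file line by line. Each line ($_) is used as a key in a hash named "seen". Because the postfix ++ increments only after returning the old value, $seen{$_}++ is false the first time a line is met, so the line is printed; on every later occurrence it is true and the line is skipped.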
