Remove duplicate entries using a Bash script

Question

Asked by divz

I want to remove duplicate entries from a text file, e.g.:

kavitha= Tue Feb    20 14:00 19 IST 2012  (duplicate entry) 
sree=Tue Jan  20 14:05 19 IST 2012  
divya = Tue Jan  20 14:20 19 IST 2012  
anusha=Tue Jan 20 14:45 19 IST 2012 
kavitha= Tue Feb    20 14:00 19 IST 2012 (duplicate entry)

Is there any possible way to remove the duplicate entries using a Bash script?

Desired output

kavitha= Tue Feb    20 14:00 19 IST 2012 
sree=Tue Jan  20 14:05 19 IST 2012  
divya = Tue Jan  20 14:20 19 IST 2012  
anusha=Tue Jan 20 14:45 19 IST 2012

Answer 1

Answered by kev

You can sort then uniq:

$ sort -u input.txt

Or use awk:

$ awk '!a[$0]++' input.txt
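
The two commands above differ in the order of their output: `sort -u` returns the lines sorted, while the awk filter keeps the first occurrence of each line in its original position. The following is a minimal sketch of that difference, assuming a small made-up sample written to a temporary file (the names and the file are illustrative, not from the question).

```shell
#!/usr/bin/env bash
# Minimal sketch: compare `sort -u` with the awk filter on a small sample.
# The sample file and its contents are made up for illustration.
tmp=$(mktemp)
printf '%s\n' 'kavitha' 'sree' 'divya' 'kavitha' > "$tmp"

# sort -u: duplicates removed, but lines come back in sorted order.
sorted=$(sort -u "$tmp")

# awk: a[$0]++ yields the count seen so far (0 the first time),
# so the negation is true only for the first occurrence of each line.
ordered=$(awk '!a[$0]++' "$tmp")

printf 'sort -u:\n%s\n\nawk:\n%s\n' "$sorted" "$ordered"
rm -f "$tmp"
```

If the original order of the entries matters, as the desired output in the question suggests, the awk form is probably the better fit.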

Answer 2

Answered by Siva Charan

This sed one-liner deletes duplicate, consecutive lines from a file (emulates "uniq").
The first line in a set of duplicate lines is kept; the rest are deleted.

sed '$!N; /^\(.*\)\n\1$/!P; D'
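
Because a uniq-style sed filter only compares neighbouring lines, duplicates that are separated by other lines (as in the question) are not removed unless the input is sorted first. The sketch below illustrates this; `dedupe_adjacent` is a hypothetical helper name introduced here, and the sample lines are made up.

```shell
#!/usr/bin/env bash
# Sketch: a uniq-style sed filter only collapses duplicates that are adjacent.
dedupe_adjacent() {   # hypothetical helper name wrapping the sed one-liner
  sed '$!N; /^\(.*\)\n\1$/!P; D'
}

# Non-adjacent duplicates (as in the question) pass through unchanged.
scattered=$(printf '%s\n' 'kavitha' 'sree' 'kavitha' | dedupe_adjacent)

# Sorting first makes equal lines adjacent, so sed can drop them,
# at the cost of losing the original order.
sorted_first=$(printf '%s\n' 'kavitha' 'sree' 'kavitha' | sort | dedupe_adjacent)

printf 'scattered:\n%s\n\nsorted first:\n%s\n' "$scattered" "$sorted_first"
```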

Answer 3

Answered by Chris Koknat

Perl one-liner similar to @kev's awk solution:

perl -ne 'print if ! $a{$_}++' input

This variation removes trailing whitespace before comparing:

perl -lne 's/\s*$//; print if ! $a{$_}++' input

This variation edits the file in-place:

perl -i -ne 'print if ! $a{$_}++' input

This variation edits the file in-place, and makes a backup input.bak:

perl -i.bak -ne 'print if ! $a{$_}++' input
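
Stripping trailing whitespace before comparing matters when lines look identical but differ in invisible trailing spaces. As a rough sketch of the same idea, here is an awk counterpart (not the Perl command from this answer), run on made-up sample lines:

```shell
#!/usr/bin/env bash
# Sketch: an awk counterpart of the "strip trailing whitespace first" idea.
# Not the Perl command from the answer; the sample lines are made up.
input=$(printf '%s\n' 'sree=Tue Jan 20  ' 'divya=Tue Jan 20' 'sree=Tue Jan 20')

# Without normalization, the first and third lines differ by trailing spaces,
# so both are kept.
raw=$(printf '%s\n' "$input" | awk '!seen[$0]++')

# sub() removes trailing blanks and tabs from $0 before the lookup.
trimmed=$(printf '%s\n' "$input" | awk '{ sub(/[ \t]+$/, "") } !seen[$0]++')

printf 'raw:\n%s\n\ntrimmed:\n%s\n' "$raw" "$trimmed"
```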

Answer 4

Answered by potong

This might work for you:

cat -n file.txt |
sort -u -k2,7 |
sort -n |
sed 's/.*\t/    /;s/\([0-9]\{4\}\).*/\1/'

or this:

awk '{line=substr($0,1,match($0,/[0-9][0-9][0-9][0-9]/)+3);sub(/^/,"    ",line);if(!dup[line]++)print line}' file.txt
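
To make the awk approach in this answer easier to follow, here is a commented sketch, assuming each line contains a four-digit year as in the question's sample. Each line is cut off after the first four-digit run, so trailing notes such as "(duplicate entry)" do not prevent two entries from matching. The two sample lines are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: truncate each line after the year, then keep first occurrences only.
# The two sample lines follow the question's format and differ only after the year.
result=$(printf '%s\n' \
  'kavitha= Tue Feb 20 14:00 19 IST 2012  (duplicate entry)' \
  'kavitha= Tue Feb 20 14:00 19 IST 2012' |
  awk '{
    # Keep everything up to and including the first four-digit run (the year).
    line = substr($0, 1, match($0, /[0-9][0-9][0-9][0-9]/) + 3)
    sub(/^/, "    ", line)          # indent the output by four spaces
    if (!dup[line]++) print line    # print only the first occurrence
  }')
printf '%s\n' "$result"
```

A line without any four-digit run would be cut to its first three characters, since `match` returns 0 in that case, so this sketch is only suitable for input shaped like the sample.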
