Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/5429840/

Date: 2020-09-09 20:19:23  Source: igfitidea

Eliminate partially duplicate lines by column and keep the last one

Tags: bash, awk, sed, text-processing

Asked by Dagang

I have a file that looks like this:

2011-03-21 name001 line1
2011-03-21 name002 line2
2011-03-21 name003 line3
2011-03-22 name002 line4
2011-03-22 name001 line5

For each name, I want only its last appearance, so I expect the result to be:

2011-03-21 name003 line3
2011-03-22 name002 line4
2011-03-22 name001 line5

Could someone give me a solution with bash/awk/sed?

Answered by PaulP

This command keeps one line per value of the second field, scanning from the end of the file (as in your expected result). Note that the output is sorted by the second field in reverse order:

tac temp.txt | sort -k2,2 -r -u
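
The same reverse-then-deduplicate idea can also be sketched without sorting, so that the surviving lines keep their original relative order. This is only a minimal sketch and not part of the answer above. It assumes GNU tac is available; the temporary file and the names input, seen, and result are placeholders chosen for illustration.

```shell
# Sample data from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
2011-03-21 name001 line1
2011-03-21 name002 line2
2011-03-21 name003 line3
2011-03-22 name002 line4
2011-03-22 name001 line5
EOF

# 1. tac reverses the file, so the last occurrence of each name comes first.
# 2. awk prints a line only the first time its second field is seen.
# 3. The final tac restores the original top-to-bottom order.
result=$(tac "$input" | awk '!seen[$2]++' | tac)

printf '%s\n' "$result"
rm -f "$input"
```

For the sample data this should print the three lines the question asks for (line3, line4, line5), in that order.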

Answered by pepoluan

awk '{a[$2]=$0} END {for (i in a) print a[i]}' file

If order of appearance is important:

  • Based on first appearance:

    awk '!a[$2] {b[++i]=$2} {a[$2]=$0} END {for (j=1; j<=i; j++) print a[b[j]]}' file

  • Based on last appearance:

    tac file | awk '!a[$2] {b[++i]=$2; a[$2]=$0} END {for (j=1; j<=i; j++) print a[b[j]]}'

Answered by nkvnkv

sort < bar > foo
uniq  < foo > bar

bar now has no duplicated lines

Answered by Erik

EDIT: Here's a version that actually answers the question.

sort -k 2 filename | while read f1 f2 f3; do if [ ! "$f2" = "$lf2" ]; then echo "$f1 $f2 $f3"; lf2="$f2"; fi; done
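
Where reversing the file is not convenient, a two-pass awk script is another possible approach. The following is a hedged sketch rather than any answerer's method. It assumes the input is a regular file that can be read twice (so it will not work on a pipe), and the names input, last, and result are illustrative only.

```shell
# Sample data from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
2011-03-21 name001 line1
2011-03-21 name002 line2
2011-03-21 name003 line3
2011-03-22 name002 line4
2011-03-22 name001 line5
EOF

# The file is passed twice. While NR == FNR awk is still reading the first copy,
# so it only records the line number of the latest occurrence of each name.
# On the second copy a line is printed only if it is that latest occurrence.
result=$(awk 'NR == FNR { last[$2] = FNR; next } last[$2] == FNR' "$input" "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because lines are printed during the second pass in file order, the output keeps the original ordering without depending on the traversal order of an awk array.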