bash: Eliminate partially duplicate lines by column and keep the last one
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/5429840/
Eliminate partially duplicate lines by column and keep the last one
Asked by Dagang
I have a file that looks like this:
2011-03-21 name001 line1
2011-03-21 name002 line2
2011-03-21 name003 line3
2011-03-22 name002 line4
2011-03-22 name001 line5
For each name, I only want its last appearance. So I expect the result to be:
2011-03-21 name003 line3
2011-03-22 name002 line4
2011-03-22 name001 line5
Could someone give me a solution with bash/awk/sed?
Answered by PaulP
This code gets unique lines by the second field, but starting from the end of the file or text (as in your expected result):
tac temp.txt | sort -k2,2 -r -u
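The pipeline above returns the surviving lines sorted by name in reverse, which happens to match the sample but is not the file's own order in general. If file order matters, the same read-from-the-end idea can be sketched with awk in place of sort. This is only a sketch: it assumes GNU tac is installed, and the function name last_per_name and its arguments are hypothetical, not part of the original answer.

```shell
#!/usr/bin/env bash
# Sketch, not part of the original answer: keep the last line per name
# while preserving file order. Assumes GNU tac is available.
# "last_per_name", "file" and "col" are hypothetical names.
last_per_name() {
  local file="$1" col="${2:-2}"
  # Read bottom-up, keep the first line seen for each key, then restore order.
  tac -- "$file" | awk -v col="$col" '!seen[$col]++' | tac
}
```

Called as last_per_name temp.txt, it deduplicates on the second column; a different column number can be passed as the second argument.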
Answered by pepoluan
awk '{a[$2]=$0} END {for (i in a) print a[i]}' file
If order of appearance is important:
Based on first appearance:
awk '!($2 in a) {b[++i]=$2} {a[$2]=$0} END {for (j=1; j<=i; j++) print a[b[j]]}' file
Based on last appearance:
tac file | awk '!($2 in a) {b[++i]=$2; a[$2]=$0} END {for (j=i; j>=1; j--) print a[b[j]]}'
Answered by nkvnkv
sort < bar > foo
uniq < foo > bar
bar now has no duplicated lines (only lines that are identical in full are removed).
Answered by Erik
EDIT: Here's a version that actually answers the question.
tac filename | sort -s -k 2,2 | while read -r f1 f2 f3; do if [ ! "$f2" = "$lf2" ]; then echo "$f1 $f2 $f3"; lf2="$f2"; fi; done
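The expected output in the question keeps the surviving lines in file order. Where tac is not available, that can be sketched in a single awk pass that remembers where each key was last seen. This is a minimal sketch rather than any answerer's method: the function name keep_last_by_col and its column argument are hypothetical, and it holds the whole input in memory, so it suits files that fit in RAM.

```shell
#!/usr/bin/env bash
# Sketch under stated assumptions; "keep_last_by_col" is a hypothetical name.
# Reads stdin, keeps the last line for each value of column $1 (default 2),
# and prints the survivors in their original order. Uses only POSIX awk features.
keep_last_by_col() {
  local col="${1:-2}"
  awk -v col="$col" '
    # Remember every line, its key, and the line number where each key was last seen.
    { key[NR] = $col; line[NR] = $0; last[$col] = NR }
    END {
      # Walk the input in order and print only the final occurrence of each key.
      for (n = 1; n <= NR; n++)
        if (last[key[n]] == n) print line[n]
    }
  '
}
```

A typical call would be keep_last_by_col 2 < filename, where 2 selects the name column from the question's sample.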