Linux: how to sort based on one column but uniq based on another column?
Original URL: http://stackoverflow.com/questions/6302006/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
how to sort based on a column but uniq based on another column?
Asked by Ken
Hi all,
I have a file with several columns. I would like to sort on column 2 and then apply uniq to column 1. I found this post talking about sort and uniq on the same column, but my problem is a little bit different. I am thinking of using some combination of sort and uniq but don't know how. Thanks.
Accepted answer by Bruce
You can use a pipe; note, however, that this does not modify the file in place.
Example :
$ cat initial.txt
1,3,4
2,3,1
1,2,3
2,3,4
1,4,1
3,1,3
4,2,4
$ cat initial.txt | sort -u -t, -k1,1 | sort -t, -k2,2
3,1,3
4,2,4
1,3,4
2,3,1
The result is sorted by key 2 and unique by key 1. Note that the result is printed to the console; if you want it in a file, just use a redirect (> newFiletxt).
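A hedged sketch of the redirect mentioned above, for the case where the result should end up back in the original file. Redirecting straight onto initial.txt may truncate it before the first sort has read it, so the output goes to a temporary file that then replaces the original. The temporary directory and the .tmp file name are assumptions made for this demo, and LC_ALL=C is set only to make the ordering reproducible.

```shell
#!/bin/sh
# Demo setup: recreate the sample file in a throwaway directory (hypothetical paths).
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' 1,3,4 2,3,1 1,2,3 2,3,4 1,4,1 3,1,3 4,2,4 > "$workdir/initial.txt"

# Write to a temporary file first, then move it over the original
# only if the pipeline succeeded.
sort -u -t, -k1,1 "$workdir/initial.txt" | sort -t, -k2,2 > "$workdir/initial.txt.tmp" \
    && mv "$workdir/initial.txt.tmp" "$workdir/initial.txt"

# Show the rewritten file.
cat "$workdir/initial.txt"
```

Which of several lines sharing the same key survives sort -u is up to the sort implementation; GNU sort keeps the first one it read.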
Another solution for this kind of more complex operation is to rely on another tool: awk, perl or python, depending on your preferences (and age).
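As an illustration of the awk route, here is a minimal sketch under one possible reading of the question: for each value in column 1, keep the row with the smallest column 2, and list the rows in column 2 order. The question does not say which duplicate should survive, so that choice is an assumption; the array name seen is arbitrary, and the rows are the sample data from initial.txt.

```shell
#!/bin/sh
export LC_ALL=C   # only to make tie-breaking reproducible

# Step 1: sort numerically on column 2, so the smallest column 2 comes first.
# Step 2: awk prints a line only the first time its column 1 value is seen.
result=$(
    printf '%s\n' 1,3,4 2,3,1 1,2,3 2,3,4 1,4,1 3,1,3 4,2,4 |
        sort -t, -k2,2n |
        awk -F, '!seen[$1]++'
)

printf '%s\n' "$result"
```

Because awk keeps the order of its input, the output stays sorted by column 2 and no second sort is needed.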
EDIT: If I understood the new requirement correctly, the output is sorted by column 2, and column 1 is unique for a given column 2:
$ cat initial.txt | sort -u -t, -k1,2 | sort -t, -k2,2
3,1,3
1,2,3
4,2,4
1,3,4
2,3,1
1,4,1
Is that what you expect? Otherwise, I did not understand :-)
Answer by Praveen Lobo
uniq needs the data to be in sorted order to work, so if you sort on the second field and then apply uniq on the first field, you won't get the correct result.
You may want to try
sort -u -t, -k1,1 filename | sort -t, -k2,2
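To see why uniq needs sorted input, the following small sketch runs it on the same lines with and without a preceding sort. The four input lines are made up for the demo; uniq only collapses duplicates that sit next to each other.

```shell
#!/bin/sh
export LC_ALL=C   # only to make the sort order reproducible

# No two equal lines are adjacent, so uniq removes nothing here.
unsorted=$(printf '%s\n' b a b a | uniq)

# After sorting, equal lines are adjacent and uniq collapses them.
sorted=$(printf '%s\n' b a b a | sort | uniq)

printf 'without sort:\n%s\n' "$unsorted"
printf 'with sort:\n%s\n' "$sorted"
```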
Answer by Sultan
Just to be sure I understood you correctly: you want to sort a file based on its second column, and then remove duplicates in the first column (another way of saying: apply uniq to column one). Cool. To do this, you need to perform three tasks:
- sort the column on which uniq is going to be applied (since uniq can work only on sorted input).
- apply uniq on the sorted column.
- sort the output based on the values in column two.
Using pipes: The command is
sort -t ',' -k1 fileName | awk -F ',' '!x[$1]++' | sort -t ',' -k2
Note that uniq cannot be told to compare only the first field; its -f switch can only skip the first n fields. Hence, I used awk in place of uniq.
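The difference between the two tools can be sketched as follows. The sample lines are made up for the demo. uniq -f 1 skips the first blank-separated field and compares what is left, which is the opposite of what is wanted here, while awk can key on the first comma-separated field directly.

```shell
#!/bin/sh
export LC_ALL=C

# uniq -f 1 ignores field 1, so 'a 1' and 'b 1' count as duplicates.
skipped=$(printf '%s\n' 'a 1' 'b 1' 'c 2' | uniq -f 1)

# awk prints a line only the first time its first comma-separated field appears.
first_seen=$(printf '%s\n' 1,3,4 2,3,1 1,2,3 | awk -F ',' '!x[$1]++')

printf 'uniq -f 1:\n%s\n' "$skipped"
printf 'awk on field 1:\n%s\n' "$first_seen"
```

Unlike uniq, the awk filter does not need its input to be sorted, because it remembers every key it has already printed.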
Answer by sa_nyc
I used this
sort -t ',' -nk2
Here it sorts
1,2
2,5
3,1
to
3,1
1,2
2,5
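The -n flag matters once the second column has values with different numbers of digits. A small sketch with made-up rows, comparing the default (character by character) ordering with the numeric one:

```shell
#!/bin/sh
export LC_ALL=C   # only to make the ordering reproducible

# Without -n the key is compared as text, so "10" sorts before "2" and "9".
lexical=$(printf '%s\n' a,10 b,9 c,2 | sort -t ',' -k2)

# With -n the key is compared by numeric value.
numeric=$(printf '%s\n' a,10 b,9 c,2 | sort -t ',' -nk2)

printf 'text order:\n%s\n' "$lexical"
printf 'numeric order:\n%s\n' "$numeric"
```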