Linux: how to sort based on one column but uniq based on another column?

Notice: this page is taken from a Chinese-English translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/6302006/

Date: 2020-08-05 04:26:06  Source: igfitidea

how to sort based on a column but uniq based on another column?

Tags: linux, sorting, uniq

Asked by Ken

Hi all, I have a file with several columns. I would like to sort on column 2 and then apply uniq on column 1. I found this post talking about sort and uniq on the same column, but my problem is a little different. I am thinking of using some combination of sort and uniq but don't know how. Thanks.

Accepted answer by Bruce

You can use a pipe; however, it does not work in place.

Example:

$ cat initial.txt
1,3,4
2,3,1
1,2,3
2,3,4
1,4,1
3,1,3
4,2,4

$ cat initial.txt | sort -u -t, -k1,1 | sort -t, -k2,2
3,1,3
4,2,4
1,3,4
2,3,1

The result is sorted by key 2 and unique by key 1. Note that the result is displayed on the console; if you want it in a file, just use a redirect (> newFiletxt).

Another solution for this kind of more complex operation is to rely on another tool (awk, perl or python, depending on your preferences and age).
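
As a rough illustration of the awk route mentioned above, the sketch below keeps the first line seen for each value of column 1 and then sorts by column 2. It is only one possible approach, not necessarily what the answer's author had in mind; the array name `seen` and the temporary directory are assumptions made for this example, and the data is the sample file from the answer.

```shell
# Recreate the sample file from the answer in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 1,3,4 2,3,1 1,2,3 2,3,4 1,4,1 3,1,3 4,2,4 > "$tmpdir/initial.txt"

# Keep the first line seen for each value of column 1, then sort by column 2.
# `seen` is a hypothetical array name chosen for this sketch.
result=$(awk -F, '!seen[$1]++' "$tmpdir/initial.txt" | sort -t, -k2,2)
printf '%s\n' "$result"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

Unlike sort -u, the awk filter does not need sorted input, because it remembers every key it has already printed.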

EDIT: If I understood the new requirement correctly, the output is sorted by column 2, and column 1 is unique for a given column 2:

$ cat initial.txt | sort -u -t, -k1,2 | sort -t, -k2,2
3,1,3
1,2,3
4,2,4
1,3,4
2,3,1
1,4,1

Is this what you expect? Otherwise, I did not understand :-)

Answer by Praveen Lobo

uniq needs its input to be sorted in order to work, so if you sort on the second field and then apply uniq on the first field, you won't get the correct result.
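
A small sketch of why the order matters: uniq only collapses repeats that sit next to each other. The key list below is hypothetical data made up for this example, not taken from the question.

```shell
# Hypothetical list of keys in which the repeated value 1 is never adjacent to itself.
unsorted_keys=$(printf '%s\n' 3 1 4 1 2 1)

# uniq only collapses adjacent repeats, so the scattered 1s all survive.
adjacent_only=$(printf '%s\n' "$unsorted_keys" | uniq)

# Sorting first makes equal keys adjacent, so uniq can drop the repeats.
sorted_first=$(printf '%s\n' "$unsorted_keys" | sort | uniq)

printf 'uniq alone:\n%s\n' "$adjacent_only"
printf 'sort then uniq:\n%s\n' "$sorted_first"
```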

You may want to try

sort -u -t, -k1,1 filename | sort -t, -k2,2

Answer by Sultan

Just to be sure that I understood you correctly: you want to sort a file based on its second column, and then remove duplicates from the first column (another way of saying "apply uniq to column one"). To do this, you need to perform three tasks:

  1. sort the column on which uniq is going to be applied (since uniq can work only on sorted input).
  2. apply uniq on the sorted column.
  3. sort the output based on the values in column two.

Using pipes, the command is:

 sort -t ',' -k1 fileName | awk -F ',' '!x[$1]++' | sort -t ',' -k2

Note that you cannot tell uniq to compare only the first field; its -f switch skips the first n fields instead. Hence, I used awk to replace uniq.
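
To make the remark about -f concrete, here is a minimal sketch of what that switch does. uniq counts blank-delimited fields rather than comma-delimited ones, so the lines below are hypothetical space-separated data invented for this example.

```shell
# Hypothetical space-separated data: uniq -f counts blank-delimited fields.
lines=$(printf '%s\n' 'a 1' 'b 1' 'c 2')

# -f 1 skips the first field, so the comparison uses the rest of each line.
# 'a 1' and 'b 1' therefore count as duplicates even though field 1 differs.
skipped_first=$(printf '%s\n' "$lines" | uniq -f 1)
printf '%s\n' "$skipped_first"
```

This is the opposite of what the question needs: the first field is ignored rather than used as the only key.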

Answer by sa_nyc

I used this: sort -t ',' -nk2

Here it sorts

1,2
2,5
3,1

to

3,1
1,2
2,5
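
The -n flag in that command only shows its effect once the second column holds numbers of different lengths. The sketch below uses hypothetical rows (not from the answer) to compare the text ordering with the numeric one.

```shell
# Hypothetical rows whose second column mixes one- and two-digit numbers.
rows=$(printf '%s\n' 1,10 2,9 3,1)

# Without -n the key is compared as text, so "10" sorts before "9".
as_text=$(printf '%s\n' "$rows" | sort -t ',' -k2)

# With -n the key is compared by numeric value.
as_numbers=$(printf '%s\n' "$rows" | sort -t ',' -nk2)

printf 'text order:\n%s\n' "$as_text"
printf 'numeric order:\n%s\n' "$as_numbers"
```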