Linux: how to sort by one column but apply uniq on another column?

Question

Asked by Ken

Hi all, I have a file with several columns. I would like to sort it by column 2 and then apply uniq on column 1. I found this post talking about sort and uniq on the same column, but my problem is a little different. I am thinking of using some combination of sort and uniq but don't know how. Thanks.

Answer 1

Accepted answer by Bruce

You can use a pipe; note, however, that this does not modify the file in place.

Example:

$ cat initial.txt
1,3,4
2,3,1
1,2,3
2,3,4
1,4,1
3,1,3
4,2,4

$ cat initial.txt | sort -u -t, -k1,1 | sort -t, -k2,2
3,1,3
4,2,4
1,3,4
2,3,1

The result is sorted by key 2 and unique by key 1. Note that the result is displayed on the console; if you want it in a file, just use a redirect (> newFiletxt).
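Because the pipeline does not edit the file in place, one common pattern is to write the result to a temporary file and then move it over the original. The following is only a sketch of that idea: the function name `sort_uniq_inplace` is made up for illustration, and it assumes `mktemp` is available (it is on typical Linux systems).

```shell
# Hypothetical helper: unique on column 1, sorted on column 2, then replace the file.
sort_uniq_inplace() {
    file=$1
    tmp=$(mktemp) || return 1
    # Write to a temp file first: redirecting straight to "$file" would
    # truncate it before the first sort has read it.
    if sort -u -t, -k1,1 "$file" | sort -t, -k2,2 > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would be `sort_uniq_inplace initial.txt`. Note that the replaced file takes on the permissions of the temporary file, which may differ from the original's.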

Another solution for this kind of more complex operation is to rely on another tool (depending on your preferences, and age: awk, perl or python).
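As an illustration of the awk route, here is a minimal sketch that keeps the first line seen for each value of column 1 and then sorts by column 2. The function name `dedupe_col1_sort_col2` and the array name `seen` are assumptions for the example, not something from the original answer.

```shell
# Keep the first line seen for each value of column 1, then sort by column 2.
# seen[$1]++ evaluates to 0 (false) the first time a key appears, so
# !seen[$1]++ is true only for the first occurrence and awk prints that line.
dedupe_col1_sort_col2() {
    awk -F, '!seen[$1]++' "$@" | sort -t, -k2,2
}
```

Run as `dedupe_col1_sort_col2 initial.txt` on the sample file above, this prints 3,1,3 then 4,2,4 then 1,3,4 then 2,3,1. Unlike uniq, the awk filter does not need sorted input, so only one sort is required.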

EDIT: If I understood the new requirement correctly, the output should be sorted by column 2, with column 1 unique for a given column 2:

$ cat initial.txt | sort -u -t, -k1,2 | sort -t, -k2,2
3,1,3
1,2,3
4,2,4
1,3,4
2,3,1
1,4,1

Is that what you expect? Otherwise, I did not understand :-)

Answer 2

Answer by Praveen Lobo

uniq needs its input to be sorted in order to work, so if you sort on the second field and then apply uniq on the first field, you won't get the correct result.

You may want to try:

sort  -u -t,  -k1,1 filename | sort -t, -k2,2
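To see why the order matters, here is a small sketch using only the column 1 values from the sample file in the accepted answer. The helper `col1` is made up for the demonstration.

```shell
# Column 1 of the example data, in its original (unsorted) order.
col1() { printf '%s\n' 1 2 1 2 1 3 4; }

# uniq only drops a line when it equals the line directly before it,
# so nothing is removed here: no two equal values are adjacent.
col1 | uniq | wc -l          # 7 lines

# After sorting, equal values sit next to each other and collapse.
col1 | sort | uniq | wc -l   # 4 lines
```

This is why the de-duplication step has to run on data sorted by column 1, with the sort on column 2 applied afterwards.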

Answer 3

Answer by Sultan

Just to be sure that I understood you correctly: you want to sort a file based on its second column, and then remove duplicates from the first column (another way of saying "apply uniq to column one"). Cool. To do this, you need to perform three tasks:

sort the column on which uniq is going to be applied (since uniq can work only on sorted input).
apply uniq on the sorted column.
sort the output based on the values in column two.

Using pipes, the command is:

 sort -t ',' -k1 fileName | awk -F ',' '!x[$1]++' | sort -t ',' -k2

Note that you cannot tell uniq to compare only the first field; its -f switch can only skip the first n fields. Hence, I used awk to replace uniq.
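The difference between the two tools can be sketched on a tiny whitespace-separated sample. The data and the names `sample`, `skip_first`, `first_only` and `seen` are invented for this example; whitespace is used because uniq -f works on blank-separated fields and has no delimiter option.

```shell
# Sample data: whitespace-separated, because uniq -f only understands
# blank-separated fields.
sample() { printf '%s\n' 'a 1' 'b 1' 'c 2' 'c 3'; }

# uniq -f 1 skips field 1 and compares the rest of the line,
# so adjacent lines with the same column 2 collapse.
skip_first=$(sample | uniq -f 1)

# awk keyed on $1 does the opposite: it compares only field 1
# and keeps the first line seen for each value.
first_only=$(sample | awk '!seen[$1]++')

printf 'uniq -f 1:\n%s\n\nawk on $1:\n%s\n' "$skip_first" "$first_only"
```

With this sample, uniq -f 1 keeps "a 1", "c 2" and "c 3", while the awk filter keeps "a 1", "b 1" and "c 2".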

Answer 4

Answer by sa_nyc

I used this: sort -t ',' -nk2

Here it sorts

1,2
2,5
3,1

to

3,1
1,2
2,5
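The -n flag matters once column 2 holds values with different numbers of digits. A short sketch, with sample rows invented for the purpose (the helper `rows` is hypothetical):

```shell
rows() { printf '%s\n' 1,10 2,9 3,100; }

# Default (lexical) comparison on column 2: "10" < "100" < "9".
rows | sort -t ',' -k2

# Numeric comparison on column 2: 9 < 10 < 100.
rows | sort -t ',' -nk2
```

The first command prints 1,10 then 3,100 then 2,9; the second prints 2,9 then 1,10 then 3,100.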
