Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/21584727/

Time: 2020-08-07 01:57:36  Source: igfitidea

using Linux cut, sort and uniq

Tags: linux, sorting, cut, uniq

Question

I have a list with population, year, and county, and I need to cut the list and then find the number of unique counties.

The list starts off like this:

#Population,    Year,   County
3900,   1969,   Beaver
3798,   1970,   Beaver
3830,   1971,   Beaver
3864,   1972,   Beaver
3993,   1973,   Beaver
3976,   1974,   Beaver
4064,   1975,   Beaver

There is much more to this list, and many more counties. I have to cut out the county column, sort it, and then output the number of unique counties. I tried this command:

 cut -c3- list.txt | sort -k3 | uniq -c

But this does not cut out the third column, nor does it sort it alphabetically. What am I doing wrong?

Accepted answer by favoretti

You can add a delimiter, which is a comma in your case:

cut -f 3 -d, list.txt | sort | uniq

Then, -c specifies character position rather than field; fields are selected with -f.
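
To make the difference concrete, here is a small sketch that runs both forms on a single row in the same format as the sample data. The variable names are only for illustration.

```shell
# One sample row in the same format as list.txt.
line='3900,   1969,   Beaver'

# -c3- keeps characters 3 through the end of the line; commas mean nothing to it.
by_char=$(printf '%s\n' "$line" | cut -c3-)

# -d, -f3 splits the line on commas and keeps the third field,
# including the spaces that follow the second comma.
by_field=$(printf '%s\n' "$line" | cut -d, -f3)

printf 'cut -c3-    : [%s]\n' "$by_char"
printf 'cut -d, -f3 : [%s]\n' "$by_field"
```

The first form only drops the first two characters of the row, while the second isolates the county, still with its leading spaces.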

To strip the leading spaces, you can pipe the result through, e.g., awk '{print $1}':

cut -f 3 -d, list.txt | awk '{print $1}' | sort | uniq
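
One caveat worth sketching: awk '{print $1}' keeps only the first whitespace-separated word, which is fine for a name like Beaver but would shorten a multi-word name. The example below assumes such a name exists in the data ("Salt Lake" is hypothetical here) and shows sed as one alternative that removes only the leading blanks.

```shell
# A field as cut -d, -f3 would emit it; "Salt Lake" is a hypothetical two-word name.
field='   Salt Lake'

# awk splits on whitespace, so $1 is only the first word of the name.
first_word=$(printf '%s\n' "$field" | awk '{print $1}')

# sed removes just the leading blanks and keeps the rest of the name intact.
trimmed=$(printf '%s\n' "$field" | sed 's/^[[:space:]]*//')

printf 'awk: [%s]  sed: [%s]\n' "$first_word" "$trimmed"
```

If every county in the list is a single word, the two approaches give the same result.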

[edit]

Aaaaand. If you cut the 3rd field out, you are left with only one field after the pipe, so sorting on the 3rd field won't work, which is why I omitted it in my example. You get 1 field, so you just sort on it and then apply uniq.
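
Since the question asks for the number of unique counties rather than the list itself, one way to finish the pipeline is to count the lines that survive deduplication. This is a minimal sketch, assuming the header line starts with # as in the sample; the temporary file stands in for list.txt and the Cache rows are made up for illustration.

```shell
# Hypothetical sample data standing in for list.txt (the Cache rows are invented).
list=$(mktemp)
cat > "$list" <<'EOF'
#Population,    Year,   County
3900,   1969,   Beaver
3798,   1970,   Beaver
41000,   1969,   Cache
42100,   1970,   Cache
EOF

# Drop the header, keep field 3, trim leading blanks,
# deduplicate with sort -u, then count the remaining lines.
count=$(grep -v '^#' "$list" | cut -d, -f3 | sed 's/^[[:space:]]*//' | sort -u | wc -l)

# $((...)) strips the padding that some wc implementations add.
echo "unique counties: $((count))"
rm -f "$list"
```

Here sort -u does the work of sort | uniq in one step, and wc -l turns the deduplicated list into a single number.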

Answer by FabienAndre

You can use awk to extract the third field (space delimited), and then do your sort/uniq stuff.

awk '{print $3}' list.txt |sort |uniq -c
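
A small sketch of what this approach produces, again on made-up data (the Cache row is invented). With whitespace splitting, the header's third word, County, is counted like any other value, so the example also shows one way to skip lines that start with #.

```shell
# Hypothetical sample data standing in for list.txt.
list=$(mktemp)
cat > "$list" <<'EOF'
#Population,    Year,   County
3900,   1969,   Beaver
3798,   1970,   Beaver
3830,   1971,   Beaver
41000,   1969,   Cache
EOF

# Plain whitespace splitting: the header contributes "County" as a third value.
with_header=$(awk '{print $3}' "$list" | sort | uniq -c | wc -l)

# Skipping '#' lines leaves only real counties; uniq -c prefixes each with its row count,
# and the final awk reorders that into name=count.
per_county=$(awk '!/^#/ {print $3}' "$list" | sort | uniq -c | awk '{print $2 "=" $1}')

echo "distinct values including header: $((with_header))"
echo "$per_county"
rm -f "$list"
```

Note that uniq -c reports how many rows each county has, not how many counties there are; piping the deduplicated list to wc -l gives the latter.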