bash: How to delete duplicated rows based on a column value?

Question

Asked by user3494949

Given the following table:

 123456.451 entered-auto_attendant
 123456.451 duration:76 real:76
 139651.526 entered-auto_attendant
 139651.526 duration:62 real:62
 139382.537 entered-auto_attendant

Using a bash shell script on Linux, I'd like to delete duplicate rows based on the value of column 1 (the one with the long number). Keep in mind that this number is variable.

I've tried:

awk '{a[$3]++}!(a[$3]-1)' file

sort -u | uniq

But I am not getting the result I want. The goal is to compare all the values in the first column, delete the duplicates, and show something like this:

 123456.451 entered-auto_attendant
 139651.526 entered-auto_attendant
 139382.537 entered-auto_attendant

Answer 1

Accepted answer by Kent

You didn't give an expected output. Does this work for you?

 awk '!a[$1]++' file

With your data, the output is:

123456.451 entered-auto_attendant
139651.526 entered-auto_attendant
139382.537 entered-auto_attendant

And this one prints only the lines whose column 1 value is unique:

 awk '{a[$1]++;b[$1]=$0}END{for(x in a)if(a[x]==1)print b[x]}' file

output:

139382.537 entered-auto_attendant
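One thing to keep in mind with the END-loop approach is that awk does not define the order in which for (x in a) visits keys, so with several unique keys the output order may differ from the input order. A possible alternative, sketched here as an assumption rather than as part of the original answer, reads the file twice: the first pass counts each column 1 value, the second prints lines whose count is 1, in input order. The temporary file and the variable names (input, unique, count) are only for illustration.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# The same file is passed twice. While NR==FNR awk is still in the
# first copy, so it only counts keys. In the second copy it prints
# every line whose column-1 value was seen exactly once.
unique=$(awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$input" "$input")

printf '%s\n' "$unique"
rm -f "$input"
```

This needs a regular file (it cannot read a pipe twice), which is the trade-off against the single-pass version.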

Answer 2

Answered by that other guy

By default, uniq compares the entire line. Since your lines are not identical, none of them are removed.

You can use sort to conveniently sort by the first field and also delete duplicates of it:

sort -t ' ' -k 1,1 -u file

-t ' ': fields are separated by spaces
-k 1,1: only look at the first field
-u: delete duplicates
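A small sketch of these flags on the question's data may help. It assumes the lines have no leading whitespace, because with -t ' ' a leading space would make field 1 empty and every line would compare equal. Note also that the result comes out in sorted key order rather than input order, and that which line of each duplicate group survives can depend on the sort implementation, so the example only inspects the surviving keys. The variable names are illustrative.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# Keep one line per distinct first field; output is ordered by that field.
by_sort=$(sort -t ' ' -k 1,1 -u "$input")

# Extract just the column-1 keys that survived, one per line.
sort_keys=$(printf '%s\n' "$by_sort" | awk '{ print $1 }')

printf '%s\n' "$sort_keys"
rm -f "$input"
```

If the original line order matters, the awk approach is usually the better fit, since it never reorders the input.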

Additionally, you might have seen the awk '!a[$0]++' trick for deduplicating lines. You can make it dedupe on the first column only by using awk '!a[$1]++'.
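To see why the trick works: a[$1]++ returns the count before incrementing, which is 0 (false) the first time a key is seen, so the negation is true only for the first occurrence and awk's default action prints that line. The sketch below runs the short form next to a spelled-out equivalent; the array name seen and the shell variables are illustrative choices, not part of the original answers.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# Short form: the post-increment yields the old count (0 on first sight),
# so the negated expression is true only for the first line with each key.
short=$(awk '!seen[$1]++' "$input")

# Expanded form of the same logic, written out step by step.
long=$(awk '{
    if (!($1 in seen)) {   # first time this column-1 value shows up
        print              # keep the line
    }
    seen[$1] = 1           # remember the key for later lines
}' "$input")

printf '%s\n' "$short"
rm -f "$input"
```

Both forms keep the first line for each key and preserve the input order.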

Answer 3

Answered by Yogesh Deore

Try this command:

awk '!x[$1]++ { print $1, $2 }' file

Answer 4

Answered by anubhava

Using awk:

awk '!($1 in a){a[$1]++; next} $1 in a' file
123456.451 duration:76 real:76
139651.526 duration:62 real:62
