Group numbers in one column and sum another column in bash

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/36715790/

Date: 2020-09-18 14:31:04  Source: igfitidea


bash awk

Asked by unix124

I would like to group numbers in one column that lie within a user-defined distance of each other, and sum the corresponding values in another column, for a file in bash. Here is the sample file:


D   seq 1876    A   seq 3802    31
D   seq 1877    A   seq 3803    104
D   seq 13691   A   seq 14117   15
D   seq 13694   A   seq 14120   65

So if the user sets the merge distance to 5, the sample output would look like this:


D,seq,1876-1877,A,seq,3802-3803,135
D,seq,13691-13694,A,seq,14117-14120,80

Accepted answer by 7171u

Something like this?


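awk -v d=5 '{
    a[NR]=$3;    # first numeric column
    b[NR]=$6     # second numeric column
}
# Start a new group when either column jumps by more than d.
(a[NR]-a[NR-1] > d || b[NR]-b[NR-1] > d){
    if(NR!=1){
        print "D seq",s"-"a[NR-1],"A seq",t"-"b[NR-1],c
    };
    c=$NF;
    s=$3;
    t=$6;
    next
}
{
    c+=$NF
}
END{
    print "D seq",s"-"a[NR],"A seq",t"-"b[NR],c
}' file.txt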

Here d holds the distance value.

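The question asks for comma-separated rows, while the answer prints space-separated ones. As a rough sketch only (not part of the original answer), the same grouping idea can be wrapped in a shell function that sets awk's OFS to a comma. The function name group_sum, the helper emit, and the variable names are hypothetical, and the input is assumed to have the same six-column layout as the sample file.

```shell
# Hypothetical helper, not from the original answer.
# Usage: group_sum DISTANCE [FILE...]   (reads stdin when no file is given)
group_sum() {
    dist=$1
    shift
    awk -v d="$dist" -v OFS=',' '
        # Print the finished group as one comma-separated row.
        function emit() {
            print p, s "-" pa, q, t "-" pb, c
        }
        {
            # Start a new group on the first row, or when either numeric
            # column jumps by more than d from the previous row.
            if (NR == 1 || $3 - pa > d || $6 - pb > d) {
                if (NR > 1) emit()
                p = $1 OFS $2      # labels before the first number
                q = $4 OFS $5      # labels before the second number
                s = $3; t = $6     # start values of the group
                c = 0
            }
            c += $NF               # running sum of the last column
            pa = $3; pb = $6       # remember the previous row
        }
        END { if (NR > 0) emit() }
    ' "$@"
}
```

It can be called as group_sum 5 file.txt. Unlike the answer above, this variant opens a group on the first row unconditionally, so it does not depend on the first value being larger than d. For the four sample rows and a distance of 5, the second group sums to 15 + 65 = 80.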