Original question: http://stackoverflow.com/questions/8604498/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Using AWK to find the smallest and largest number in a column?
Asked by mahmood
If I have a file with a few columns, how can I use an AWK command to show the largest and the smallest number in a particular column?
Example:
a 212
b 323
c 23
d 45
e 54
f 102
I want one command to show that the lowest number is 23 and another command to show that the highest number is 323.
I have no idea why the answers are not working! Here is a more realistic example of my file (maybe I should mention that it is tab-delimited):
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
##FORMAT=<ID=PL,Number=-1,Type=Integer,Description="List of Phred-scaled genotype likelihoods, number of values is (#ALT+1)*(#ALT+2)/2">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT rmdup_wl_25248.bam
Chr10 247 . T C 7.8 . DP=37;AF1=0.5;CI95=0.5,0.5;DP4=7,1,19,0;MQ=15;FQ=6.38;PV4=0.3,1,0.038,1 GT:PL:GQ 0/1:37,0,34:36
Chr10 447 . A C 75 . DP=30;AF1=1;CI95=1,1;DP4=0,0,22,5;MQ=14;FQ=-108 GT:PL:GQ 1/1:108,81,0:99
Chr10 449 . G C 35.2 . DP=33;AF1=1;CI95=0.5,1;DP4=3,2,20,3;MQ=14;FQ=-44;PV4=0.21,1.7e-06,1,0.34 GT:PL:GQ 1/1:68,17,0:31
Chr10 517 . G A 222 . DP=197;AF1=1;CI95=1,1;DP4=0,0,128,62;MQ=24;FQ=-282 GT:PL:GQ 1/1:255,255,0:99
Chr10 761 . G A 27 . DP=185;AF1=0.5;CI95=0.5,0.5;DP4=24,71,8,54;MQ=20;FQ=30;PV4=0.07,8.4e-50,1,1 GT:PL:GQ 0/1:57,0,149:60
Chr10 1829 . A G 3.01 . DP=74;AF1=0.4998;CI95=0.5,0.5;DP4=18,0,54,0;MQ=19;FQ=4.68;PV4=1,9.1e-12,0.003,1 GT:PL:GQ 0/1:30,0,45:28
I should say that I have already added a step that excludes lines starting with #, so these are the commands that I use:
awk '$1 !~/#/' | awk -F'\t' 'BEGIN{first=1;} {if (first) { max = min = $6; first = 0; next;} if (max < $6) max=$6; if (min > $6) min=$6; } END { print min, max }' wl_25210_filtered.vcf
awk '$1 !~/#/' | awk -F'\t' 'BEGIN{getline;min=max=$6} NF{ max=(max>$6)?max:$6; min=(min>$6)?$6:min} END{print min,max}' wl_25210_filtered.vcf
and
awk '$1 !~/#/' | awk -F'\t' '
NR==2{min=max=$6;next}
NR>2 && NF{
max=(max>$6)?max:$6
min=(min>$6)?$6:min
}
END{print min,max}' wl_25210_filtered.vcf
Accepted answer by jaypal singh
You can create two user-defined functions and use them as needed. This gives a more generic solution.
[jaypal:~/Temp] cat file
a 212
b 323
c 23
d 45
e 54
f 102
[jaypal:~/Temp] awk '
function max(x){i=0;for(val in x){if(i<=x[val]){i=x[val];}}return i;}
function min(x){i=max(x);for(val in x){if(i>x[val]){i=x[val];}}return i;}
{a[$2]=$2;next}
END{minimum=min(a);maximum=max(a);print "Maximum = "maximum " and Minimum = "minimum}' file
Maximum = 323 and Minimum = 23
In the above solution, there are two user-defined functions, max and min. We store column 2 in an array. You can store each of your columns like this. In the END block you can invoke the functions, store the results in variables, and print them. Note that max starts its search from 0, so it assumes the column contains no negative values.
Hope this helps!
Update:
Executed the following as per the latest example:
[jaypal:~/Temp] awk '
function max(x){i=0;for(val in x){if(i<=x[val]){i=x[val];}}return i;}
function min(x){i=max(x);for(val in x){if(i>x[val]){i=x[val];}}return i;}
/^#/{next}
{a[$6]=$6;next}
END{minimum=min(a);maximum=max(a);print "Maximum = "maximum " and Minimum = "minimum}' sample
Maximum = 222 and Minimum = 3.01
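The array-based functions above keep every value in memory until END. As a minimal sketch of the same idea in a single pass, the helper below tracks the minimum and maximum while reading. The name minmax and its column argument are hypothetical (they do not come from the answer), and the sketch assumes tab-separated input in which header lines start with #.

```shell
# minmax COLUMN [FILE...]: print "min max" for one numeric column in a single pass.
# Hypothetical helper; assumes tab-separated fields and '#' header lines.
minmax() {
    col=$1
    shift
    awk -F'\t' -v col="$col" '
        /^#/ || $col == "" { next }                     # skip headers and rows with no value
        !seen { min = max = $col + 0; seen = 1; next }  # seed from the first data row
        { v = $col + 0                                  # force a numeric comparison
          if (v < min) min = v
          if (v > max) max = v }
        END { if (seen) print min, max }
    ' "$@"
}

# Three rows shaped like the VCF sample, where QUAL is column 6
printf 'Chr10\t247\t.\tT\tC\t7.8\n#comment\nChr10\t517\t.\tG\tA\t222\nChr10\t1829\t.\tA\tG\t3.01\n' | minmax 6
# prints: 3.01 222
```

Seeding from the first data row rather than from 0 means negative values are handled too, and nothing is printed when the input has no data rows.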
Answer by codaddict
awk 'BEGIN{first=1;}
{if (first) { max = min = $2; first = 0; next;}
if (max < $2) max=$2; if (min > $2) min=$2; }
END { print min, max }' file
Answer by Christopher Neylan
Use the BEGIN and END blocks to initialize and print variables that keep track of the min and max.
e.g.,
awk 'BEGIN{max=0;min=512} { if (max < $2){ max = $2 }; if(min > $2){ min = $2 } } END{ print max, min}'
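Fixed starting values such as 0 and 512 only work when every value in the column lies between them. The sketch below shows the failure on values above 512, together with one possible variant that leaves min and max unset and takes the first value seen. The function names are hypothetical, column 2 is assumed, and every input line is assumed to carry a numeric second field.

```shell
# Fixed sentinels, as in the one-liner above (column 2 assumed).
sentinel_max_min() {
    awk 'BEGIN { max = 0; min = 512 }
         { if (max < $2) max = $2; if (min > $2) min = $2 }
         END { print max, min }' "$@"
}

# Hypothetical variant: min and max start unset and take the first value seen.
seeded_max_min() {
    awk '{ if (max == "" || $2 > max) max = $2
           if (min == "" || $2 < min) min = $2 }
         END { print max, min }' "$@"
}

printf 'a 700\nb 900\n' | sentinel_max_min   # prints "900 512"; 512 is not in the data
printf 'a 700\nb 900\n' | seeded_max_min     # prints "900 700"
```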
Answer by Chris
If your file contains empty lines, none of the posted solutions will work. For correct handling of empty lines, try this:
$ cat f.awk
BEGIN{getline;min=max=$6}
NF{
max=(max>$6)?max:$6
min=(min>$6)?$6:min
}
END{print min,max}
Then run this command:
sed "/^#/d" my_file | awk -f f.awk
First, it reads the first line of the file to set min and max. Then, for each non-empty line, it uses the ternary operator to check whether a new min or max was found. At the end, the result is printed.
HTH Chris
Answer by lel7lel7
awk 'BEGIN {max = 0} {if ($2>max) max=$2} END {print max}' yourfile.txt
Answer by kakoma
The min and max can be found by:
awk 'BEGIN {min=1000000; max=0;}; { if($2<min && $2 != "") min = $2; if($2>max && $2 != "") max = $2; } END {print min, max}' file
This will output the minimum and maximum, separated by a space (the comma in print inserts awk's output field separator, which is a space by default).