bash unix - Maximum value (length) of each column in a file

Question

Asked by toop

Given a file with data like this (i.e. a stores.dat file):

sid|storeNo|latitude|longitude
2tt|1|-28.0372000t0|153.42921670
9|2t|-33tt.85t09t0000|15t1.03274200

Desired output:

sid : 3
storeNo : 2
latitude : 16
longitude : 13

What is the syntax to return the maximum length of the values under each column?

I have tried this but it does not work:

nawk 'BEGIN { FS = "|" }
{
for(n = 1; n <= NF; n++) {
if (length($n) > max)
max = length($n)
maxlen[$n] = max
}
}
END {
for (i in maxlen) print "col " i ": " maxlen[i]
} ' stores.dat

UPDATE (thanks to Mat's answer - I settled on this):

awk -F"|" '  NR==1{
    for(n = 1; n <= NF; n++) {
       colname[n]=$n
    }
}
NR>1{
    for(n = 1; n <= NF; n++) {
        if (length($n)>maxlen[n])
            maxlen[n]=length($n)
    }
}
END {
        for (i in colname) {
                print colname[i], ":", maxlen[i]+0;
        }
} ' filename

Answer 1

Answered by Mat

There are a few problems with your script: max is shared between columns, and you're not dealing with the header line at all. Try the following:

$ cat t.awk 
#!/bin/awk -f
NR==1{
    for(n = 1; n <= NF; n++) {
       colname[n]=$n
    }
}
NR>1{
    for(n = 1; n <= NF; n++) {
        if (length($n)>maxlen[n])
            maxlen[n]=length($n)
    }
}
END {
        for (i in maxlen) {
                print colname[i], ":", maxlen[i];
        }
}
$ awk -F'|' -f t.awk stores.dat

$n refers to the contents of the nth column. n is the column number (in the first and second loop). The last loop just shows a way of iterating over an array in awk.
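
One caveat on that last loop: awk does not guarantee any particular order for "for (i in array)", so the columns may print out of order. Below is a minimal sketch of an ordered variant, assuming the header row defines the number of columns. The function name col_maxlen and the variable ncols are illustrative and do not come from the answer.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer): print "name : maxlen" per column,
# in the same order as the header row of a pipe-delimited file.
col_maxlen() {
    awk -F'|' '
    NR == 1 {
        ncols = NF                          # assumption: header defines the column count
        for (n = 1; n <= NF; n++) colname[n] = $n
        next
    }
    {
        for (n = 1; n <= NF; n++)
            if (length($n) > maxlen[n]) maxlen[n] = length($n)
    }
    END {
        # A counted loop keeps file order; "for (i in arr)" gives no such guarantee.
        for (n = 1; n <= ncols; n++)
            print colname[n], ":", maxlen[n] + 0
    }' "$1"
}

# Sample data from the question, written to a temp file so the sketch runs on its own.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
sid|storeNo|latitude|longitude
2tt|1|-28.0372000t0|153.42921670
9|2t|-33tt.85t09t0000|15t1.03274200
EOF
col_maxlen "$tmp"
rm -f "$tmp"
```

The "+ 0" forces a numeric 0 for a column that has a header but no data rows, which matches what the asker's updated script does.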

Answer 2

Answered by Moreaki

My take on this uses a pure Bash approach:

#!/usr/bin/env bash

dat=./stores.dat
del='|'
TOKENS=$(head -1 "${dat}" | tr $del ' ')
declare -a col=( $TOKENS )
declare -a max

skip=1
while IFS=$del read $TOKENS; do
    if [ $skip -eq 1 ]; then
        skip=0
        continue
    fi
    idx=0
    for tok in ${TOKENS}; do
        tokref=${!tok}
        printf "%-10s = %-16s[%2d] " "$tok" "${tokref}" "${#tokref}"
        echo "--> max=${max[$idx]} tokref=${#tokref}"
        #This works  : c=$a>$b?$a:$b
        #This doesn't: max[$idx]=${max[$idx]}>${#tokref}?${max[$idx]}:${#tokref}
        max[$idx]=$((${max[$idx]:=0}>${#tokref}?${max[$idx]}:${#tokref}))
        let idx++
    done
    printf "\n"
done < ${dat}

for ((idx=0; idx<${#col[@]}; idx++)); do
    printf "%-10s : %d\n" "${col[$idx]}" "${max[$idx]}"
done

The output is as follows:

sid        = 2tt             [ 3] --> max=0 tokref=3
storeNo    = 1               [ 1] --> max=0 tokref=1
latitude   = -28.0372000t0   [13] --> max=0 tokref=13
longitude  = 153.42921670    [12] --> max=0 tokref=12

sid        = 9               [ 1] --> max=3 tokref=1
storeNo    = 2t              [ 2] --> max=1 tokref=2
latitude   = -33tt.85t09t0000[16] --> max=13 tokref=16
longitude  = 15t1.03274200   [13] --> max=12 tokref=13

sid        : 3
storeNo    : 2
latitude   : 16
longitude  : 13

I've added this solution because I liked the challenge and had some minutes to spare.
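
For comparison, here is a rough sketch of the same pure-Bash idea without the debug output, reading each row into an indexed array instead of using indirect variable expansion. It assumes Bash (for read -a and arrays) and a file whose last line ends with a newline. The function name col_maxlen_bash is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer): per-column maximum field length
# for a delimited file, using only Bash builtins.
col_maxlen_bash() {
    local file=$1 del="${2:-|}"
    local -a names fields max
    local i
    {
        IFS=$del read -r -a names               # header row -> column names
        while IFS=$del read -r -a fields; do    # each data row -> fields array
            for i in "${!fields[@]}"; do
                if (( ${#fields[i]} > ${max[i]:-0} )); then
                    max[i]=${#fields[i]}
                fi
            done
        done
    } < "$file"
    for i in "${!names[@]}"; do
        printf '%s : %d\n' "${names[i]}" "${max[i]:-0}"
    done
}

# Sample data from the question, written to a temp file so the sketch runs on its own.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
sid|storeNo|latitude|longitude
2tt|1|-28.0372000t0|153.42921670
9|2t|-33tt.85t09t0000|15t1.03274200
EOF
col_maxlen_bash "$tmp"
rm -f "$tmp"
```

Reading into arrays avoids creating one shell variable per column name, so it does not depend on the header values being valid identifiers.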
