Bash: Find the file with the largest number of lines

Question

Asked by Marek Sebera

This is my attempt:

Find all *.java files
find . -name '*.java'
Count lines
wc -l
Delete the last line
sed '$d'
Use AWK to find the maximum line count in the wc output
awk 'max=="" || data=="" || $1 > max {max=$1 ; data=$2} END{ print max " " data}'

Then merge it into a single line:

find . -name '*.java' | xargs wc -l | sed '$d' | awk 'max=="" || data=="" || $1 > max {max=$1 ; data=$2} END{ print max " " data}'

Can I somehow implement counting just non-blank lines?

Answer 1

Answered by Shawn Chin

find . -type f -name "*.java" -exec grep -H -c '[^[:space:]]' {} \; | \
    sort -nr -t":" -k2 | awk -F: '{print $1; exit;}'

Replace the awk command with head -n1 if you also want to see the number of non-blank lines.

Breakdown of the command:

find . -type f -name "*.java" -exec grep -H -c '[^[:space:]]' {} \; 
'---------------------------'       '-----------------------'
             |                                   |
   for each *.java file             Use grep to count non-empty lines
                                   -H includes filenames in the output
                                 (output = ./full/path/to/file.java:count)

| sort -nr -t":" -k2  | awk -F: '{print $1; exit;}'
  '----------------'    '-------------------------'
          |                            |
  Sort the output in         Print filename of the first entry (largest count)
reverse order using the         then exit immediately
  second column (count)
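
To see the breakdown in action, here is a minimal sketch that runs the same pipeline against a throwaway directory. The sandbox directory and the file names A.java and B.java are made up for illustration, and it assumes a grep that supports -H (GNU and BSD grep do).

```shell
#!/usr/bin/env bash
# Hypothetical sandbox: directory and file names are invented for this demo.
demo=$(mktemp -d)
printf 'class A {\n\n}\n' > "$demo/A.java"                 # 2 non-blank lines
printf 'class B {\n  int x;\n\n  \n}\n' > "$demo/B.java"   # 3 non-blank lines

# Same find/grep/sort stages as above, pointed at the sandbox.
counts=$(find "$demo" -type f -name "*.java" -exec grep -H -c '[^[:space:]]' {} \; |
    sort -nr -t":" -k2)

# All path:count pairs, largest count first.
echo "$counts"

# Keep only the filename of the first (largest) entry.
largest=$(printf '%s\n' "$counts" | awk -F: '{print $1; exit}')
echo "largest: $largest"

rm -rf "$demo"
```

The whitespace-only line in B.java is not counted, because the pattern requires at least one non-space character.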

Answer 2

Answered by Vijay

find . -name "*.java" -type f | xargs wc -l | sort -rn | grep -v ' total$' | head -1
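
Note that this counts all lines, blank ones included, and that xargs splits names on whitespace by default. The following is a hedged variation, not part of the original answer, that passes names null-delimited so a path containing spaces survives. It assumes find and xargs support -print0 and -0 (GNU and BSD versions do); the sandbox and file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sandbox with a file name that contains a space.
demo=$(mktemp -d)
printf '1\n2\n3\n' > "$demo/With Space.java"   # 3 lines
printf '1\n' > "$demo/One.java"                # 1 line

# -print0 / -0 keep each path as a single argument to wc.
top=$(find "$demo" -name "*.java" -type f -print0 |
    xargs -0 wc -l | sort -rn | grep -v ' total$' | head -n 1)

echo "$top"
rm -rf "$demo"
```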

Answer 3

Answered by holygeek

Something like this might work:

find . -name '*.java' | while IFS= read -r filename; do
    # Count the lines that are not empty or whitespace-only
    nlines=$(grep -v -E '^[[:space:]]*$' "$filename" | wc -l)
    echo "$nlines $filename"
done | sort -nr | head -1

(edited as per Ed Morton's comment. I must have had too much coffee :-) )

Answer 4

Answered by Ed Morton

Getting the size (line count) of all of your files using awk is just:

$ find . -name '*.java' -print0 | xargs -0 awk '
BEGIN { for (i=1;i<ARGC;i++) size[ARGV[i]]=0 }
{ size[FILENAME]++ }
END { for (file in size) print size[file], file }
'

To get the count of the non-empty lines, simply make the line where you increment the size[] conditional:

$ find . -name '*.java' -print0 | xargs -0 awk '
BEGIN { for (i=1;i<ARGC;i++) size[ARGV[i]]=0 }
NF { size[FILENAME]++ }
END { for (file in size) print size[file], file }
'

(With NF, lines that contain only blanks are treated as empty. If you want to count them as non-empty instead, replace NF with /^./.)
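
A quick way to check how the two awk conditions treat a line made up only of spaces is to run both on a small sample. This is a minimal sketch; the sample file and its contents are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample: two lines of text, one empty line, one spaces-only line.
sample=$(mktemp)
printf 'int x;\n\n   \nint y;\n' > "$sample"

# NF is 0 for empty and for spaces-only lines, so both are skipped.
nf_count=$(awk 'NF { n++ } END { print n + 0 }' "$sample")

# /^./ matches any line with at least one character, spaces included.
dot_count=$(awk '/^./ { n++ } END { print n + 0 }' "$sample")

echo "NF: $nf_count  /^./: $dot_count"   # prints "NF: 2  /^./: 3"
rm -f "$sample"
```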

To get only the file with the most non-empty lines just tweak again:

$ find . -name '*.java' -print0 | xargs -0 awk '
BEGIN { for (i=1;i<ARGC;i++) size[ARGV[i]]=0 }
NF { size[FILENAME]++ }
END {
   for (file in size) {
      if (size[file] >= maxSize) {
         maxSize = size[file]
         maxFile = file
      }
   }
   print maxSize, maxFile
}
'
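
One caveat worth keeping in mind: with a very long file list, xargs may split the names across several awk invocations, and each invocation then prints its own maximum. A hedged sketch of one way to cope, not part of the original answer, is to sort the per-invocation results and keep the top one. The sandbox and file names below are hypothetical, and xargs -n 1 is used only to force several awk runs for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical sandbox with made-up file names.
demo=$(mktemp -d)
printf 'a\n\nb\n' > "$demo/Small.java"        # 2 non-blank lines
printf 'a\nb\n\nc\nd\n' > "$demo/Big.java"    # 4 non-blank lines
: > "$demo/Empty.java"                        # no lines at all

# -n 1 runs awk once per file, imitating xargs splitting a long argument list.
result=$(find "$demo" -name '*.java' -print0 | xargs -0 -n 1 awk '
BEGIN { for (i=1;i<ARGC;i++) size[ARGV[i]]=0 }
NF { size[FILENAME]++ }
END {
   for (file in size) {
      if (size[file] >= maxSize) { maxSize = size[file]; maxFile = file }
   }
   print maxSize, maxFile
}
' | sort -nr | head -n 1)   # pick the overall winner across awk runs

echo "$result"
rm -rf "$demo"
```

When everything fits in a single awk invocation, the extra sort and head stages simply pass the one result line through.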
