Linux: How can I find all of the distinct file extensions in a folder hierarchy?
Original URL: http://stackoverflow.com/questions/1842254/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How can I find all of the distinct file extensions in a folder hierarchy?
Asked by GloryFish
On a Linux machine I would like to traverse a folder hierarchy and get a list of all of the distinct file extensions within it.
What would be the best way to achieve this from a shell?
Accepted answer by Ivan Nevostruev
Try this (not sure if it's the best way, but it works):
find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u
It works as follows:
- Finds all files under the current folder
- Prints the extension of each file, if it has one
- Makes a unique sorted list
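The three steps above can also be sketched in Python. This is only an illustration, not part of the original answer: the function name `distinct_extensions` is invented here, and the regex mirrors the one in the Perl one-liner (a final dot followed by characters that are neither dots nor slashes).

```python
import os
import re

# Same idea as the Perl pattern: a dot, then no dots or slashes up to the end.
EXT_RE = re.compile(r"\.([^./]+)$")

def distinct_extensions(root):
    """Return a sorted list of the distinct extensions under root (hypothetical helper)."""
    found = set()
    # Step 1: visit every file below root.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Step 2: keep the extension, if the path has one.
            match = EXT_RE.search(os.path.join(dirpath, name))
            if match:
                found.add(match.group(1))
    # Step 3: unique (via the set) and sorted.
    return sorted(found)
```

Calling `distinct_extensions(".")` should roughly correspond to running the pipeline from the current folder. Note that a dotfile such as `.bashrc` is reported as the extension `bashrc`, as it is with the Perl pattern.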
Answer by ChristopheD
Recursive version:
find . -type f | sed -e 's/.*\.//' | sed -e 's/.*\///' | sort -u
If you want totals (how many times each extension was seen):
find . -type f | sed -e 's/.*\.//' | sed -e 's/.*\///' | sort | uniq -c | sort -rn
Non-recursive (single folder):
for f in *.*; do printf "%s\n" "${f##*.}"; done | sort -u
I based this on this forum post; credit should go there.
Answer by user224243
Find everything with a dot and show only the suffix.
find . -type f -name "*.*" | awk -F. '{print $NF}' | sort -u
If you know that all suffixes have 3 characters, then:
find . -type f -name "*.???" | awk -F. '{print $NF}' | sort -u
Or, with sed, show all suffixes of one to four characters. Change {1,4} to the range of suffix lengths you expect.
find . -type f | sed -n 's/.*\.\(.\{1,4\}\)$/\1/p' | sort -u
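To make the effect of the {1,4} bound concrete, here is a small Python sketch of the same kind of filter. It is an illustration under assumptions, not the answer's code: the helper name `bounded_suffix` and its `lo`/`hi` parameters are invented here, and the character class excludes `/` as well as `.` so that a directory part is never mistaken for a suffix, which is slightly stricter than the sed pattern.

```python
import re

def bounded_suffix(path, lo=1, hi=4):
    """Return the suffix after the last dot if its length is within [lo, hi], else None."""
    # [^./] keeps the match inside the final path component (an assumption of this sketch).
    match = re.search(r"\.([^./]{%d,%d})$" % (lo, hi), path)
    return match.group(1) if match else None
```

For example, `bounded_suffix("./a/b.txt")` gives `txt`, while a longer suffix such as the one in `./a/b.properties` is rejected with the default bounds.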
Answer by ChristopheD
Since there's already another solution which uses Perl:
If you have Python installed you could also do (from the shell):
python -c "import os;e=set();[[e.add(os.path.splitext(f)[-1]) for f in fn]for _,_,fn in os.walk('/home')];print('\n'.join(e))"
Answer by ChristopheD
None of the replies so far deal with filenames with newlines properly (except for ChristopheD's, which just came in as I was typing this). The following is not a shell one-liner, but works, and is reasonably fast.
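import os, sys

def names(roots):
    # Yield the basename of every file under each root directory.
    for root in roots:
        for a, b, basenames in os.walk(root):
            for basename in basenames:
                yield basename

# Collect the distinct suffixes (each includes its leading dot).
sufs = set(os.path.splitext(x)[1] for x in names(sys.argv[1:]))
for suf in sufs:
    if suf:
        print(suf)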
Answer by Simon R
PowerShell:
dir -recurse | select-object extension -unique
Thanks to http://kevin-berridge.blogspot.com/2007/11/windows-powershell.html
Answer by SiegeX
No need for the pipe to sort, awk can do it all:
find . -type f | awk -F. '!a[$NF]++{print $NF}'
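The awk idiom `!a[$NF]++` prints a field only the first time it is seen, so the output is unique but in first-seen order rather than sorted. A rough Python equivalent is sketched below, assuming the paths arrive one per item as they would from find; the name `first_seen_suffixes` is invented for this sketch. As with the awk version, a path whose only dot is the leading `./` yields the rest of the path rather than an extension.

```python
def first_seen_suffixes(paths):
    """Yield the text after the last dot of each path, once each, in first-seen order."""
    seen = set()
    for path in paths:
        suffix = path.rsplit(".", 1)[-1]  # like awk -F. '{print $NF}'
        if suffix not in seen:            # like !a[$NF]++
            seen.add(suffix)
            yield suffix
```

Wrapping the result in `sorted(...)` would give the same ordering as piping through sort.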
Answer by Andres Restrepo
In Python using generators for very large directories, including blank extensions, and getting the number of times each extension shows up:
import json
import collections
import itertools
import os

root = '/home/andres'
# Lazily chain the per-directory file lists into one stream of names.
files = itertools.chain.from_iterable((
    files for _,_,files in os.walk(root)
))
# Count each extension; files without one are counted under ''.
counter = collections.Counter(
    (os.path.splitext(file_)[1] for file_ in files)
)
print(json.dumps(counter, indent=2))
Answer by jrock2004
You could also do this:
find . -type f -name "*.php" -exec PATHTOAPP {} +
Answer by gkb0986
Adding my own variation to the mix. I think it's the simplest of the lot and can be useful when efficiency is not a big concern.
find . -type f | grep -o -E '\.[^\.]+$' | sort -u