Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/4568580/

Time: 2020-08-18 16:22:24  Source: igfitidea

Python glob multiple filetypes

Tags: python, glob

Asked by Raptrex

Is there a better way to use glob.glob in python to get a list of multiple file types such as .txt, .mdown, and .markdown? Right now I have something like this:

projectFiles1 = glob.glob( os.path.join(projectDir, '*.txt') )
projectFiles2 = glob.glob( os.path.join(projectDir, '*.mdown') )
projectFiles3 = glob.glob( os.path.join(projectDir, '*.markdown') )

Accepted answer by user225312

Maybe there is a better way, but how about:

>>> import glob
>>> types = ('*.pdf', '*.cpp') # the tuple of file types
>>> files_grabbed = []
>>> for files in types:
...     files_grabbed.extend(glob.glob(files))
... 
>>> files_grabbed   # the list of pdf and cpp files

Perhaps there is another way, so wait in case someone else comes up with a better answer.
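For a specific directory, the same loop can also be written with pathlib. This is a minimal sketch and not part of the original answer; the function name glob_many and the patterns in the usage lines are assumptions chosen for illustration.

```python
from pathlib import Path


def glob_many(directory, patterns):
    """Return sorted, de-duplicated paths in `directory` matching any pattern."""
    base = Path(directory)
    matches = set()
    for pattern in patterns:
        # Path.glob yields Path objects for entries matching one pattern
        matches.update(base.glob(pattern))
    return sorted(matches)


if __name__ == "__main__":
    for path in glob_many(".", ("*.txt", "*.mdown", "*.markdown")):
        print(path)
```

Collecting into a set means a file that matches two overlapping patterns is reported only once.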

Answer by Christian

With a single glob pattern it is not possible. glob supports only these wildcards:
* matches everything
? matches any single character
[seq] matches any character in seq
[!seq] matches any character not in seq
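The wildcards listed above can be tried without touching the filesystem, because the standard fnmatch module implements the same shell-style syntax. The sketch below is an illustration only: the helper name matches_any and the sample file names are assumptions, and testing a name against several patterns is one way around the lack of an "or" operator.

```python
from fnmatch import fnmatchcase


def matches_any(name, patterns):
    """True if `name` matches at least one shell-style pattern."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


# * matches any run of characters, so both extensions are accepted
print(matches_any("notes.txt", ["*.txt", "*.sql"]))   # True
# ? matches exactly one character, so a two-letter stem is rejected
print(matches_any("ab.md", ["?.md"]))                 # False
# [seq] matches one listed character, giving case-insensitive extensions
print(matches_any("photo.JPG", ["*.[jJ][pP][gG]"]))   # True
# [!seq] matches one character that is NOT listed
print(matches_any("data.csv", ["*.[!c]sv"]))          # False
```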

Use os.listdir and a regexp to check patterns:

import os
import re

for x in os.listdir('.'):
    if re.match(r'.*\.(txt|sql)$', x):  # name must end in .txt or .sql
        print(x)
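A variation on the loop above builds the regular expression from a list of extensions, so the pattern does not have to be edited by hand. This is a sketch under assumptions: the function names extension_regex and list_by_extension are invented here, and extensions are passed without the leading dot.

```python
import os
import re


def extension_regex(extensions):
    """Compile a regex matching names that end in one of `extensions`."""
    # re.escape protects extensions that contain regex metacharacters
    alternatives = "|".join(re.escape(ext) for ext in extensions)
    return re.compile(r".*\.(?:%s)" % alternatives)


def list_by_extension(directory, extensions):
    """Return sorted names in `directory` whose extension is listed."""
    pattern = extension_regex(extensions)
    # fullmatch anchors both ends, so 'a.txt.bak' is not accepted
    return sorted(n for n in os.listdir(directory) if pattern.fullmatch(n))


if __name__ == "__main__":
    print(list_by_extension(".", ["txt", "sql"]))
```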

Answer by tzot

Chain the results:

import itertools as it, glob

def multiple_file_types(*patterns):
    return it.chain.from_iterable(glob.iglob(pattern) for pattern in patterns)

Then:

for filename in multiple_file_types("*.txt", "*.sql", "*.log"):
    print(filename)  # do stuff

Answer by thegauraw

You can build the list manually by comparing the extension of each existing file with the ones you require.

import glob

ext_list = ['gif', 'jpg', 'jpeg', 'png']
file_list = []
for file in glob.glob('*.*'):
    # '*.*' guarantees a dot, so rsplit always yields an extension
    if file.rsplit('.', 1)[1] in ext_list:
        file_list.append(file)

Answer by Andrew Alcock

I have released Formic, which implements multiple includes in a similar way to Apache Ant's FileSet and Globs.

The search can be implemented like this:

import formic
patterns = ["*.txt", "*.markdown", "*.mdown"]
fileset = formic.FileSet(directory=projectDir, include=patterns)
for file_name in fileset.qualified_files():
    print(file_name)  # Do something with file_name

Because the full Ant glob is implemented, you can include different directories with each pattern, so you could choose only those .txt files in one subdirectory, and the .markdown in another, for example:

patterns = [ "/unformatted/**/*.txt", "/formatted/**/*.mdown" ]
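For comparison, the standard library can approximate per-directory patterns like these: since Python 3.5, glob.glob accepts recursive=True, which lets ** match any number of directory levels. The sketch below is not Formic and does not reproduce its semantics; the helper name glob_tree is an assumption, and the patterns are written relative to the root (no leading slash) because os.path.join would otherwise discard the root.

```python
import glob
import os


def glob_tree(root, patterns):
    """Collect files under `root` for patterns that may contain '**'."""
    found = []
    for pattern in patterns:
        # recursive=True lets '**' match zero or more directory levels
        found.extend(glob.glob(os.path.join(root, pattern), recursive=True))
    return sorted(found)


if __name__ == "__main__":
    for name in glob_tree(".", ["unformatted/**/*.txt", "formatted/**/*.mdown"]):
        print(name)
```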

I hope this helps.

Answer by joemaller

Not glob, but here's another way using a list comprehension:

extensions = 'txt mdown markdown'.split()
projectFiles = [f for f in os.listdir(projectDir) 
                  if os.path.splitext(f)[1][1:] in extensions]
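A related filter relies on str.endswith, which accepts a tuple of suffixes. This is a sketch rather than the answer's own method: the helper name has_extension and the sample names are assumptions, and the comparison is made case-insensitive as a design choice.

```python
def has_extension(name, extensions):
    """True if `name` ends with '.' plus one of `extensions` (case-insensitive)."""
    suffixes = tuple("." + ext.lower() for ext in extensions)
    # str.endswith accepts a tuple and checks each suffix in turn
    return name.lower().endswith(suffixes)


names = ["README.markdown", "notes.TXT", "script.py", "markdown"]
print([n for n in names if has_extension(n, ["txt", "mdown", "markdown"])])
# prints ['README.markdown', 'notes.TXT']
```

Unlike os.path.splitext, this check also accepts a name that consists only of the suffix, such as a file called ".txt".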

Answer by Tim Fuller

The following function _glob globs for multiple file extensions.

import glob
import os
def _glob(path, *exts):
    """Glob for multiple file extensions

    Parameters
    ----------
    path : str
        A file name without extension, or directory name
    exts : tuple
        File extensions to glob for

    Returns
    -------
    files : list
        list of files matching extensions in exts in path

    """
    path = os.path.join(path, "*") if os.path.isdir(path) else path + "*"
    return [f for files in [glob.glob(path + ext) for ext in exts] for f in files]

files = _glob(projectDir, ".txt", ".mdown", ".markdown")

Answer by user2363986

from glob import glob

files = glob('*.gif')
files.extend(glob('*.png'))
files.extend(glob('*.jpg'))

print(files)

If you need to specify a path, loop over match patterns and keep the join inside the loop for simplicity:

from os.path import join
from glob import glob

files = []
for ext in ('*.gif', '*.png', '*.jpg'):
   files.extend(glob(join("path/to/dir", ext)))

print(files)

Answer by jdnoon

This should work:

import glob
extensions = ('*.txt', '*.mdown', '*.markdown')
for i in extensions:
    for files in glob.glob(i):
        print(files)

Answer by Hans Goldman

After coming here for help, I made my own solution and wanted to share it. It's based on user2363986's answer, but I think this is more scalable: even if you have 1000 extensions, the code will still look somewhat elegant.

from glob import glob

directoryPath  = r"C:\temp\*."   # raw string, so "\t" is not read as a tab
fileExtensions = [ "jpg", "jpeg", "png", "bmp", "gif" ]
listOfFiles    = []

for extension in fileExtensions:
    listOfFiles.extend( glob( directoryPath + extension ))

for file in listOfFiles:
    print(file)   # Or do other stuff