Python glob exclude pattern

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/20638040/

Time: 2020-08-18 20:54:28  Source: igfitidea

glob exclude pattern

python glob

Asked by Anastasios Andronidis

I have a directory with a bunch of files inside: eee2314, asd3442... and eph.

I want to exclude all files that start with eph with the glob function.

How can I do it?

Accepted answer by Kenly

The pattern rules for glob are not regular expressions. Instead, they follow standard Unix path expansion rules. There are only a few special characters: two different wild-cards, and character ranges are supported [from glob].

So you can exclude some files with patterns.
For example, to exclude manifest files (files starting with _) with glob, you can use:

files = glob.glob('files_path/[!_]*')
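
A note on the limits of this approach, as a minimal sketch: a `[!...]` class tests exactly one character position, so it suits a single-character prefix such as `_` but cannot express "does not start with eph" in one pattern. The temporary directory and the file names below are made up for illustration.

```python
import glob
import os
import tempfile

# Hypothetical file names, created in a throwaway directory for the demo.
names = ["eee2314", "asd3442", "eph", "eph001", "_manifest"]

with tempfile.TemporaryDirectory() as tmp:
    for name in names:
        open(os.path.join(tmp, name), "w").close()

    base = glob.escape(tmp)  # guard against special characters in the temp path

    # [!_] excludes names whose first character is "_".
    no_underscore = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(base, "[!_]*"))
    )

    # [!e] excludes every name starting with "e", not only the "eph" ones.
    no_e = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(base, "[!e]*"))
    )

print(no_underscore)
print(no_e)
```

With `[!e]*`, the file eee2314 is dropped along with the eph files, which is why a multi-character prefix usually needs filtering after the glob call.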

Answer by Martijn Pieters

You can't exclude patterns with the glob function; globs only allow for inclusion patterns. Globbing syntax is very limited (even a [!..] character class must match a character, so it is an inclusion pattern for every character that is not in the class).

You'll have to do your own filtering; a list comprehension usually works nicely here:

import os
from glob import glob
files = [fn for fn in glob('somepath/*.txt')
         if not os.path.basename(fn).startswith('eph')]

Answer by neutrinus

You can subtract sets:

set(glob("*")) - set(glob("eph*"))
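
Since a set has no defined order, the result of the subtraction may come back in any order. A minimal sketch that sorts the result, assuming a hypothetical helper name `glob_excluding` and made-up file names in a temporary directory:

```python
import glob
import os
import tempfile


def glob_excluding(directory, include="*", exclude="eph*"):
    """Return sorted paths matching `include` but not `exclude` (hypothetical helper)."""
    base = glob.escape(directory)
    included = set(glob.glob(os.path.join(base, include)))
    excluded = set(glob.glob(os.path.join(base, exclude)))
    # Sets are unordered, so sort for a reproducible result.
    return sorted(included - excluded)


with tempfile.TemporaryDirectory() as tmp:
    for name in ["eee2314", "asd3442", "eph", "eph001"]:
        open(os.path.join(tmp, name), "w").close()
    kept = [os.path.basename(p) for p in glob_excluding(tmp)]

print(kept)
```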

Answer by Lord Henry Wotton

More generally, to exclude files that don't match some shell-style wildcard pattern, you could use the fnmatch module:

import fnmatch
from glob import glob

file_list = glob('somepath')
# Build a new list instead of popping while iterating, which would skip entries
file_list = [ii for ii in file_list
             if fnmatch.fnmatch(ii, 'bash_regexp_with_exclude')]

The above will first generate a list from a given path and then drop the files that don't match the shell-style pattern carrying the desired constraint.
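
For the question's case, the test can be inverted so that files matching a pattern are dropped. This is a minimal sketch, assuming a hypothetical helper name `exclude_matching` and made-up paths; it matches on the base name so that a directory prefix does not interfere.

```python
import fnmatch
import os


def exclude_matching(paths, pattern):
    """Keep the paths whose base name does NOT match the shell-style pattern."""
    return [
        p for p in paths
        if not fnmatch.fnmatchcase(os.path.basename(p), pattern)
    ]


# Hypothetical paths, standing in for the result of a glob call.
paths = ["somepath/eee2314", "somepath/asd3442", "somepath/eph", "somepath/eph001"]
kept = exclude_matching(paths, "eph*")
print(kept)
```

`fnmatch.fnmatchcase` is used here so that the comparison is case-sensitive on every platform; `fnmatch.fnmatch` would follow the operating system's case rules instead.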

Answer by K Raphael

Late to the game, but you could alternatively just apply a Python filter to the result of a glob:

import glob

files = glob.iglob('your_path_here')
files_i_care_about = filter(lambda x: not x.startswith("eph"), files)

or replacing the lambda with an appropriate regex search, etc...

EDIT: I just realized that if you're using full paths the startswith won't work, so you'd need a regex

In [10]: a
Out[10]: ['/some/path/foo', 'some/path/bar', 'some/path/eph_thing']

In [11]: filter(lambda x: not re.search('/eph', x), a)
Out[11]: ['/some/path/foo', 'some/path/bar']
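
In Python 3, `filter` returns a lazy iterator rather than a list, so the session above would show a filter object unless the result is wrapped in `list()`. As an alternative to the regex, testing the base name also handles full paths. A minimal sketch reusing the example list:

```python
import os

a = ['/some/path/foo', 'some/path/bar', 'some/path/eph_thing']

# filter() is lazy in Python 3, so wrap it in list() to materialise the result.
kept = list(filter(lambda p: not os.path.basename(p).startswith("eph"), a))
print(kept)
```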

Answer by Ryan Farber

As mentioned by the accepted answer, you can't exclude patterns with glob, so the following is a method to filter your glob result.

The accepted answer is probably the most Pythonic way to do this, but if you think list comprehensions look a bit ugly and want to make your code maximally numpythonic anyway (like I did), you can do the following (note that this is probably less efficient than the list comprehension method):

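import glob
import numpy as np

data_files = glob.glob("path_to_files/*.fits")

# Use the same directory prefix so the paths compare equal to those in data_files
light_files = np.setdiff1d( data_files, glob.glob("path_to_files/*BIAS*"))
light_files = np.setdiff1d(light_files, glob.glob("path_to_files/*FLAT*"))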

(In my case, I had some image frames, bias frames, and flat frames all in one directory and I just wanted the image frames)

Answer by Scott Ming

Compared with glob, I recommend pathlib; filtering on one pattern is very simple.

from pathlib import Path

p = Path(YOUR_PATH)
filtered = [x for x in p.glob("**/*") if not x.name.startswith("eph")]

If you want to filter on a more complex pattern, you can define a function to do that, like this:

def not_in_pattern(x):
    return (not x.name.startswith("eph")) and not x.name.startswith("epi")


filtered = [x for x in p.glob("**/*") if not_in_pattern(x)]

Using that code, you can filter out all files whose names start with eph or epi.
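
When the condition is only a set of prefixes, `str.startswith` also accepts a tuple, which avoids chaining several tests. A minimal sketch, assuming made-up file names in a temporary directory and a hypothetical constant name `EXCLUDED_PREFIXES`:

```python
import tempfile
from pathlib import Path

# Hypothetical prefixes, matching the eph/epi example above.
EXCLUDED_PREFIXES = ("eph", "epi")

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["eph1", "epi2", "eee2314", "asd3442"]:
        (root / name).touch()

    # str.startswith returns True if the name starts with any prefix in the tuple.
    filtered = sorted(
        x.name for x in root.glob("**/*")
        if not x.name.startswith(EXCLUDED_PREFIXES)
    )

print(filtered)
```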

Answer by KK2491

You can use the below method:

import glob

# Get all the files
allFiles = glob.glob("*")
# Files starting with eph
ephFiles = glob.glob("eph*")
# Files which don't start with eph
noephFiles = []
for file in allFiles:
    if file not in ephFiles:
        noephFiles.append(file)
# noephFiles has all the files which don't start with eph.

Thank you.  

Answer by Azhar Ansari

How about skipping the particular files while iterating over all the files in the folder? The code below skips all Excel files whose names start with 'eph':

import glob
import re
for file in glob.glob('*.xlsx'):
    # Raw string so that \. reaches the regex engine as an escaped dot
    if re.match(r'eph.*\.xlsx', file):
        continue
    else:
        # do your stuff here
        print(file)

This way you can use more complex regex patterns to include/exclude a particular set of files in a folder.
