Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/9843828/

Time: 2020-09-18 01:50:58. Source: igfitidea.

OS X Find in bash with regex digits \d not producing expected results

Tags: regex, macos, bash, terminal, find

Asked by juliushibert

I'm using the following regex find command in the OS X terminal to find a whole load of files that have 8-digit file names followed by a .jpg, .gif, .png or .eps extension. The following produces no results, even though I've told OS X/BSD find to use modern regex:

find -E ./ -iregex '\d{8}'

Using http://rubular.com/ (http://rubular.com/r/YMz3J8Qlgh) shows that the regex pattern produces the expected results, and OS X does produce results when I type:

find . -iname '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].*'

But this seems a little long-winded.

Answered by Hymanjr300

These commands work on OS X:

find -E . -iregex '.*/[0-9]{8}\.(jpg|png|eps|gif)'

This command matches 12345678.jpg but not 123456789.jpg.

find -E . -iregex '.*/[0-9]{8,}\.(jpg|png|eps|gif)'

This command matches both 12345678.jpg and 123456789.jpg.

The leading .*/ matches the folder path or the subfolder path that precedes the file name.
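
The difference between {8} and {8,} can be checked on a scratch directory. The following is a minimal sketch, not part of the original answer: the ere_find helper, the demo directory and the sample file names are all assumptions made for illustration. The helper only exists so that the same patterns run on BSD/macOS find (-E) and on GNU find (-regextype posix-extended).

```shell
# BSD/macOS find spells extended regex as -E before the path;
# GNU find spells it -regextype posix-extended after the path.
# ere_find is a hypothetical helper that hides that difference.
if find -E . -prune >/dev/null 2>&1; then
  ere_find() { dir=$1; shift; find -E "$dir" "$@"; }
else
  ere_find() { dir=$1; shift; find "$dir" -regextype posix-extended "$@"; }
fi

demo=$(mktemp -d)                 # scratch directory with made-up file names
mkdir "$demo/sub"
touch "$demo/12345678.jpg" "$demo/sub/87654321.PNG" \
      "$demo/123456789.jpg" "$demo/1234567.gif" "$demo/12345678.txt"

# Exactly 8 digits, then one of the listed extensions (case-insensitive).
exact=$(ere_find "$demo" -iregex '.*/[0-9]{8}\.(jpg|png|eps|gif)' | sort)

# 8 or more digits, then one of the listed extensions.
atleast=$(ere_find "$demo" -iregex '.*/[0-9]{8,}\.(jpg|png|eps|gif)' | sort)

printf 'exactly 8 digits:\n%s\n\n8 or more digits:\n%s\n' "$exact" "$atleast"
rm -rf "$demo"
```

Because the pattern has to cover the whole path, the .*/ prefix is what lets the file in sub/ match as well.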

Answered by ugn

With all your answers, I was finally able to use OS X find (10.8.1) with regex. To give back, here are my findings. We use custom strings to identify clips, and the pattern goes like this: "YYMMDDabc##abc*.ext", i.e. year/month/day/3 chars/2 digits/3 chars/whatever/ext.

find -E /path/to/folder -type f -regex '^/.*/[0-9]{6}[A-Za-z]{3}[0-9]{2}[A-Za-z0-9]{3}\.*.*\.(ext)$'

The initial ^ makes sure the pattern is at the beginning of the search. [0-9]{6} searches for a 6-digit string; \d doesn't work. \D doesn't work for letters; A-Za-z does. The $ at the end makes sure the last search is the end of the string.

After reading Apple's man pages for find and re_format, I was completely off track regarding escaping characters.
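
To see which names the clip pattern accepts, here is a small sketch that runs it against a scratch directory. The pattern is copied from the answer unchanged; the ere_find helper and the sample file names are hypothetical, and the sketch assumes mktemp -d returns an absolute path so that the leading ^/ can match.

```shell
# BSD/macOS find spells extended regex as -E before the path;
# GNU find spells it -regextype posix-extended after the path.
# ere_find is a hypothetical helper that hides that difference.
if find -E . -prune >/dev/null 2>&1; then
  ere_find() { dir=$1; shift; find -E "$dir" "$@"; }
else
  ere_find() { dir=$1; shift; find "$dir" -regextype posix-extended "$@"; }
fi

demo=$(mktemp -d)   # assumed to be an absolute path
touch "$demo/120823abc01xyz_take1.ext" "$demo/120823abc01xyz.ext" \
      "$demo/120823ab01xyz.ext" "$demo/notes.ext"

# 6 digits, 3 letters, 2 digits, 3 alphanumerics, anything, then .ext
pattern='^/.*/[0-9]{6}[A-Za-z]{3}[0-9]{2}[A-Za-z0-9]{3}\.*.*\.(ext)$'
clips=$(ere_find "$demo" -type f -regex "$pattern" | sort)

printf '%s\n' "$clips"
rm -rf "$demo"
```

Only the two names with a full 3-letter block after the date are listed; 120823ab01xyz.ext and notes.ext are skipped.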

Answered by jdi

man re_format explains the specifics of the modern regex that find will accept.

This works for me: -iregex '[0-9]{8}'

Answered by Wray Bowling

This has been a very eye-opening thread. I'm bringing to the table a solution to my own problem and hopefully clarifying a thing or two for you and other users looking for robustness (like I was).

In my case, my Mac had a bunch of duplicate photos. When Macs make duplicates, they append a space and a number to the end of the name, before the extension.

IMG_0001.JPG might be accompanied by copies named IMG_0001 2.JPG, IMG_0001 3.JPG, and so on. In my case, this went on and on, adding up to about 2,600 useless files.

To get started, I navigated to the folder in question:

cd ~/Pictures/

Next, let's prove to ourselves that we can list all the files in the directory. You'll notice that in the regex it's necessary to include the . that says "look in this directory". Also, you have to match the whole file name, so the .+ is necessary to catch all the other characters:

find -E . -regex '\..+'

Appropriately, the results show the strings that you'll have to match, including the . I mentioned earlier, the slash /, and everything else:

./IMG_1788.JPG
./IMG_1789.JPG
./IMG_1790.JPG
./IMG_1791.JPG

So I can't write this to find duplicates, because it doesn't include the "./":

find -E . -regex 'IMG_[0-9]{4} .+'

But I can write this to find duplicates, because it does include the "./":

find -E . -regex '\./IMG_[0-9]{4} .+'

Or the fancier version with .*/, as mentioned by @Hymanjr300, does the same thing:

find -E . -regex '.*/IMG_[0-9]{4} .+'
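
The three variants can be compared side by side on a scratch directory. This is only a sketch under assumptions: the ere_find helper and the sample file names are made up for illustration, and the helper is there so the commands run on both BSD/macOS find and GNU find.

```shell
# BSD/macOS find spells extended regex as -E before the path;
# GNU find spells it -regextype posix-extended after the path.
# ere_find is a hypothetical helper that hides that difference.
if find -E . -prune >/dev/null 2>&1; then
  ere_find() { dir=$1; shift; find -E "$dir" "$@"; }
else
  ere_find() { dir=$1; shift; find "$dir" -regextype posix-extended "$@"; }
fi

demo=$(mktemp -d)   # scratch directory with made-up photo names
touch "$demo/IMG_0001.JPG" "$demo/IMG_0001 2.JPG" "$demo/IMG_0001 3.JPG"

# Without the leading "./" the pattern cannot cover the whole path.
bare=$(cd "$demo" && ere_find . -regex 'IMG_[0-9]{4} .+')

# With "\./" or ".*/" in front, the duplicates are found.
anchored=$(cd "$demo" && ere_find . -regex '\./IMG_[0-9]{4} .+' | sort)
anywhere=$(cd "$demo" && ere_find . -regex '.*/IMG_[0-9]{4} .+' | sort)

printf 'bare: [%s]\nanchored:\n%s\nanywhere:\n%s\n' "$bare" "$anchored" "$anywhere"
rm -rf "$demo"
```

The first command prints nothing, while the other two list the same two duplicates.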

Lastly is the confusing part. \d isn't recognized in BSD. [0-9] works just as well. Other users' answers cited the re_format manual, which lists how to write common patterns, replacing things like \d with a funny square-colon syntax that looks like this: [:digit:]. I tried and tried, but it never worked. Just use [0-9]. In my case, I wasted a bunch of time thinking I should have used [:space:] instead of a space, but I found (as usual!) that I just needed to breathe and really read the regex. It turned out to be my mistake. :)
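
One possible explanation for the trouble with [:digit:], offered as an assumption about what was tried rather than something the answer states: POSIX character classes are only meaningful inside a bracket expression, so the working spelling is [[:digit:]] with two pairs of brackets. A lone [:digit:] is read as an ordinary bracket expression containing the characters ":", "d", "i", "g" and "t". The sketch below uses a hypothetical ere_find helper and made-up file names to compare the two spellings.

```shell
# BSD/macOS find spells extended regex as -E before the path;
# GNU find spells it -regextype posix-extended after the path.
# ere_find is a hypothetical helper that hides that difference.
if find -E . -prune >/dev/null 2>&1; then
  ere_find() { dir=$1; shift; find -E "$dir" "$@"; }
else
  ere_find() { dir=$1; shift; find "$dir" -regextype posix-extended "$@"; }
fi

demo=$(mktemp -d)   # scratch directory with made-up photo names
touch "$demo/IMG_0001.JPG" "$demo/IMG_0001 2.JPG" "$demo/IMG_0001 3.JPG"

# Character classes nested inside a bracket expression: [[:digit:]], [[:space:]]
nested=$(cd "$demo" && ere_find . \
  -regex '\./IMG_[[:digit:]]{4}[[:space:]][[:digit:]]+\.JPG' | sort)

# Single brackets: no digits are matched, so nothing is found.
single=$(cd "$demo" && ere_find . -regex '\./IMG_[:digit:]{4} .+' 2>/dev/null)

printf 'nested:\n%s\nsingle: [%s]\n' "$nested" "$single"
rm -rf "$demo"
```

Using [0-9] and a literal space, as the answer recommends, remains the shorter option for file names like these.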

Hope this helps someone!

Answered by Mircea Stanciu

I am using this regex to find and delete iPhone dups:

find -E . -regex '.*/IMG_[0-9]{4}[ ]1\.JPG' -print -exec rm '{}' \;
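
Since the command deletes files, it may be worth running it with -print alone first and only then adding -exec rm. The following is a cautious sketch on a scratch directory, not the author's workflow: the ere_find helper and the sample file names are hypothetical, and the pattern escapes the dot before JPG so that it only matches a literal dot.

```shell
# BSD/macOS find spells extended regex as -E before the path;
# GNU find spells it -regextype posix-extended after the path.
# ere_find is a hypothetical helper that hides that difference.
if find -E . -prune >/dev/null 2>&1; then
  ere_find() { dir=$1; shift; find -E "$dir" "$@"; }
else
  ere_find() { dir=$1; shift; find "$dir" -regextype posix-extended "$@"; }
fi

demo=$(mktemp -d)   # scratch directory with made-up photo names
touch "$demo/IMG_0001.JPG" "$demo/IMG_0001 1.JPG" "$demo/IMG_0002 1.JPG"

pattern='.*/IMG_[0-9]{4}[ ]1\.JPG'

# Dry run: list what would be removed, without touching anything.
would_delete=$(ere_find "$demo" -regex "$pattern" -print | sort)
printf 'would delete:\n%s\n' "$would_delete"

# Real run: print each match and remove it.
ere_find "$demo" -regex "$pattern" -print -exec rm '{}' \;

remaining=$(ls "$demo")
printf 'remaining:\n%s\n' "$remaining"
rm -rf "$demo"
```

After the real run, only the file without the " 1" suffix is left in the scratch directory.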