Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/37037767/
How to check if a file name matches regex in shell script
Asked by jlp
I have a shell script that needs to check if a file name matches a certain regex, but it always shows "not match". Can anyone let me know what's wrong with my code?
fileNamePattern=abcd_????_def_*.txt
realFilePath=/data/file/abcd_12bd_def_ghijk.txt
if [[ $realFilePath =~ $fileNamePattern ]]
then
echo $realFilePath match $fileNamePattern
else
echo $realFilePath not match $fileNamePattern
fi
Answered by Benjamin W.
There is a confusion between regexes and the simpler "glob"/"wildcard"/"normal" patterns, whatever you want to call them. You're using the latter, but call it a regex.
If you want to use a pattern, you should
Quote it when assigning [1]:

    fileNamePattern="abcd_????_def_*.txt"

You don't want anything to expand quite yet.

Make it match the complete path. This doesn't match:

    $ mypath="/mydir/myfile1.txt"
    $ mypattern="myfile?.txt"
    $ [[ $mypath == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
    Doesn't match!

But after extending the pattern to start with *:

    $ mypattern="*myfile?.txt"
    $ [[ $mypath == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
    Matches!

The first one doesn't match because it matches only the filename, but not the complete path. Alternatively, you could use the first pattern, but remove the rest of the path with parameter expansion:

    $ mypattern="myfile?.txt"
    $ mypath="/mydir/myfile1.txt"
    $ echo "${mypath##*/}"
    myfile1.txt
    $ [[ ${mypath##*/} == $mypattern ]] && echo "Matches!" || echo "Doesn't match!"
    Matches!

Use == and not =~, as shown in the above examples. You could also use the more portable = instead, but since we're already using the non-POSIX [[ ]] instead of [ ], we can as well use ==.
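Put together, the pattern-based advice can be sketched as a short script. This is only an illustration, assuming Bash; the helper name matches_glob is hypothetical and does not come from the original answer, while the pattern and path are the ones from the question.

```shell
#!/usr/bin/env bash
# matches_glob is a hypothetical helper name, used only for illustration.
# It strips the directory part of a path and compares the remaining
# file name against a glob pattern.
matches_glob() {
    local path=$1 pattern=$2
    local name=${path##*/}   # remove everything up to the last slash
    # The right-hand side is left unquoted so it is treated as a pattern.
    [[ $name == $pattern ]]
}

fileNamePattern="abcd_????_def_*.txt"
realFilePath="/data/file/abcd_12bd_def_ghijk.txt"

if matches_glob "$realFilePath" "$fileNamePattern"; then
    echo "$realFilePath matches $fileNamePattern"
else
    echo "$realFilePath does not match $fileNamePattern"
fi
```

Stripping the directory inside the helper keeps the pattern itself free of a leading *, so the same pattern can be reused wherever the file happens to live.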
If you want to use a regex, you should:
Write your pattern as one: ? and * have a different meaning in regexes; they modify what they stand after, whereas in glob patterns, they can stand on their own (see the manual). The corresponding pattern would become:

    fileNameRegex="abcd_.{4}_def_.*\.txt"

and could be used like this:

    $ realFilePath="/data/file/abcd_12bd_def_ghijk.txt"
    $ [[ $realFilePath =~ $fileNameRegex ]] && echo "Matches!" || echo "Doesn't match!"
    Matches!

Keep your habit of writing the regex into a separate parameter and then use it unquoted in the conditional operator [[ ]], or escaping gets very messy. It's also more portable across Bash versions.
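The regex variant can be sketched the same way. This is a minimal example, assuming Bash; matches_regex is a hypothetical helper name and the second path is made up for contrast. Note that =~ succeeds when the regex matches any part of the string, so the directory prefix does not need to be covered by the regex.

```shell
#!/usr/bin/env bash
# matches_regex is a hypothetical helper name, used only for illustration.
matches_regex() {
    local path=$1 regex=$2
    # Unquoted right-hand side: interpreted as an extended regular expression.
    # The match may occur anywhere in the string, so no leading .* is needed.
    [[ $path =~ $regex ]]
}

# Regex kept in its own variable; \. matches a literal dot.
fileNameRegex="abcd_.{4}_def_.*\.txt"

for realFilePath in \
    "/data/file/abcd_12bd_def_ghijk.txt" \
    "/data/file/abcd_12bd_def_ghijk_txt"   # made-up path without a literal ".txt"
do
    if matches_regex "$realFilePath" "$fileNameRegex"; then
        echo "$realFilePath matches"
    else
        echo "$realFilePath does not match"
    fi
done
```

Because the regex is not anchored, a name such as abcd_12bd_def_x.txt.bak would also match; appending $ to the regex is one way to require that the name ends in .txt.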
The BashGuide has a great article about the different types of patterns in Bash.
Notice that quoting your parameters is almost always a good habit. It's not required in conditional expressions in [[ ]], and actually suppresses interpretation of the right-hand side as a pattern or regex. If you were using [ ] (which doesn't support regexes and patterns anyway), quoting would be required to avoid unexpected side effects of special characters and empty strings.
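The effect of quoting the right-hand side can be seen in a small experiment. This is a sketch, assuming Bash 3.2 or later (where a quoted regex is matched as a literal string); the variable names for the results are made up for this example.

```shell
#!/usr/bin/env bash
name="abcd_12bd_def_ghijk.txt"
pattern="abcd_????_def_*.txt"
regex="abcd_.{4}_def_.*\.txt"

# Unquoted right-hand side: interpreted as a glob pattern or a regex.
[[ $name == $pattern ]]   && glob_unquoted="match"  || glob_unquoted="no match"
[[ $name =~ $regex ]]     && regex_unquoted="match" || regex_unquoted="no match"

# Quoted right-hand side: compared as a literal string, so ?, * and .
# lose their special meaning.
[[ $name == "$pattern" ]] && glob_quoted="match"    || glob_quoted="no match"
[[ $name =~ "$regex" ]]   && regex_quoted="match"   || regex_quoted="no match"

echo "glob,  unquoted: $glob_unquoted"
echo "glob,  quoted:   $glob_quoted"
echo "regex, unquoted: $regex_unquoted"
echo "regex, quoted:   $regex_quoted"
```

Only the two unquoted tests report a match; the quoted ones look for the pattern text itself, which does not occur in the file name.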
[1] Not exactly true in this case, actually. When assigning to a variable, the manual says that the following happens:
[...] tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal [...]
i.e., no pathname (glob) expansion. While in this very case using
fileNamePattern=abcd_????_def_*.txt
would work just as well as the quoted version, using quotes prevents surprises in many other cases and is required as soon as you have a blank in the pattern.
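The footnote can be checked in a scratch directory that contains a file matching the pattern. This is a sketch, assuming Bash and a mktemp that supports -d; the variable names are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Work in a scratch directory that contains one file matching the pattern.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch abcd_12bd_def_ghijk.txt

# No pathname expansion happens on assignment, quoted or not.
unquoted=abcd_????_def_*.txt
quoted="abcd_????_def_*.txt"
echo "assigned without quotes: $unquoted"
echo "assigned with quotes:    $quoted"

# Expansion does happen later, when the variable is used unquoted
# as a command argument.
expanded=$(echo $quoted)
echo "used without quotes:     $expanded"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Both assignments store the literal pattern; the file name only appears once the variable is expanded unquoted in a command, which is where quoting at the point of use becomes important.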
Answered by MaxU
Use RegExs instead of wildcards:
$ fileNamePattern="abcd_...._def_.*\.txt"
$ realFilePath=/data/file/abcd_12bd_def_ghijk.txt
$ if [[ $realFilePath =~ $fileNamePattern ]]
> then
>     echo $realFilePath match $fileNamePattern
> else
>     echo $realFilePath not match $fileNamePattern
> fi
Output:
/data/file/abcd_12bd_def_ghijk.txt match abcd_...._def_.*\.txt