Warning: preg_replace(): Unknown modifier ']'
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/20705399/
Asked by user3122995
I have the following error:
Warning: preg_replace(): Unknown modifier ']' in xxx.php on line 38
This is the code on line 38:
<?php echo str_replace("</ul></div>", "", preg_replace("<div[^>]*><ul[^>]*>", "", wp_nav_menu(array('theme_location' => 'nav', 'echo' => false)) )); ?>
How can I fix this problem?
Answered by Amal Murali
Why the error occurs
In PHP, a regular expression needs to be enclosed within a pair of delimiters. A delimiter can be any non-alphanumeric, non-backslash, non-whitespace character; /, # and ~ are the most commonly used ones. Note that it is also possible to use bracket-style delimiters, where the opening and closing brackets are the starting and ending delimiter, i.e. <pattern_goes_here>, [pattern_goes_here] etc. are all valid.
The "Unknown modifier X" error usually occurs in the following two cases:
When your regular expression is missing delimiters.
When you use the delimiter inside the pattern without escaping it.
In this case, the regular expression is <div[^>]*><ul[^>]*>. The regex engine considers everything from the opening < to the first > as the regex pattern, and everything afterwards as modifiers.
Regex: <div[^>  ]*><ul[^>]*>
       │     │  │          │
       └──┬──┘  └────┬─────┘
       pattern   modifiers
] here is an unknown modifier, because it appears after the closing > delimiter. That is why PHP throws the error.
Depending on the pattern, the unknown modifier complaint might as well have been about *, +, p, / or ) or almost any other letter/symbol. Only imsxeADSUXJu are valid PCRE modifiers (and e was deprecated in PHP 5.5 and removed in PHP 7.0).
How to fix it
The fix is easy. Just wrap your regex pattern with any valid delimiters. In this case, you could choose ~ and get the following:
~<div[^>]*><ul[^>]*>~
│                   │
│                   └─ ending delimiter
└───────────────────── starting delimiter
If you're receiving this error despite having used a delimiter, it might be because the pattern itself contains unescaped occurrences of the said delimiter.
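Applied to the code from the question, the fix might look like the sketch below. The $menu string is a made-up stand-in for the output of wp_nav_menu(), so that the snippet runs without WordPress; everything else follows the original line 38.

```php
<?php
// Hypothetical stand-in for wp_nav_menu(array('theme_location' => 'nav', 'echo' => false)).
$menu = '<div class="menu"><ul id="nav"><li>Home</li></ul></div>';

// The pattern is now wrapped in ~ delimiters, so < and > are plain pattern characters.
$withoutOpeningTags = preg_replace('~<div[^>]*><ul[^>]*>~', '', $menu);

// The closing tags are a fixed string, so str_replace() is enough for them.
$result = str_replace('</ul></div>', '', $withoutOpeningTags);

echo $result, PHP_EOL; // <li>Home</li>
```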
Or escape delimiters
/foo[^/]+bar/i would certainly throw an error. So you can escape the delimiter with a \ backslash wherever it appears within the regex:
/foo[^\/]+bar/i
│      │     │
└──────┼─────┴─ actual delimiters
       └─────── escaped slash (/) character
This is a tedious job if your regex pattern contains many occurrences of the delimiter character.
The cleaner way, of course, would be to use a different delimiter altogether. Ideally a character that does not appear anywhere inside the regex pattern, say #, as in #foo[^/]+bar#i.
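A minimal sketch comparing the two variants; the subject string is invented for illustration. Both calls should behave identically, the second one just avoids the backslash.

```php
<?php
$subject = 'FOO123BAR';

// Variant 1: keep / as the delimiter and escape the literal slash inside the pattern.
$escaped = preg_match('/foo[^\/]+bar/i', $subject);

// Variant 2: switch to # as the delimiter, so the slash needs no escaping.
$otherDelimiter = preg_match('#foo[^/]+bar#i', $subject);

var_dump($escaped, $otherDelimiter); // int(1) for both
```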
Answered by mario
Other examples
The reference answer already explains the reason for "Unknown modifier" warnings. This is just a comparison of other typical variants.
- When forgetting to add regex /delimiters/, the first non-letter symbol will be assumed to be one. Therefore the warning is often about what follows a grouping (…), […] meta symbol:

    preg_match("[a-zA-Z]+:\s*.$"
                ↑      ↑⬆

- Sometimes your regex already uses a custom delimiter (: here), but still contains the same character as unescaped literal. It's then mistaken as premature delimiter. Which is why the very next symbol receives the "Unknown modifier" trophy (marked ⬆):

    preg_match(":\[[\d:/]+\]:"
                ↑      ⬆    ↑

- When using the classic / delimiter, take care to not have it within the regex literally. This most frequently happens when trying to match unescaped filenames:

    preg_match("/pathname/filename/i"
                ↑         ⬆       ↑

  Or when matching angle/square bracket style tags:

    preg_match("/<%tmpl:id>(.*)</%tmpl:id>/Ui"
                ↑                ⬆        ↑

- Templating-style (Smarty or BBCode) regex patterns often require {…} or […] brackets. Both should usually be escaped. (An outermost {} pair being the exception though.)

  They also get misinterpreted as paired delimiters when no actual delimiter is used. If they're then also used as literal character within, then that's, of course … an error.

    preg_match("{bold[^}]+}"
                ↑       ⬆ ↑

- Whenever the warning says "Delimiter must not be alphanumeric or backslash" then you also entirely forgot delimiters:

    preg_match("ab?c*"
                ↑

- "Unknown modifier 'g'" often indicates a regex that was copied verbatim from JavaScript or Perl.

    preg_match("/abc+/g"
                      ⬆

  PHP doesn't use the /g global flag. Instead the preg_replace function works on all occurrences, and preg_match_all is the "global" searching counterpart to the one-occurrence preg_match.

  So, just remove the /g flag.

  See also:
  · Warning: preg_replace(): Unknown modifier 'g'
  · preg_replace: bad regex == 'Unknown Modifier'?

- A more peculiar case pertains to the PCRE_EXTENDED /x flag. This is often (or should be) used for making regexps more lofty and readable.

  This allows the use of inline # comments. PHP implements the regex delimiters atop PCRE. But it doesn't treat # in any special way. Which is how a literal delimiter in a # comment can become an error:

    preg_match("/
       ab?c+  # Comment with / slash in between
    /x"

  (Also noteworthy that using # as #abc+#x delimiter can be doubly inadvisable.)

- Interpolating variables into a regex requires them to be pre-escaped, or be valid regexps themselves. You can't tell beforehand if this is gonna work:

    preg_match("/id=$var;/"
                ↑    ?   ↑

  It's best to apply $var = preg_quote($var, "/") in such cases.

  See also:
  · Unknown modifier '/' in ...? what is it?

  Another alternative is using \Q…\E escapes for unquoted literal strings:

    preg_match("/id=\Q{$var}\E;/mix");

  Note that this is merely a convenience shortcut for meta symbols, not dependable/safe. It would fall apart in case that $var contained a literal '\E' itself (however unlikely). And it does not mask the delimiter itself.

- Deprecated modifier /e is an entirely different problem. This has nothing to do with delimiters, but with the implicit expression interpretation mode, which was deprecated in PHP 5.5 and removed in PHP 7.0. See also: Replace deprecated preg_replace /e with preg_replace_callback
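To illustrate the preg_quote() advice, here is a small sketch. The value of $var is an invented example that contains both the delimiter and a regex meta character; passing "/" as the second argument is what makes preg_quote() escape the delimiter too.

```php
<?php
// Hypothetical user-supplied value containing the delimiter (/) and a meta character (.).
$var = 'a/b.c';

// The second argument tells preg_quote() to escape the delimiter as well.
$quoted = preg_quote($var, '/');

$pattern = "/id=$quoted;/";

echo $pattern, PHP_EOL;                       // /id=a\/b\.c;/
var_dump(preg_match($pattern, 'id=a/b.c;'));  // int(1)
var_dump(preg_match($pattern, 'id=a/bxc;'));  // int(0), the dot is a literal now
```

If the regex used a different delimiter, say ~, the second argument would have to be '~' instead.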
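For a regex carried over from JavaScript or Perl with a /g flag, the PHP equivalents could look roughly like this (the subject string is made up for the example):

```php
<?php
$subject = 'abc abcc abccc';

// preg_match() stops after the first occurrence.
preg_match('/abc+/', $subject, $first);

// preg_match_all() collects every occurrence, which is what /g does elsewhere.
$count = preg_match_all('/abc+/', $subject, $all);

// preg_replace() already replaces all occurrences; no flag is needed.
$replaced = preg_replace('/abc+/', 'X', $subject);

print_r($first);   // only "abc"
print_r($all[0]);  // "abc", "abcc", "abccc"
echo $count, ' ', $replaced, PHP_EOL; // 3 X X X
```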
Alternative regex delimiters
As mentioned already, the quickest solution to this error is just picking a distinct delimiter. Any non-letter symbol can be used. Visually distinctive ones are often preferred:
~abc+~
!abc+!
@abc+@
#abc+#
=abc+=
%abc+%
Technically you could use $abc$ or |abc| for delimiters. However, it's best to avoid symbols that serve as regex meta characters themselves.
The hash # as delimiter is rather popular too. But care should be taken in combination with the x/PCRE_EXTENDED readability modifier. You can't use # inline or (?#…) comments then, because those would be confused as delimiters.
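A short sketch of that pitfall, with invented pattern and subject. The first pattern uses ~ so that # comments are harmless under the x modifier; the second reuses # as the delimiter and fails to compile. The @ operator is used here only to keep the expected warning out of the output.

```php
<?php
// With ~ as the delimiter, # comments inside an /x pattern are safe.
$pattern = '~
    ab?c+   # an "a", an optional "b", then one or more "c"
~x';
$ok = preg_match($pattern, 'xxabccxx');

// With # as the delimiter, the # that starts the comment ends the pattern early,
// and the rest (" comment #x") is parsed as modifiers.
$broken = @preg_match('#ab?c+ # comment #x', 'xxabccxx');

var_dump($ok);     // int(1)
var_dump($broken); // bool(false)
```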
Quote-only delimiters
Occasionally you see " and ' used as regex delimiters, paired with their counterpart as PHP string enclosure:
preg_match("'abc+'"
preg_match('"abc+"'
Which is perfectly valid as far as PHP is concerned. It's sometimes convenient and unobtrusive, but not always legible in IDEs and editors.
Paired delimiters
An interesting variation is paired delimiters. Instead of using the same symbol on both ends of a regex, you can use any <...>, (...), [...] or {...} bracket/brace combination.
preg_match("(abc+)" # just delimiters here, not a capture group
While most of them also serve as regex meta characters, you can often use them without further effort. As long as those specific braces/parens within the regex are paired or escaped correctly, these variants are quite readable.
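As a rough illustration of paired delimiters that also contain the same bracket type inside the pattern (the patterns and subjects are invented for the example):

```php
<?php
// Parentheses as delimiters; the inner (…) pair is a real capture group.
preg_match('(id=(\d+))', 'id=42', $m);
print_r($m); // [0] => id=42, [1] => 42

// Braces as delimiters, with a correctly paired {2} quantifier inside.
$braces = preg_match('{^a{2}$}', 'aa');
var_dump($braces); // int(1)
```

Both work only because every opening bracket inside the pattern has a matching closing one.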
Fancy regex delimiters
A somewhat lazy trick (which is not endorsed hereby) is using non-printable ASCII characters as delimiters. This works easily in PHP by using double quotes for the regex string, and octal escapes for delimiters:
preg_match("\001 abc+ \001mix"
The \001 is just a control character (␁) that's not usually needed. Therefore it's highly unlikely to appear within most regex patterns. Which makes it suitable here, even though not very legible.
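A runnable version of that trick might look as follows; the subject string is invented. Note that the octal escape only works inside a double-quoted PHP string.

```php
<?php
// "\001" in a double-quoted string is the single byte 0x01 (octal escape).
$pattern = "\001 abc+ \001mix";

// The x modifier makes the padding spaces insignificant; i makes the match case-insensitive.
$result = preg_match($pattern, 'xxABCCxx');

var_dump(strlen("\001")); // int(1)
var_dump($result);        // int(1)
```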
Sadly you can't use Unicode glyphs as delimiters. PHP only allows single-byte characters. And why is that? Well, glad you asked:
PHP's delimiters atop PCRE
The preg_* functions utilize the PCRE regex engine, which itself doesn't care or provide for delimiters. For resemblance with Perl the preg_* functions implement them. Which is also why you can use modifier letters /ism instead of just constants as parameter.
See ext/pcre/php_pcre.c on how the regex string is preprocessed:
- First all leading whitespace is ignored.
- Any non-alphanumeric symbol is taken as presumed delimiter. Note that PHP only honors single-byte characters:

    delimiter = *p++;
    if (isalnum((int)*(unsigned char *)&delimiter) || delimiter == '\\') {
        php_error_docref(NULL,E_WARNING, "Delimiter must not…");
        return NULL;
    }

- The rest of the regex string is traversed left-to-right. Only backslash-escaped (\) symbols are ignored. \Q and \E escaping is not honored.
- Should the delimiter be found again, the remainder is verified to only contain modifier letters.
- If the delimiter is one of the ([{< )]}> pairable braces/brackets, then the processing logic is more elaborate.

    int brackets = 1;   /* brackets nesting level */
    while (*pp != 0) {
        if (*pp == '\\' && pp[1] != 0) pp++;
        else if (*pp == end_delimiter && --brackets <= 0)
            break;
        else if (*pp == start_delimiter)
            brackets++;
        pp++;
    }

  It looks for correctly paired left and right delimiter, but ignores other braces/bracket types when counting.
- The raw regex string is passed to the PCRE backend only after delimiter and modifier flags have been cut out.
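The preprocessing described above can be imitated in userland PHP. The following is only a rough sketch for experimenting, not the actual implementation: splitRegex() is a hypothetical helper, it uses ltrim() as an approximation of the whitespace skipping, and it does not validate the modifier letters.

```php
<?php
/**
 * Hypothetical helper that mimics how the delimiter scan splits a regex string.
 * Returns [pattern, modifiers] or throws if no usable delimiters are found.
 */
function splitRegex(string $regex): array
{
    // Leading whitespace is skipped (approximated with ltrim()).
    $regex = ltrim($regex);
    if ($regex === '') {
        throw new InvalidArgumentException('Empty regular expression');
    }

    // The first byte is the presumed delimiter.
    $start = $regex[0];
    if (ctype_alnum($start) || $start === '\\') {
        throw new InvalidArgumentException('Delimiter must not be alphanumeric or backslash');
    }

    // Paired delimiters close with their counterpart; all others close with themselves.
    $pairs = ['(' => ')', '[' => ']', '{' => '}', '<' => '>'];
    $end = $pairs[$start] ?? $start;

    // Scan left to right, skipping backslash-escaped bytes and counting nesting.
    $depth = 1;
    $length = strlen($regex);
    for ($i = 1; $i < $length; $i++) {
        $char = $regex[$i];
        if ($char === '\\' && $i + 1 < $length) {
            $i++;                                   // skip the escaped byte
        } elseif ($char === $end && --$depth <= 0) {
            break;                                  // closing delimiter found
        } elseif ($char === $start && $start !== $end) {
            $depth++;                               // nested opening bracket
        }
    }
    if ($i >= $length) {
        throw new InvalidArgumentException("No ending delimiter '$end' found");
    }

    // Everything before the closing delimiter is the pattern, the rest are modifiers.
    return [substr($regex, 1, $i - 1), (string) substr($regex, $i + 1)];
}

print_r(splitRegex('~<div[^>]*><ul[^>]*>~')); // pattern "<div[^>]*><ul[^>]*>", no modifiers
print_r(splitRegex('<div[^>]*><ul[^>]*>'));   // pattern "div[^", modifiers "]*><ul[^>]*>"
```

The second call shows where the "Unknown modifier ']'" from the question comes from: the first character of the leftover modifier string is the ].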
Now this is all somewhat irrelevant, but it explains where the delimiter warnings come from. And this whole procedure exists only to provide a minimum of Perl compatibility. There are a few minor deviations of course, like the […] character class context not receiving special treatment in PHP.
More references
- preg_match(); - Unknown modifier '+'
- Unknown modifier '/' error in PHP
- PHP RegExpr error Unkown modifier '('
- Unknown modifier '(' when using preg_match() with a REGEX expression
- PHP: Regex - Unknown modifier error
- Warning: preg_match() [function.preg-match]: Unknown modifier '('
- When does preg_match(): Unknown modifier error occur?
(Just a well-written question demonstrating prior research)
Answered by Danon
If you would like to get an exception (MalformedPatternException) instead of warnings or using preg_last_error(), consider using the T-Regx library:

    <?php
    try {
        return pattern('invalid] pattern')->match($s)->all();
    } catch (MalformedPatternException $e) {
        // your pattern was invalid
    }
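Without an extra library, a similar effect can be approximated in plain PHP by temporarily converting warnings into ErrorException objects. This is only a sketch of that general technique, not something the answers above prescribe; the pattern is the broken one from the question.

```php
<?php
// Turn every PHP warning or notice into an ErrorException while the handler is installed.
set_error_handler(function (int $severity, string $message, string $file, int $line) {
    throw new ErrorException($message, 0, $severity, $file, $line);
});

try {
    // Missing delimiters: < and > get treated as a delimiter pair.
    preg_replace('<div[^>]*><ul[^>]*>', '', '<div><ul>');
    $caught = null;
} catch (ErrorException $e) {
    $caught = $e->getMessage();
} finally {
    restore_error_handler();
}

echo $caught, PHP_EOL; // the warning text, mentioning: Unknown modifier ']'
```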

