PHP: Replace preg_replace() e modifier with preg_replace_callback
Notice: this page is a translation of a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/15454220/
Asked by Casey
I'm terrible with regular expressions. I'm trying to replace this:
public static function camelize($word) {
    return preg_replace('/(^|_)([a-z])/e', 'strtoupper("\\2")', $word);
}
with preg_replace_callback with an anonymous function. I don't understand what the \\2 is doing. Or for that matter exactly how preg_replace_callback works.
What would be the correct code for achieving this?
Accepted answer by IMSoP
In a regular expression, you can "capture" parts of the matched string with (brackets); in this case, you are capturing the (^|_) and ([a-z]) parts of the match. These are numbered starting at 1, so you have back-references 1 and 2. Match 0 is the whole matched string.
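To see the numbering concretely, here is a minimal sketch that collects the groups for the pattern from the question. The sample word 'foo_bar' is an assumption chosen for illustration; it does not come from the original post.

```php
<?php
// Collect every match of the question's pattern, one array per match.
// 'foo_bar' is an illustrative input, not taken from the original post.
preg_match_all('/(^|_)([a-z])/', 'foo_bar', $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    // $match[0]: whole match, $match[1]: '' or '_', $match[2]: the letter
    var_dump($match);
}
```

Each entry holds the whole match at index 0, then group 1 (empty at the start of the string, '_' elsewhere) and group 2 (the lowercase letter).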
The /e modifier takes a replacement string, and substitutes backslash followed by a number (e.g. \1) with the appropriate back-reference - but because you're inside a string, you need to escape the backslash, so you get '\\1'. It then (effectively) runs eval to run the resulting string as though it was PHP code (which is why it was deprecated in PHP 5.5 and removed in PHP 7.0, because it's easy to use eval in an insecure way).
The preg_replace_callback function instead takes a callback function and passes it an array containing the matched back-references. So where you would have written '\\1', you instead access element 1 of that parameter - e.g. if you have an anonymous function of the form function($matches) { ... }, the first back-reference is $matches[1] inside that function.
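A small sketch of that callback contract, using the question's pattern with an assumed sample input. The square brackets are only there to make the replaced spots visible; they are not part of the original answer.

```php
<?php
$result = preg_replace_callback(
    '/(^|_)([a-z])/',
    function ($matches) {
        // Whatever the callback returns replaces the whole match ($matches[0]).
        // $matches[2] is the second capture group, the lowercase letter.
        return '[' . $matches[2] . ']';
    },
    'foo_bar' // illustrative input
);

echo $result, "\n";
```

Note that the underscore disappears from the output because it is part of the whole match being replaced, and the callback does not return it.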
So a /e argument of
'do_stuff(\\1) . "and" . do_stuff(\\2)'
could become a callback of
function($m) { return do_stuff($m[1]) . "and" . do_stuff($m[2]); }
Or in your case
'strtoupper("\\2")'
could become
function($m) { return strtoupper($m[2]); }
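Putting the pieces together, the method from the question might end up looking like the sketch below. The class name Inflector is a hypothetical wrapper, since the question shows only the method itself, and the sample input is an assumption.

```php
<?php
// Hypothetical wrapper class; the question only shows the method.
class Inflector
{
    public static function camelize($word)
    {
        // Note: the /e flag is gone from the pattern.
        return preg_replace_callback(
            '/(^|_)([a-z])/',
            function ($m) {
                // $m[2] is the letter; the underscore in $m[1] is dropped.
                return strtoupper($m[2]);
            },
            $word
        );
    }
}

echo Inflector::camelize('foo_bar_baz'), "\n";
```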
Note that $m and $matches are not magic names, they're just the parameter names I gave when declaring my callback functions. Also, you don't have to pass an anonymous function, it could be a function name as a string, or something of the form array($object, $method), as with any callback in PHP, e.g.
function stuffy_callback($things) {
    return do_stuff($things[1]) . "and" . do_stuff($things[2]);
}
$foo = preg_replace_callback('/([a-z]+) and ([a-z]+)/', 'stuffy_callback', 'fish and chips');
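The following sketch makes that example runnable and also shows the array($object, $method) form. do_stuff() is only a placeholder in the answer, so the upper-casing stand-in and the Stuffer class here are assumptions for illustration.

```php
<?php
// Stand-in for the answer's placeholder do_stuff(); it just upper-cases.
function do_stuff($text) {
    return strtoupper($text);
}

// Callback passed by function name.
function stuffy_callback($things) {
    return do_stuff($things[1]) . "and" . do_stuff($things[2]);
}

// Hypothetical class, used to show the array($object, $method) callable form.
class Stuffer {
    public function callback($things) {
        return do_stuff($things[1]) . "and" . do_stuff($things[2]);
    }
}

$pattern  = '/([a-z]+) and ([a-z]+)/';
$byName   = preg_replace_callback($pattern, 'stuffy_callback', 'fish and chips');
$byMethod = preg_replace_callback($pattern, array(new Stuffer(), 'callback'), 'fish and chips');

echo $byName, "\n", $byMethod, "\n";
```

Both calls produce the same string, because the two callables wrap the same logic.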
As with any function, you can't access variables outside your callback (from the surrounding scope) by default. When using an anonymous function, you can use the use keyword to import the variables you need to access, as discussed in the PHP manual. e.g. if the old argument was
'do_stuff(\\1, $foo)'
then the new callback might look like
function($m) use ($foo) { return do_stuff($m[1], $foo); }
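As a runnable sketch of the use import: do_stuff() and $foo are placeholders in the answer, so the two-argument stand-in and the sample values below are assumptions.

```php
<?php
// Stand-in for the answer's placeholder do_stuff(); appends a suffix.
function do_stuff($text, $suffix) {
    return strtoupper($text) . $suffix;
}

$foo = '!';

$result = preg_replace_callback(
    '/([a-z]+)/',
    // Without "use ($foo)", $foo would be undefined inside the closure.
    // The value is copied in when the closure is created.
    function ($m) use ($foo) {
        return do_stuff($m[1], $foo);
    },
    'fish chips'
);

echo $result, "\n";
```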
Gotchas
- Use of preg_replace_callback is instead of the /e modifier on the regex, so you need to remove that flag from your "pattern" argument. So a pattern like /blah(.*)blah/mei would become /blah(.*)blah/mi.
- The /e modifier used a variant of addslashes() internally on the arguments, so some replacements used stripslashes() to remove it; in most cases, you probably want to remove the call to stripslashes from your new callback.
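Both gotchas can be seen in one small sketch. The subject string is an assumption picked because it contains a real backslash; the pattern is the one from the first bullet with the e flag removed.

```php
<?php
// The old-style pattern would have been '/blah(.*)blah/mei'; the e flag is removed.
$pattern = '/blah(.*)blah/mi';
$subject = 'BLAH C:\temp BLAH'; // single quotes: the backslash is literal

// The callback receives the raw matched text, with no slashes added.
$plain = preg_replace_callback($pattern, function ($m) {
    return trim($m[1]);
}, $subject);

// A leftover stripslashes() call now removes the genuine backslash.
$stripped = preg_replace_callback($pattern, function ($m) {
    return stripslashes(trim($m[1]));
}, $subject);

echo $plain, "\n", $stripped, "\n";
```

The second result loses its backslash, which is why a stripslashes() call carried over from an /e replacement usually has to go.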
Answer by mario
preg_replace shim with eval support
This is very inadvisable. But if you're not a programmer, or really prefer terrible code, you could use a substitute preg_replace function to keep your /e flag working temporarily.
/**
 * Can be used as a stopgap shim for preg_replace() calls with /e flag.
 * Is likely to fail for more complex string munging expressions. And
 * very obviously won't help with local-scope variable expressions.
 *
 * @license: CC-BY-*.*-comment-must-be-retained
 * @security: Provides `eval` support for replacement patterns. Which
 *   poses troubles for user-supplied input when paired with overly
 *   generic placeholders. This variant is only slightly stricter than
 *   the C implementation, but still susceptible to varexpression, quote
 *   breakouts and mundane exploits from unquoted capture placeholders.
 * @url: https://stackoverflow.com/q/15454220
 */
function preg_replace_eval($pattern, $replacement, $subject, $limit=-1) {
    # strip /e flag, keeping the delimiter and the other modifiers
    $pattern = preg_replace('/(\W[a-df-z]*)e([a-df-z]*)$/i', '$1$2', $pattern);
    # warn about most blatant misuses at least
    if (preg_match('/\(\.[+*]/', $pattern)) {
        trigger_error("preg_replace_eval(): regex contains (.*) or (.+) placeholders, which easily causes security issues for unconstrained/user input in the replacement expression. Transform your code to use preg_replace_callback() with a sane replacement callback!");
    }
    # run preg_replace with eval-callback
    return preg_replace_callback(
        $pattern,
        function ($matches) use ($replacement) {
            # substitute $1/$2/… with literals from $matches[]
            $repl = preg_replace_callback(
                '/(?<!\\\\)(?:[$]|\\\\)(\d+)/',
                function ($m) use ($matches) {
                    if (!isset($matches[$m[1]])) { trigger_error("No capture group for '$m[0]' eval placeholder"); }
                    return addcslashes($matches[$m[1]], '\"\'\`\$\\\0'); # additionally escapes '$' and backticks
                },
                $replacement
            );
            # run the replacement expression
            return eval("return $repl;");
        },
        $subject,
        $limit
    );
}

/**
 * Use once to generate a crude preg_replace_callback() substitution. Might often
 * require additional changes in the `return …;` expression. You'll also have to
 * refit the variable names for input/output obviously.
 *
 * >>> preg_replace_eval_replacement("/\w+/", 'strtoupper("$1")', $ignored);
 */
function preg_replace_eval_replacement($pattern, $replacement, $subjectvar="IGNORED") {
    # strip /e flag, keeping the delimiter and the other modifiers
    $pattern = preg_replace('/(\W[a-df-z]*)e([a-df-z]*)$/i', '$1$2', $pattern);
    # turn $1 or \1 placeholders (with any enclosing quote) into $m[1]
    $replacement = preg_replace_callback('/[\'\"]?(?<!\\\\)(?:[$]|\\\\)(\d+)[\'\"]?/', function ($m) { return "\$m[{$m[1]}]"; }, $replacement);
    $ve = "var_export";
    $bt = debug_backtrace(0, 1)[0];
    print "<pre><code>
    #----------------------------------------------------
    # replace preg_*() call in '$bt[file]' line $bt[line] with:
    #----------------------------------------------------
    \$OUTPUT_VAR = preg_replace_callback(
        {$ve($pattern, TRUE)},
        function (\$m) {
            return {$replacement};
        },
        \$YOUR_INPUT_VARIABLE_GOES_HERE
    )
    #----------------------------------------------------
    </code></pre>\n";
}
In essence, you just include the preg_replace_eval function in your codebase, and edit preg_replace to preg_replace_eval wherever the /e flag was used.
Pros and cons:
- Really just tested with a few samples from Stack Overflow.
- Does only support the easy cases (function calls, not variable lookups).
- Contains a few more restrictions and advisory notices.
- Will yield dislocated and less comprehensible errors for expression failures.
- However is still a usable temporary solution and doesn't complicate a proper transition to preg_replace_callback.
- And the license comment is just meant to deter people from overusing or spreading this too far.
Replacement code generator
Now this is somewhat redundant. But might help those users who are still overwhelmed with manually restructuring their code to preg_replace_callback. While this is effectively more time consuming, a code generator has less trouble expanding the /e replacement string into an expression. It's a very unremarkable conversion, but likely suffices for the most prevalent examples.
To use this function, edit any broken preg_replace call into preg_replace_eval_replacement and run it once. This will print out the corresponding preg_replace_callback block to be used in its place.
Take in mind that mere copy&pasting is not programming. You'll have to adapt the generated code back to your actual input/output variable names, or usage context.
- Specifically the $OUTPUT = assignment would have to go if the previous preg_replace call was used in an if.
- It's best to keep temporary variables or the multiline code block structure though.
And the replacement expression may demand more readability improvements or rework.
- For instance stripslashes() often becomes redundant in literal expressions.
- Variable-scope lookups require a use or global reference for/within the callback.
- Unevenly quote-enclosed "-$1-$2" capture references will end up syntactically broken by the plain transformation into "-$m[1]-$m[2].
The code output is merely a starting point. And yes, this would have been more useful as an online tool. This code rewriting approach (edit, run, edit, edit) is somewhat impractical. Yet it could be more approachable to those who are accustomed to task-centric coding (more steps, more discoveries). So this alternative might curb a few more duplicate questions.
Answer by Danon
You shouldn't use flag e (or eval in general).
You can also use T-Regx library
pattern('(^|_)([a-z])')->replace($word)->by()->group(2)->callback('strtoupper');
