Original source: http://stackoverflow.com/questions/3489495/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
PHP mb_str_replace()... is slow. Any alternatives?
Asked by onassar
I want to make sure some string replacements I'm running are multibyte safe. I've found a few mb_str_replace functions around the net, but they're slow. I'm talking about a 20% increase after passing maybe 500-900 bytes through one of them.
Any recommendations? I'm thinking about using preg_replace as it's native and compiled in so it might be faster. Any thoughts would be appreciated.
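For reference, the preg_replace idea mentioned in the question could look roughly like the sketch below. The helper name preg_literal_replace is hypothetical (it does not come from the thread), the sketch assumes the input is UTF-8, and it uses preg_replace_callback so that the replacement is inserted literally. Whether this is actually faster is not established here.

```php
<?php
// Hypothetical helper (not from the thread): literal, UTF-8-aware replacement via PCRE.
function preg_literal_replace(string $search, string $replace, string $subject): string
{
    // preg_quote() escapes regex metacharacters (and the '/' delimiter),
    // so $search is matched as plain text; the 'u' modifier enables UTF-8 mode.
    $pattern = '/' . preg_quote($search, '/') . '/u';

    // A callback returns $replace as-is, so sequences such as "$1" or "\1"
    // are not interpreted as backreferences.
    $result = preg_replace_callback(
        $pattern,
        function () use ($replace) {
            return $replace;
        },
        $subject
    );

    // In UTF-8 mode PCRE returns null when the subject is not valid UTF-8.
    if ($result === null) {
        throw new InvalidArgumentException('Replacement failed (invalid UTF-8?)');
    }
    return $result;
}

$plain = preg_literal_replace('é', 'e', 'été');
```

Note that in UTF-8 mode invalid input makes the call fail rather than silently produce a result, which is why the sketch throws on null.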
Answered by Áxel Costas Pena
As said there, str_replace is safe to use in UTF-8 contexts as long as all parameters are valid UTF-8, because there cannot be an ambiguous match between two validly encoded multibyte strings. If you check the validity of your input, you have no need to look for a different function.
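A minimal sketch of that advice, assuming the mbstring extension is available: validate each argument with mb_check_encoding, then call the native str_replace. The wrapper name utf8_str_replace is hypothetical and only handles string arguments.

```php
<?php
// Hypothetical wrapper (not from the thread): validate, then use native str_replace.
function utf8_str_replace(string $search, string $replace, string $subject): string
{
    // Reject any argument that is not well-formed UTF-8; the safety of
    // str_replace in this context depends on all inputs being valid.
    foreach ([$search, $replace, $subject] as $value) {
        if (!mb_check_encoding($value, 'UTF-8')) {
            throw new InvalidArgumentException('Input is not valid UTF-8');
        }
    }
    // Byte-wise replacement; with valid UTF-8 a match cannot start or end
    // in the middle of a character.
    return str_replace($search, $replace, $subject);
}

$clean = utf8_str_replace('é', 'e', 'café crème brûlée');
```

The validation pass costs one extra scan per argument, so it can be skipped where the input is already known to be valid.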
Answered by Alain Tiemblo
As encoding is a real challenge when input comes from everywhere (UTF-8 or otherwise), I prefer using only multibyte-safe functions. For str_replace, I am using this one, which is fast enough.
if (!function_exists('mb_str_replace'))
{
    function mb_str_replace($search, $replace, $subject, &$count = 0)
    {
        if (!is_array($subject))
        {
            // Normalize needles and replacements to lists; missing replacements become ''.
            $searches = is_array($search) ? array_values($search) : array($search);
            $replacements = is_array($replace) ? array_values($replace) : array($replace);
            $replacements = array_pad($replacements, count($searches), '');
            foreach ($searches as $key => $search)
            {
                // Split on the quoted needle, then rejoin the pieces with the replacement.
                $parts = mb_split(preg_quote($search), $subject);
                $count += count($parts) - 1;
                $subject = implode($replacements[$key], $parts);
            }
        }
        else
        {
            // Array subject: apply the replacement to every element recursively.
            foreach ($subject as $key => $value)
            {
                $subject[$key] = mb_str_replace($search, $replace, $value, $count);
            }
        }
        return $subject;
    }
}
Answered by mpen
Here's my implementation, based off Alain's answer:
/**
 * Replace all occurrences of the search string with the replacement string. Multibyte safe.
 *
 * @param string|array $search The value being searched for, otherwise known as the needle. An array may be used to designate multiple needles.
 * @param string|array $replace The replacement value that replaces found search values. An array may be used to designate multiple replacements.
 * @param string|array $subject The string or array being searched and replaced on, otherwise known as the haystack.
 *                              If subject is an array, then the search and replace is performed with every entry of subject, and the return value is an array as well.
 * @param string $encoding The encoding parameter is the character encoding. If it is omitted, the internal character encoding value will be used.
 * @param int $count If passed, this will be set to the number of replacements performed.
 * @return array|string
 */
public static function mbReplace($search, $replace, $subject, $encoding = 'auto', &$count=0) {
    if(!is_array($subject)) {
        $searches = is_array($search) ? array_values($search) : [$search];
        $replacements = is_array($replace) ? array_values($replace) : [$replace];
        $replacements = array_pad($replacements, count($searches), '');
        foreach($searches as $key => $search) {
            $replace = $replacements[$key];
            $search_len = mb_strlen($search, $encoding);

            $sb = [];
            while(($offset = mb_strpos($subject, $search, 0, $encoding)) !== false) {
                $sb[] = mb_substr($subject, 0, $offset, $encoding);
                $subject = mb_substr($subject, $offset + $search_len, null, $encoding);
                ++$count;
            }
            $sb[] = $subject;
            $subject = implode($replace, $sb);
        }
    } else {
        foreach($subject as $key => $value) {
            $subject[$key] = self::mbReplace($search, $replace, $value, $encoding, $count);
        }
    }
    return $subject;
}
His version doesn't accept a character encoding, although I suppose you could set it via mb_regex_encoding.
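As a rough illustration of that remark, the sketch below sets the regex encoding before calling mb_split and restores it afterwards. It assumes mbstring was built with its regex functions (mb_split, mb_regex_encoding) and that UTF-8 is the desired encoding; the helper name split_replace_with_encoding is hypothetical and not part of either answer.

```php
<?php
// Hypothetical helper (not from the thread): run an mb_split-based replacement
// under an explicitly chosen regex encoding.
function split_replace_with_encoding(string $search, string $replace, string $subject, string $encoding): string
{
    // Remember the current multibyte regex encoding so it can be restored.
    $previous = mb_regex_encoding();
    mb_regex_encoding($encoding);

    // Same split-and-implode approach as the mb_split-based function above.
    $parts = mb_split(preg_quote($search), $subject);

    // Restore the previous setting before returning.
    mb_regex_encoding($previous);

    return implode($replace, $parts);
}

$converted = split_replace_with_encoding('é', 'e', 'été', 'UTF-8');
```

Restoring the previous encoding keeps the change from leaking into other code that relies on the global mbstring regex setting.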
My unit tests pass:
function testMbReplace() {
    $this->assertSame('bbb', Str::mbReplace('a', 'b', 'aaa', 'auto', $count1));
    $this->assertSame(3, $count1);
    $this->assertSame('ccc', Str::mbReplace(['a', 'b'], ['b', 'c'], 'aaa', 'auto', $count2));
    $this->assertSame(6, $count2);
    $this->assertSame("\xbf\x5c\x27", Str::mbReplace("\x27", "\x5c\x27", "\xbf\x27", 'iso-8859-1'));
    $this->assertSame("\xbf\x27", Str::mbReplace("\x27", "\x5c\x27", "\xbf\x27", 'gbk'));
}
Answered by Shaunak Sontakke
The top-rated note on http://php.net/manual/en/ref.mbstring.php#109937 says str_replace works for multibyte strings.