Original source: http://stackoverflow.com/questions/9375909/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
iconv UTF-8//IGNORE still produces "illegal character" error
Asked by Znarkus
$string = iconv("UTF-8", "UTF-8//IGNORE", $string);
I thought this code would remove invalid UTF-8 characters, but it produces [E_NOTICE] "iconv(): Detected an illegal character in input string". What am I missing? How do I properly strip illegal characters from a string?
Accepted answer by msgmash.com
The output character set (the second parameter) should be different from the input character set (the first parameter). If they are the same and the string contains illegal UTF-8 characters, iconv will reject them as illegal according to the input character set.
Answer by Paul Melekhov
I know two ways to fix a UTF-8 string that contains illegal characters:
- Illegal characters will be replaced by question marks ("?"):
$message = mb_convert_encoding($message, 'UTF-8', 'UTF-8');
- Illegal characters will be removed:
$message = iconv('UTF-8', 'UTF-8//IGNORE', $message);
The second method is actually the one described in the question, but it doesn't produce any E_NOTICE in my case. I tested various corrupted UTF-8 strings with error_reporting(E_ALL); and the result was always as expected. Possibly something has changed since 2012. I tested on PHP 7.2.9 on Windows.
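The two behaviours listed in this answer (replace versus remove) can also be sketched without relying on iconv at all. This is only a minimal sketch, assuming the mbstring extension is loaded; the helper names replace_invalid_utf8 and strip_invalid_utf8 are hypothetical and do not come from the original answers. For the removal case it swaps in mb_substitute_character('none') in place of iconv with //IGNORE, since the iconv result seems to vary between PHP versions and platforms.

```php
<?php
// Hypothetical helpers for illustration; they assume the mbstring extension is available.

// Replace each invalid byte sequence with the current substitute character ("?" by default).
function replace_invalid_utf8(string $input): string
{
    return mb_convert_encoding($input, 'UTF-8', 'UTF-8');
}

// Drop invalid byte sequences by temporarily setting the substitute character to "none".
function strip_invalid_utf8(string $input): string
{
    $previous = mb_substitute_character();   // remember the current setting
    mb_substitute_character('none');         // "none": emit nothing for invalid input
    $clean = mb_convert_encoding($input, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);      // restore the caller's setting
    return $clean;
}

$broken = "abc\xFFdef";                      // the byte 0xFF never occurs in valid UTF-8

// Validity before and after cleaning.
var_dump(mb_check_encoding($broken, 'UTF-8'));
var_dump(mb_check_encoding(strip_invalid_utf8($broken), 'UTF-8'));
```

Restoring the previous substitute character keeps the helper from changing behaviour elsewhere in the script, because that setting is global to mbstring.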
Answer by dbekin
To simply ignore the notice, you can use "@":
$string = @iconv("UTF-8", "UTF-8//IGNORE", $string);
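Note that "@" only hides the notice. iconv() is documented to return false on failure, so it may be worth checking the result before it overwrites the original string. A minimal sketch follows; the function name iconv_strip_or_null is hypothetical, and whether the failure branch is ever reached depends on the PHP version and the underlying iconv implementation:

```php
<?php
// Hypothetical wrapper for illustration: suppress the notice, but keep the failure signal.
function iconv_strip_or_null(string $input): ?string
{
    $result = @iconv('UTF-8', 'UTF-8//IGNORE', $input);
    // iconv() returns false on failure; map that to null so callers can detect it.
    return $result === false ? null : $result;
}

$clean = iconv_strip_or_null("abc\xFFdef");
if ($clean === null) {
    // Conversion failed on this platform; fall back to another cleaning strategy here.
    echo "iconv could not clean the string\n";
} else {
    echo $clean, "\n";
}
```

Without such a check, a failed call would silently assign false to $string.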