Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/9375909/

iconv UTF-8//IGNORE still produces "illegal character" error

Tags: php, utf-8, iconv

Asked by Znarkus

$string = iconv("UTF-8", "UTF-8//IGNORE", $string);

I thought this code would remove invalid UTF-8 characters, but it produces [E_NOTICE] "iconv(): Detected an illegal character in input string". What am I missing, and how do I properly strip illegal characters from a string?
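
Before stripping anything, it can help to confirm that the input really is malformed. A minimal sketch, assuming the mbstring extension is available; is_valid_utf8() and both sample strings are hypothetical names and values chosen for illustration:

```php
<?php
// Hypothetical helper: true when $s is well-formed UTF-8.
function is_valid_utf8(string $s): bool
{
    return mb_check_encoding($s, 'UTF-8');
}

$valid   = "h\xC3\xA9llo"; // "héllo", correctly encoded
$invalid = "abc\xFFdef";   // the byte 0xFF never occurs in well-formed UTF-8

var_dump(is_valid_utf8($valid));   // bool(true)
var_dump(is_valid_utf8($invalid)); // bool(false)
```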

Accepted answer by msgmash.com

The output character set (the second parameter) should be different from the input character set (the first parameter). If they are the same and the string contains illegal UTF-8 characters, iconv will reject them as illegal according to the input character set.

Answer by Paul Melekhov

I know two ways to fix a UTF-8 string that contains illegal characters:

  1. Illegal characters will be replaced by question marks ("?"):

$message = mb_convert_encoding($message, 'UTF-8', 'UTF-8');

  2. Illegal characters will be removed:

$message = iconv('UTF-8', 'UTF-8//IGNORE', $message);

The second method is actually the one described in the question, but it doesn't produce any E_NOTICE in my case. I tested it on various corrupted UTF-8 strings with error_reporting(E_ALL); and the result was always as expected. Possibly something has changed since 2012. I tested on PHP 7.2.9 on Windows.
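
Since the notice appears on some setups and not on others, it may be worth checking how a given PHP build behaves. A minimal sketch, assuming the mbstring and iconv extensions are loaded; the sample string and the $notices array are made up for illustration:

```php
<?php
// Hypothetical corrupted input: 0xFF never occurs in well-formed UTF-8.
$message = "abc\xFFdef";

// Method 1: mbstring replaces each invalid byte sequence with its
// substitute character ("?" unless configured otherwise).
$replaced = mb_convert_encoding($message, 'UTF-8', 'UTF-8');

// Method 2: collect anything iconv reports instead of printing it.
$notices = [];
set_error_handler(function (int $errno, string $errstr) use (&$notices): bool {
    $notices[] = $errstr;
    return true; // mark as handled so PHP does not print it
});
$stripped = iconv('UTF-8', 'UTF-8//IGNORE', $message);
restore_error_handler();

var_dump($replaced); // valid UTF-8 with a substitute character in place of 0xFF
var_dump($stripped); // the cleaned string, or false on builds where iconv fails
var_dump($notices);  // empty if iconv raised nothing on this build
```

If $notices is non-empty or $stripped is false, the build behaves like the one in the question; otherwise it matches the behavior described in this answer.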

Answer by dbekin

To simply suppress the notice, you can use the "@" operator:

$string = @iconv("UTF-8", "UTF-8//IGNORE", $string);
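
Suppressing the notice does not change the return value: on builds where iconv() fails it can still return false, which would then overwrite $string. A hedged sketch of guarding against that; strip_invalid_utf8() is a hypothetical helper, and the mbstring fallback with the substitute character set to "none" is an assumption added here, not something the answers above propose:

```php
<?php
// Hypothetical helper: returns $s with invalid UTF-8 byte sequences removed.
function strip_invalid_utf8(string $s): string
{
    // First attempt: iconv with //IGNORE, notice suppressed.
    $clean = @iconv('UTF-8', 'UTF-8//IGNORE', $s);
    if ($clean !== false && mb_check_encoding($clean, 'UTF-8')) {
        return $clean;
    }

    // Fallback (assumption): let mbstring drop invalid sequences
    // by temporarily disabling its substitute character.
    $previous = mb_substitute_character();
    mb_substitute_character('none');
    $clean = mb_convert_encoding($s, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous); // restore the earlier setting

    return $clean;
}

$string = strip_invalid_utf8("abc\xFFdef");
var_dump(mb_check_encoding($string, 'UTF-8'));
```

The return-value check matters more than the "@" itself: without it, a failed conversion silently turns the string into false.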
