Note: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me), citing the original address: http://stackoverflow.com/questions/4794647/

Time: 2020-08-25 14:23:49  Source: igfitidea

PHP: Dealing with special characters with iconv

php special-characters iconv

Asked by laukok

I still don't understand how iconv works.

For instance,

$string = "Löic & René";
$output = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $string);

I get,

Notice: iconv() [function.iconv]: Detected an illegal character in input string in...

With $string = "Löic"; or $string = "René";

I get,

Notice: iconv() [function.iconv]: Detected an incomplete multibyte character in input string in.

I get nothing with $string = "&";

There are two different sets of output that I need to store in two different columns of my database table:

  1. I need to convert Löic & René to Loic & Rene for clean URL purposes.

  2. I need to keep them as they are (Löic & René stays Löic & René) and only convert them with htmlentities($string, ENT_QUOTES); when displaying them on my HTML page.
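
The two requirements above can be sketched roughly as follows. This is a minimal sketch and not the asker's code: make_slug() is a hypothetical helper, the en_US.UTF-8 locale is an assumption that may not hold on your system, and the exact transliteration depends on the iconv implementation PHP was built against. The sample string uses \u{00E9} escapes (PHP 7.0+) so the example does not depend on how the source file is saved.

```php
<?php
// Assumption: a UTF-8 locale is installed; //TRANSLIT results depend on it.
// setlocale() simply returns false if the locale is missing.
setlocale(LC_CTYPE, 'en_US.UTF-8');

// Hypothetical helper: turn a UTF-8 string into an ASCII slug for URLs.
function make_slug(string $text): string
{
    // Transliterate what can be transliterated and drop the rest.
    // @ hides the notice; a false return value is handled just below.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text);
    if ($ascii === false) {
        $ascii = $text; // fall back to the raw bytes
    }
    // Lower-case, then replace every run of other bytes with one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($ascii));
    return trim($slug, '-');
}

$name = "Ren\u{00E9} & friends";                  // "René & friends"
$slug = make_slug($name);                         // value for the URL column
$html = htmlentities($name, ENT_QUOTES, 'UTF-8'); // escape only when displaying

echo $slug, "\n"; // ASCII only; exact letters depend on the platform
echo $html, "\n"; // Ren&eacute; &amp; friends
```

The original string goes into the second column untouched, and htmlentities() is applied only at output time, so the stored data never contains HTML entities.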

I tried some of the suggestions from php.net below, but they still don't work:

I had a situation where I needed some characters transliterated, but the others ignored (for weird diacritics like ayn or hamza). Adding //TRANSLIT//IGNORE seemed to do the trick for me. It transliterates everything that is able to be transliterated, but then throws out stuff that can't be.

So:

$string = "ʿABBĀSĀBĀD";

echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $string);
// output: [nothing, and you get a notice]

echo iconv('UTF-8', 'ISO-8859-1//IGNORE', $string);
// output: ABBSBD

echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT//IGNORE', $string);
// output: ABBASABAD
// Yay! That's what I wanted!

and another,

Andries Seutens 07-Nov-2009 07:38
When doing transliteration, you have to make sure that your LC_COLLATE is properly set, otherwise the default POSIX will be used.

To transform "rené" into "rene" we could use the following code snippet:
setlocale(LC_CTYPE, 'nl_BE.utf8');

$string = 'rené';
$string = iconv('UTF-8', 'ASCII//TRANSLIT', $string);

echo $string; // outputs rene

How can I actually get these to work?

Thanks.

EDIT:

This is the source file I use to test the code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" class="no-js">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<?php
$string = "Löic & René";
$output = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $string);
?>
</html>

Accepted answer by wimvds

And did you save your source file in UTF-8 encoding? If not (and I guess you didn't since that will produce the "incomplete multibyte character" error), then try that first.
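
One way to check this from PHP itself is to test whether the bytes in the string form valid UTF-8. A minimal sketch, assuming only the bundled PCRE functions; is_valid_utf8() is a hypothetical helper name, and the two sample strings spell out the bytes explicitly so the result does not depend on the file's own encoding:

```php
<?php
// Hypothetical helper: true when $bytes is a well-formed UTF-8 sequence.
function is_valid_utf8(string $bytes): bool
{
    // With the /u modifier, PCRE rejects subjects that are not valid UTF-8,
    // so preg_match() returns false instead of 1.
    return preg_match('//u', $bytes) === 1;
}

$utf8   = "Ren\xC3\xA9"; // "René" saved as UTF-8 (é is the bytes C3 A9)
$latin1 = "Ren\xE9";     // "René" saved as ISO-8859-1 (é is the byte E9)

var_dump(is_valid_utf8($utf8));   // bool(true)
var_dump(is_valid_utf8($latin1)); // bool(false)
```

A file saved as ISO-8859-1 produces the second kind of string, and a lone E9 byte looks like the start of a multibyte sequence that never finishes, which fits the "incomplete multibyte character" notice.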

Answer by Riccardo

$clean = iconv('UTF-8', 'ASCII//TRANSLIT', utf8_encode($s));
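
Note that utf8_encode() always treats its input as ISO-8859-1, and it is deprecated as of PHP 8.2. A minimal sketch of the same conversion step using iconv() alone, assuming the input really is ISO-8859-1 (the variable names are illustrative):

```php
<?php
$latin1 = "Ren\xE9"; // "René" as ISO-8859-1 bytes

// Same conversion utf8_encode() performs: ISO-8859-1 to UTF-8.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);
var_dump($utf8 === "Ren\xC3\xA9"); // bool(true)

// Caveat: converting a string that is already UTF-8 double-encodes it.
$double = iconv('ISO-8859-1', 'UTF-8', $utf8);
echo bin2hex($double), "\n"; // 52656ec383c2a9
```

Because of the double-encoding caveat, this step only helps when the source string is not already UTF-8.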