Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/4691477/

Date: 2020-08-25 13:59:28  Source: igfitidea

PHP: Converting Unicode strings to ANSI strings

Tags: php, string, unicode, utf-8

Asked by pyon

Does PHP have any standard function(s) to convert Unicode strings to plain, good old-fashioned ANSI strings (or whatever format PHP's htmlentities understands)?

Is there any function that converts UTF-8 strings to HTML that can be understood by the most popular browsers?

Answered by Christian Kuetbach

This can't work properly. Unicode can represent far more characters than ANSI, so if you "convert" to ANSI, you will lose a lot of characters.

http://php.net/manual/en/function.htmlentities.php

You can use Unicode (UTF-8) charset with htmlentities:

string htmlentities ( string $string [, int $flags = ENT_COMPAT [, string $charset [, bool $double_encode = true ]]] )

htmlentities($myString, ENT_COMPAT, "UTF-8"); should work.
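
As a rough illustration of that call, here is a minimal sketch. The helper name escapeForHtml and the sample string are made up for this example, and it assumes the input really is valid UTF-8.

```php
<?php
// Hypothetical helper (not part of PHP): wraps htmlentities() with an explicit UTF-8 charset.
function escapeForHtml(string $text): string
{
    // ENT_COMPAT converts double quotes to &quot; and leaves single quotes alone.
    return htmlentities($text, ENT_COMPAT, 'UTF-8');
}

// "\u{E9}" is "é" and "\u{E8}" is "è", written as escapes so the example
// does not depend on how this file itself is saved.
$myString = "Caf\u{E9} \"cr\u{E8}me\" <b>";

echo escapeForHtml($myString), PHP_EOL;
// prints: Caf&eacute; &quot;cr&egrave;me&quot; &lt;b&gt;
```

The result contains only ASCII characters, so it displays correctly even when the page's declared encoding is not UTF-8.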

Answered by John Parker

Whilst I'd really recommend keeping everything in UTF-8 (as per my comment on the question), you can use the mb_convert_encoding function to convert any known UTF-8 string to US-ASCII as such:

$asciiString = mb_convert_encoding ($sourceString, 'US-ASCII', 'UTF-8');

However, this may not be a lossless conversion depending on the source character string. (Characters such as "é" have no ASCII equivalent and are lost; by default mbstring replaces each one with "?".)
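
One way to see whether a particular string survives the conversion is to convert it back and compare. The sketch below assumes the mbstring extension is loaded; the variable names $roundTrip and $isLossless are only for illustration.

```php
<?php
// Requires the mbstring extension.
$sourceString = "caf\u{E9}"; // "café" as UTF-8: 5 bytes, because "é" takes two

$asciiString = mb_convert_encoding($sourceString, 'US-ASCII', 'UTF-8');

// Convert back to UTF-8 and compare with the original to detect data loss.
$roundTrip  = mb_convert_encoding($asciiString, 'UTF-8', 'US-ASCII');
$isLossless = ($roundTrip === $sourceString);

var_dump($asciiString); // the "é" is no longer present
var_dump($isLossless);  // bool(false) for this input
```

For a string that is already pure ASCII, the same check reports a lossless conversion.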

Answered by Ignacio Vazquez-Abrams

Browsers already understand UTF-8. If you want them to know that you're sending them UTF-8 then you need to tell them.
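
A common way to tell them is to declare the charset both in the Content-Type HTTP header and in a meta tag. The following is only a sketch of that idea; renderPage is a hypothetical helper and the page content is invented for the example.

```php
<?php
// Hypothetical helper: builds a minimal HTML page whose text stays in UTF-8.
function renderPage(string $utf8Text): string
{
    // Escape only the HTML special characters; "é" and friends stay as UTF-8 bytes.
    $safe = htmlspecialchars($utf8Text, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n"
         . "<html>\n<head>\n"
         . "  <meta charset=\"utf-8\">\n"
         . "  <title>UTF-8 demo</title>\n"
         . "</head>\n<body>\n<p>{$safe}</p>\n</body>\n</html>\n";
}

// Declare the encoding in the HTTP header; this must happen before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

echo renderPage("Caf\u{E9} & cr\u{E8}me");
```

With the charset declared this way, there is no need to turn non-ASCII characters into entities; only characters such as & and < still need escaping.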
