Original question: http://stackoverflow.com/questions/5782506/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
PHP convert foreign characters with accents
Asked by Devin Crossman
Hi, I'm trying to compare some text to text stored in a database. In the database, any text with an accent is encoded as in HTML (i.e. &eacute;). When I compare the database text to my string, it doesn't match, because my string just contains é. When I use the PHP function htmlentities to encode the string first, the é turns into &Atilde;&copy;, which is weird. Using htmlspecialchars doesn't encode the é at all.
How would you suggest I compare é to &eacute;, as well as all the other accented characters?
Answered by Emil Vikström
You need to send in the correct charset to htmlentities. It looks like you're using UTF-8, but the default is ISO-8859-1. Change it like this:
$encoded = htmlentities($text, ENT_COMPAT, 'UTF-8');
Another solution is to convert the text to ISO-8859-1 before encoding, but that may destroy information (ISO-8859-1 does not contain nearly as many characters as UTF-8). If you want to try that instead, do it like this:
$encoded = htmlentities(utf8_decode($text));
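To see why the charset argument matters, here is a minimal sketch. The variable names and the stored value are illustrative assumptions, and é is written as explicit UTF-8 byte escapes so the result does not depend on how the file itself is saved. On PHP 5.4 and later htmlentities already defaults to UTF-8, so the explicit argument mainly matters on older versions or when the configured default charset differs.

```php
<?php
// "é" as UTF-8 bytes, written as escapes so the file encoding is irrelevant.
$text = "caf\xC3\xA9";

// Treating the UTF-8 bytes as ISO-8859-1 encodes each byte separately.
$wrong = htmlentities($text, ENT_COMPAT, 'ISO-8859-1');

// Declaring the real charset yields a single entity for the character.
$right = htmlentities($text, ENT_COMPAT, 'UTF-8');

// Hypothetical value as it might be stored in the database.
$stored = 'caf&eacute;';

echo $wrong, "\n";                 // caf&Atilde;&copy;
echo $right, "\n";                 // caf&eacute;
var_dump($right === $stored);      // bool(true)
```

The first result reproduces the symptom from the question; the second matches the entity-encoded database value.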
Answered by kaha
I'm working on a French site and had the same problem. This is the function I use:
function convert_accent($string)
{
    return htmlspecialchars_decode(htmlentities(utf8_decode($string)));
}
What it does: utf8_decode converts your string from UTF-8 to ISO-8859-1, then htmlentities converts everything to HTML entities, even tags. Since we want the tags back to normal, htmlspecialchars_decode then converts them back. In the end you get a string with the accents converted and the tags untouched. You can pass your email content through this function before sending it to the recipient.
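The same encode-then-restore idea can be sketched without the round trip through ISO-8859-1, assuming the input is UTF-8. The function name convert_accent_utf8 is hypothetical and not part of the answer above; this is one possible variant, not the author's method.

```php
<?php
// Hypothetical variant of convert_accent() that assumes UTF-8 input
// and avoids the lossy conversion to ISO-8859-1.
function convert_accent_utf8(string $string): string
{
    // Turn every character that has a named entity into that entity,
    // including the markup characters <, > and &. Quotes are left alone.
    $encoded = htmlentities($string, ENT_NOQUOTES, 'UTF-8');

    // Restore &lt;, &gt; and &amp; so that tags are left intact.
    return htmlspecialchars_decode($encoded, ENT_NOQUOTES);
}

echo convert_accent_utf8("<b>caf\xC3\xA9</b>"), "\n"; // <b>caf&eacute;</b>
```

Because nothing is converted to ISO-8859-1, characters outside that charset are not replaced with question marks along the way.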
Another issue you might face is that, with this function, content from the database is sometimes converted to ?. In that case, run this before your query:
mysql_query("SET NAMES `utf8`");
You may or may not need to do this; it depends on the encoding of your table. I hope it helps.
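The mysql_* functions were removed in PHP 7, so on current versions the connection charset is usually declared when connecting instead. A minimal sketch, assuming PDO with the MySQL driver; the helper name, host, database name and the utf8mb4 charset are placeholder assumptions, not values from the answer:

```php
<?php
// Build a PDO MySQL DSN that declares the connection charset up front.
// Host, database name and charset are placeholder values.
function build_mysql_dsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = build_mysql_dsn('localhost', 'example_db');
echo $dsn, "\n"; // mysql:host=localhost;dbname=example_db;charset=utf8mb4

// With real credentials the connection would be opened like this:
// $pdo = new PDO($dsn, 'user', 'password');
```

Setting the charset in the DSN plays the same role as SET NAMES: it tells the server which encoding the client sends and expects.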
Answered by user3772503
Ran into similar issues recently. Followed Emil's answer and it worked fine locally but not on our dev/stage environments. I ended up using this and it worked all around:
$title = html_entity_decode(utf8_decode($item));
Thanks for leading me in the right direction!
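Decoding the database value instead of encoding the input can also be used for the comparison itself. A minimal sketch, assuming the input string is UTF-8 and the stored value uses named HTML entities; the helper name matches_stored_text is hypothetical:

```php
<?php
// Hypothetical helper: does the entity-encoded database value match
// the plain UTF-8 input string?
function matches_stored_text(string $input, string $stored): bool
{
    // Turn entities such as &aacute; back into UTF-8 characters.
    $decoded = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

    return $decoded === $input;
}

var_dump(matches_stored_text("Ju\xC3\xA1rez", 'Ju&aacute;rez')); // bool(true)
var_dump(matches_stored_text("caf\xC3\xA9", 'cafe'));            // bool(false)
```

Passing 'UTF-8' explicitly keeps the result the same across environments whose default charset differs, which may be the kind of local versus staging difference described above.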
Answered by Vikramraj
Simply use the function below; it works for Norwegian characters:
function convert_accent($string)
{
    return htmlspecialchars(utf8_decode($string));
}
Answered by PachinSV
The comparison depends on the charset and collation you selected when you created the database or the tables. If you are saving strings with a lot of accents, as in Spanish, I suggest you use the utf8 charset and the collation that best matches the language (English, French or whatever) you're using.
The best thing about using the correct charset in the database is that you can save the string in a natural way. For example, I can store my name as is, "Mario Juárez", with no need for weird conversions.