Latin-1 / UTF-8 encoding php

Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/16165611/

Date: 2020-08-25 10:38:37  Source: igfitidea

Tags: php, mysql, utf-8

Asked by Paul Stanley

I have a database in UTF-8 encoding with some Latin-1 content mixed in (I think that is the problem).

This is how the characters look in the database.

?° (should be ?)
è

When I set the header to

<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">

Then the characters come out as:

 ?
 ?

When I remove the header, they come out as they are in the database. I want them to come out like this:

 ?
 è

I'm looking for a way to remedy this in PHP after the fact, if that is possible. I cannot correct the data itself at this time, although that would be the right fix.

Answered by Jon

Your HTML output needs to be in a single encoding; there is no way around that. This means that content in different encodings needs to be converted to your HTML encoding first. While that is possible with iconv or mb_convert_encoding, there are two problems you have to solve:

  1. You need to know (or guess) the current encoding of the content
  2. You need to do this manually, everywhere

For example, a theoretical solution would be to pick UTF-8 as your HTML encoding and then do this for all strings you are going to output:

$string = '...'; // from the database

// If it's not already UTF-8, convert to it
if (mb_detect_encoding($string, 'utf-8', true) === false) {
    $string = mb_convert_encoding($string, 'utf-8', 'iso-8859-1');
}

echo $string;

The code above assumes that non-UTF-8 content is encoded in latin-1, which is reasonable according to your question.
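
The same check can be wrapped in a small function so that it is applied consistently wherever database strings are output. This is only a sketch: the name to_utf8 is hypothetical, it uses mb_check_encoding rather than mb_detect_encoding, and it keeps the assumption that anything which is not valid UTF-8 is Latin-1.

```php
<?php
// Hypothetical helper name; not part of the original answer.
function to_utf8(string $string): string
{
    // Strings that already validate as UTF-8 are returned untouched.
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string;
    }
    // Assumption: everything else is Latin-1 (ISO-8859-1).
    return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
}
```

Calling echo to_utf8($row['name']); (with $row being whatever your query returned) would then replace a bare echo. Note the limitation: a Latin-1 string whose bytes happen to form valid UTF-8 is left as-is, so this is a heuristic rather than a guarantee.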

Answered by Miro Markaravanes

Maybe you should set utf8 as the connection character set, so that the characters are retrieved correctly. The default one may not be right for the characters you need.

More details here: mysql_set_charset
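
The mysql_* extension that mysql_set_charset belongs to was removed in PHP 7, so on current PHP versions the same idea would go through mysqli. A minimal sketch, assuming the mysqli extension is available; the function name open_utf8_connection and its parameters are placeholders and do not come from the question:

```php
<?php
// Hypothetical helper: opens a mysqli connection and sets its character set.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    $mysqli = new mysqli($host, $user, $pass, $db);
    if ($mysqli->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $mysqli->connect_error);
    }
    // Same intent as mysql_set_charset('utf8', $link) in the old extension.
    if (!$mysqli->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $mysqli->error);
    }
    return $mysqli;
}
```

This only changes how the server converts data on the way to PHP. If the stored bytes do not match the charset declared on the column, the result can still come out wrong.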

Answered by Michael Eugene Yuen

I know this is an old post, but in case someone comes across this issue, here is what I did to solve the problem.

1) export table(s) to sql

2) open sql with notepad++ or other editor

3) copy everything and paste it into a new file saved with a BOM (or paste it into Notepad and save as Unicode)

4) I have this in my exported file:

   /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
   /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
   /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
   /*!40101 SET NAMES latin1 */;

in which I change SET NAMES from latin1 to utf8:

   /*!40101 SET NAMES utf8 */;

If you don't have this line, simply add it. Then, in

CREATE TABLE IF NOT EXISTS `table_name` (
  -- column names....
) ENGINE=MyISAM AUTO_INCREMENT=301 DEFAULT CHARSET=latin1;

change

DEFAULT CHARSET=latin1;

to

DEFAULT CHARSET=utf8;

Delete the old tables (after backing them up, of course) and import this new file.
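
The two text substitutions described above (SET NAMES and DEFAULT CHARSET) could also be scripted instead of done by hand in an editor. A rough sketch, assuming the dump fits in memory; the function name retarget_dump_charset is hypothetical. It only rewrites the charset declarations. It does not re-encode the file's bytes, and it does not touch column-level CHARACTER SET or COLLATE clauses.

```php
<?php
// Hypothetical helper: rewrites latin1 charset declarations in an SQL dump to utf8.
function retarget_dump_charset(string $sql): string
{
    // SET NAMES latin1 -> SET NAMES utf8 (also matches inside /*!40101 ... */).
    $sql = preg_replace('/\bSET NAMES latin1\b/i', 'SET NAMES utf8', $sql);
    // DEFAULT CHARSET=latin1 -> DEFAULT CHARSET=utf8 on CREATE TABLE statements.
    return preg_replace('/\bDEFAULT CHARSET=latin1\b/i', 'DEFAULT CHARSET=utf8', $sql);
}
```

It would be used along the lines of file_put_contents('new.sql', retarget_dump_charset(file_get_contents('old.sql'))); where both file names are placeholders.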

It worked for me. Hope that helps.

Answered by Adam Solymos

You have to align three things in this case. It almost does not matter what the character encoding of a DB table's content is, because in MySQL you can set the character encoding of the communication between the DB server and your PHP script. See http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html. If you use SET NAMES / SET CHARACTER SET the right way, you can set up the connection so that you get UTF-8 characters anyway.

You need to check the "physical" (byte-level) character encoding of your PHP script file. Set it to UTF-8 in whichever text editor or IDE you use.

You need to use the appropriate HTML header; you wrote it correctly above.

If all things match properly, the result should be alright.

The only remaining source of trouble is when the textual content in the DB table has been stored with an incorrect character encoding.
