Warning: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/405684/

Time: 2020-08-24 22:38:09  Source: igfitidea

PHP/MySQL with encoding problems

php mysql encoding utf-8

Asked by luiscubal

I am having trouble with PHP regarding encoding.

I have a JavaScript/jQuery HTML5 page that interacts with my PHP script using $.post. However, PHP is facing a weird problem, probably related to encoding.

When I write

htmlentities("í")

I expect PHP to output &iacute;. However, it outputs &Atilde;&shy; instead. At first I thought I had made a mistake with the encodings, but

htmlentities("í")=="&iacute;"?"Good":"Fail";

outputs "Fail", whereas

htmlentities("í")=="&Atilde;&shy;"?"Good":"Fail";

outputs "Good". However, htmlentities($search, null, "utf-8") works as expected.

I want to have PHP communicate with a MySQL server, but it has encoding problems too, even if I use utf8_encode. What should I do?

EDIT: On the SQL command, writing

SELECT id,uid,type,value FROM users,profile
WHERE uid=id AND type='name' AND value='XXX';

where XXX contains no í chars, works as expected, but it does not if there is any 'í' char.

SET NAMES 'utf8';
SET CHARACTER SET 'utf8';
SELECT id,uid,type,value FROM users,profile
WHERE uid=id AND type='name' AND value='XXX';

This not only fails for í chars, but it ALSO fails for strings without any 'special' characters. Removing the ' chars from SET NAMES and SET CHARACTER SET doesn't seem to change anything.

I am connecting to the MySQL database using PDO.

EDIT 2: I am using MySQL version 5.1.30 of XAMPP for Linux.

EDIT 3: Running SHOW VARIABLES LIKE '%character%' from PhpMyAdmin outputs

character_set_client    utf8
character_set_connection    utf8
character_set_database  latin1
character_set_filesystem    binary
character_set_results   utf8
character_set_server    latin1
character_set_system    utf8
character_sets_dir  /opt/lampp/share/mysql/charsets/

Running the same query from my PHP script (with print_r) outputs:

Array
(
    [0] => Array
        (
            [Variable_name] => character_set_client
            [0] => character_set_client
            [Value] => latin1
            [1] => latin1
        )

    [1] => Array
        (
            [Variable_name] => character_set_connection
            [0] => character_set_connection
            [Value] => latin1
            [1] => latin1
        )

    [2] => Array
        (
            [Variable_name] => character_set_database
            [0] => character_set_database
            [Value] => latin1
            [1] => latin1
        )

    [3] => Array
        (
            [Variable_name] => character_set_filesystem
            [0] => character_set_filesystem
            [Value] => binary
            [1] => binary
        )

    [4] => Array
        (
            [Variable_name] => character_set_results
            [0] => character_set_results
            [Value] => latin1
            [1] => latin1
        )

    [5] => Array
        (
            [Variable_name] => character_set_server
            [0] => character_set_server
            [Value] => latin1
            [1] => latin1
        )

    [6] => Array
        (
            [Variable_name] => character_set_system
            [0] => character_set_system
            [Value] => utf8
            [1] => utf8
        )

    [7] => Array
        (
            [Variable_name] => character_sets_dir
            [0] => character_sets_dir
            [Value] => /opt/lampp/share/mysql/charsets/
            [1] => /opt/lampp/share/mysql/charsets/
        )

)

Running

SET NAMES 'utf8';
SET CHARACTER SET 'utf8';
SHOW VARIABLES LIKE '%character%'

outputs an empty array.

Answered by Eran Galperin

It's very important to specify the encoding of htmlentities to match that of the input, as you did in your final example but omitted in the first three.

htmlentities($text,ENT_COMPAT,'utf-8');
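
To illustrate why the charset argument matters, here is a minimal sketch that passes the same UTF-8 bytes to htmlentities twice, once declared as UTF-8 and once as ISO-8859-1. The variable names are made up for the example, and the string is written with byte escapes so the result does not depend on how the script file itself is saved.

```php
<?php
// "í" encoded as UTF-8 is the two bytes 0xC3 0xAD.
$text = "\xC3\xAD";

// Charset matches the input: the two bytes are read as one character.
$asUtf8 = htmlentities($text, ENT_COMPAT, 'UTF-8');

// Charset does not match: each byte is read as a separate ISO-8859-1
// character (0xC3 is "Ã", 0xAD is the soft hyphen).
$asLatin1 = htmlentities($text, ENT_COMPAT, 'ISO-8859-1');

echo $asUtf8, "\n";   // &iacute;
echo $asLatin1, "\n"; // &Atilde;&shy;
```

The second result is the output reported in the question. PHP versions before 5.4 assumed ISO-8859-1 when the third argument was omitted, which would explain it.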

Regarding communications with MySQL, you need to make sure the connection collation and character set match the data you are transmitting. You can either set this in the configuration file, or at runtime using the following queries:

SET NAMES utf8;
SET CHARACTER SET utf8;
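
Since the question connects through PDO, the runtime setting could be applied when the connection is opened. This is only a sketch under assumptions: the function names, host, database name and credentials are placeholders and not part of the original answer. It uses the `charset` option of the pdo_mysql DSN and additionally sends SET NAMES as a statement of its own through `exec()`.

```php
<?php
// Hypothetical helper: builds a pdo_mysql DSN that requests utf8.
function buildUtf8Dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8', $host, $dbname);
}

// Hypothetical helper: opens the connection and sets the charset.
function connectUtf8(string $host, string $dbname, string $user, string $password): PDO
{
    $pdo = new PDO(buildUtf8Dsn($host, $dbname), $user, $password);

    // Report SQL errors as exceptions instead of failing silently.
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // Sent on its own, not bundled with other statements in one call.
    $pdo->exec('SET NAMES utf8');

    return $pdo;
}

echo buildUtf8Dsn('localhost', 'test'), "\n";
```

A call such as `connectUtf8('localhost', 'test', 'user', 'secret')` would then return a connection whose client, connection and results character sets are utf8, assuming the server accepts the request.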

Make sure the table, database and server character sets match as well. There is one setting you can't change at run-time, and that's the server's character set. You need to modify it in the configuration file:

[mysqld]
character-set-server = utf8
default-character-set = utf8 
skip-character-set-client-handshake

Read more on character sets and collations in MySQL in the manual.

Answered by Anthony Accioly

Late revival. But for further reference here are some extra tips:

  1. Use mysql_set_charset instead of SET xxx
  2. Make sure you are saving the file with UTF-8 encoding (this is often overlooked)
  3. Set headers:
    <?php header("Content-type: text/html; charset=utf-8"); ?>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

  4. If your Apache server configuration contains an AddDefaultCharset directive with a different encoding, go yell at your host administrator.

Answered by EffectiX

I just ran into this issue. I have a whole website's content in Spanish, with all the special characters you can expect (áéíóúñ) and their capital letter versions.

In my case it was an inconsistency with the server charset/collation. Everything else was set to utf8 except the server charset, which was latin1. This caused all utf8 data entered in the database to display in its raw encoded form, e.g. í would show up as an A with a tilde (Ã) ...

I am using mysqli, and to fix it, I made use of the method explained above by Anthony Accioly (using mysql_set_charset). Said method has a mysqli version and that is what I used.
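
For reference, the mysqli counterpart of mysql_set_charset is `mysqli::set_charset` (or `mysqli_set_charset` in procedural style). A minimal sketch of how it might be called follows. The helper name and all connection parameters are placeholders, not details taken from this answer.

```php
<?php
// Hypothetical helper: all connection parameters are placeholders.
function openUtf8Mysqli(string $host, string $user, string $password, string $dbname): mysqli
{
    $link = new mysqli($host, $user, $password, $dbname);
    if ($link->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $link->connect_error);
    }

    // mysqli counterpart of mysql_set_charset().
    if (!$link->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $link->error);
    }

    return $link;
}
```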

After that, I was puzzled. I still had a mess when viewing my website. Of course, I didn't know that by changing that latin1 to utf8 I would also mess up the character encode/decode of the whole thing. So I used the help of an online string encoder/decoder to fix my table data.

I made various exports of all my content data (you can set them up to produce update queries, which makes the update process faster) and ran the SQL output through the aforementioned online encoder/decoder, then copy-pasted the fixed queries into the phpMyAdmin SQL panel... thus fixing my encoding errors. Everything is now how it should be, AND I am able to process lossy searches again: Maria, maria, maría, mariá will all match maría, maria, Maria, etc. All acute characters evaluate to their base vowel character. Epic Win.
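
The answer does not say which online tool or conversion was used. As a rough local equivalent, the sketch below undoes one round of "UTF-8 bytes read as latin1 and stored as UTF-8 again" using the mbstring extension. The function name is hypothetical, and it assumes the damaged text only contains characters in the ISO-8859-1 range. A value that legitimately contains a sequence such as "Ã" followed by a soft hyphen would be altered as well, so test it on a copy of the data first.

```php
<?php
// Hypothetical helper: undo one round of "UTF-8 bytes read as latin1
// and stored as UTF-8 again". Requires the mbstring extension.
function repairDoubleEncoded(string $value): string
{
    // Mapping each character back to a single ISO-8859-1 byte
    // restores the original UTF-8 byte sequence.
    $candidate = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');

    // Keep the input when the result is not valid UTF-8, which suggests
    // the value was not double-encoded in the first place.
    return mb_check_encoding($candidate, 'UTF-8') ? $candidate : $value;
}

$broken = "mar\xC3\x83\xC2\xADa"; // "maría" after the bad round trip
echo repairDoubleEncoded($broken), "\n"; // maría
```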
