Using utf8mb4 with PHP and MySQL

Warning: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/16893035/

Time: 2020-08-25 11:51:40  Source: igfitidea

php mysql

Asked by nourdine

I have read that MySQL >= 5.5.3 fully supports every possible character if you use the utf8mb4 encoding for a given table/column: http://mathiasbynens.be/notes/mysql-utf8mb4

Looks nice. Only I noticed that the mb_* functions in PHP do not! I cannot find utf8mb4 anywhere in the list: http://php.net/manual/en/mbstring.supported-encodings.php

Not only have I read things but I also made a test.

I have added data to a mysql utf8mb4 table using a php script where the internal encoding was set to UTF-8: mb_internal_encoding("UTF-8");

and, as expected, the characters look garbled once they are in the db.

Any idea how I can make PHP and MySQL talk the same encoding (possibly a 4-byte one) and still have FULL support for any world language?

Also why is utf8mb4 different from utf32?

Answered by deceze

MySQL's utf8 encoding is not actual UTF-8. It's an encoding that is kind of like UTF-8, but it only supports a subset of what UTF-8 supports (characters that take at most 3 bytes). utf8mb4 is actual UTF-8. This difference is an internal implementation detail of MySQL. Both look like UTF-8 on the PHP side: whether you use utf8 or utf8mb4, PHP will get valid UTF-8 in both cases.

What you need to make sure of is that the connection encoding between PHP and MySQL is set to utf8mb4. If it's set to utf8, MySQL will not support all characters. You set this connection encoding using mysql_set_charset() (mysqli_set_charset() in the mysqli extension), the PDO charset DSN connection parameter, or whatever other method is appropriate for your database API of choice.
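
As an illustration of setting the connection encoding, here is a minimal sketch using the mysqli extension. The helper name connect_mysqli_utf8mb4 and all credentials are hypothetical placeholders, not something from the answer above; only mysqli's own constructor and set_charset() method are relied on.

```php
<?php
// Hypothetical helper: opens a mysqli connection and switches the
// connection encoding to utf8mb4. Credentials are placeholders.
function connect_mysqli_utf8mb4(string $host, string $user, string $pass, string $db): mysqli
{
    $mysqli = new mysqli($host, $user, $pass, $db);
    if ($mysqli->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $mysqli->connect_error);
    }
    // set_charset() changes the encoding used on the wire between PHP and MySQL.
    if (!$mysqli->set_charset('utf8mb4')) {
        throw new RuntimeException('Could not set charset: ' . $mysqli->error);
    }
    return $mysqli;
}
```

A call such as connect_mysqli_utf8mb4('localhost', 'user', 'secret', 'XYZ') would then return a connection that can carry 4-byte characters, assuming the server is MySQL 5.5.3 or later.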



mb_internal_encoding just sets the default value for the $encoding parameter that all mb_* functions have. It has nothing to do with MySQL.
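
A small sketch of what that default means in practice, assuming the mbstring extension is loaded. The sample characters are my own choice; the point is that mbstring's "UTF-8" already handles 4-byte sequences, so there is no separate "utf8mb4" name to look for on the PHP side.

```php
<?php
$euro  = "\u{20AC}";  // EURO SIGN, 3 bytes in UTF-8
$emoji = "\u{1F600}"; // GRINNING FACE, 4 bytes in UTF-8

mb_internal_encoding('UTF-8');

// With the internal encoding set, the $encoding argument can be omitted.
$implicit = mb_strlen($emoji);
$explicit = mb_strlen($emoji, 'UTF-8');

// strlen() counts bytes, mb_strlen() counts characters.
printf("bytes=%d chars=%d\n", strlen($emoji), $implicit);

// mbstring's "UTF-8" accepts the 4-byte sequence as valid.
$valid = mb_check_encoding($emoji, 'UTF-8');
```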

UTF-8 and UTF-32 differ in how they encode characters. UTF-8 uses a minimum of 1 byte for a character and a maximum of 4. UTF-32 always uses 4 bytes for every character. UTF-16 uses a minimum of 2 bytes and a maximum of 4.
Due to its variable length, UTF-8 has a little bit of overhead. A character which can be encoded in 2 bytes in UTF-16 may take 3 bytes in UTF-8; on the other hand, UTF-16 never uses fewer than 2 bytes. If you're storing lots of Asian text, UTF-16 may use less storage. If most of your text is English/ASCII, UTF-8 uses less storage. UTF-32 always uses the most storage.
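
The byte counts above can be checked with a short sketch, assuming the mbstring extension is available. The function name encoded_lengths and the sample characters are illustrative only; the big-endian variants are used so that no byte order mark is counted.

```php
<?php
// Byte length of one UTF-8 encoded character in three Unicode encodings.
function encoded_lengths(string $char): array
{
    return [
        'UTF-8'  => strlen($char),
        'UTF-16' => strlen(mb_convert_encoding($char, 'UTF-16BE', 'UTF-8')),
        'UTF-32' => strlen(mb_convert_encoding($char, 'UTF-32BE', 'UTF-8')),
    ];
}

$samples = [
    'A'     => 'A',
    'euro'  => "\u{20AC}",
    'kanji' => "\u{6F22}",
    'emoji' => "\u{1F600}",
];
foreach ($samples as $label => $char) {
    $len = encoded_lengths($char);
    printf("%-6s UTF-8=%d UTF-16=%d UTF-32=%d\n", $label, $len['UTF-8'], $len['UTF-16'], $len['UTF-32']);
}
```

The kanji row is the case where UTF-16 is smaller (2 bytes against 3), and the 'A' row is the case where UTF-8 is smaller (1 byte against 2).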

Answered by Miguel

This is what I used, and it worked well for my problem with the euro sign and with json_encode failing on conversion.

PHP configuration script (API etc.)

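// Tell the client that the response body is UTF-8.
header('Content-Type: text/html; charset=utf-8');
// Default charset for PHP output and for the mb_* functions.
ini_set("default_charset", "UTF-8");
mb_internal_encoding("UTF-8");
// Deprecated since PHP 5.6, where default_charset covers these settings.
iconv_set_encoding("internal_encoding", "UTF-8");
iconv_set_encoding("output_encoding", "UTF-8");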

MySQL tables (or specific columns)

utf8mb4

MySQL PDO connection

$dsn = 'mysql:host=yourip;dbname=XYZ;charset=utf8mb4';

(...your connection ...)

Before executing queries (might not be required):

$dbh->exec("set names utf8mb4");
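
Putting the DSN and connection steps together, a minimal sketch could look like the following. The helper names utf8mb4_dsn and connect_utf8mb4 are hypothetical, and the host, database, and credentials are placeholders. PDO honors the charset DSN parameter since PHP 5.3.6, which is presumably why the extra "set names" call might not be required.

```php
<?php
// Hypothetical helper: builds a MySQL DSN that requests utf8mb4
// as the connection encoding.
function utf8mb4_dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

// Hypothetical helper: opens the PDO connection and makes errors throw,
// so encoding problems are not silently ignored.
function connect_utf8mb4(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(utf8mb4_dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}
```

For example, $dbh = connect_utf8mb4('yourip', 'XYZ', 'user', 'secret'); would give a handle equivalent to the one used above.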

Answered by Arnaud Le Blanc

  • utf-32: This is a character encoding using a fixed 4 bytes per character.
  • utf-8: This is a character encoding using up to 4 bytes per character, but the most frequent characters are coded on only 1, 2 or 3 bytes.

MySQL's utf8 doesn't support characters coded on more than 3 bytes, so they added utf8mb4, which is really UTF-8.
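
To see which strings are affected by the 3-byte limit, one option is to test for code points above U+FFFF, since those are the ones UTF-8 encodes in 4 bytes. This is a sketch with a hypothetical function name, assuming the input is valid UTF-8.

```php
<?php
// Returns true when $text contains a code point above U+FFFF, i.e. one
// that takes 4 bytes in UTF-8 and does not fit in MySQL's utf8.
// Invalid UTF-8 input makes preg_match() return false, so the result is false.
function needs_utf8mb4(string $text): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}
```

The euro sign, for instance, takes 3 bytes and fits in MySQL's utf8, while most emoji lie above U+FFFF and need utf8mb4.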

Answered by James Gwee

Before running your actual query, run mysql_query('SET NAMES utf8mb4').

Also make sure your MySQL server is configured to use utf8mb4 too. For more information on how, refer to this article: https://mathiasbynens.be/notes/mysql-utf8mb4#utf8-to-utf8mb4
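
One way to check what the server and the current connection actually use is to read MySQL's character set variables. The sketch below assumes an already open PDO handle; the function name charset_variables is hypothetical.

```php
<?php
// Hypothetical helper: returns MySQL's character_set_* variables as
// a name => value array for the given connection.
function charset_variables(PDO $dbh): array
{
    $stmt = $dbh->query("SHOW VARIABLES LIKE 'character_set%'");
    // SHOW VARIABLES yields two columns (Variable_name, Value),
    // which FETCH_KEY_PAIR maps to key => value.
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}
```

If the setup is consistent, entries such as character_set_client, character_set_connection and character_set_results should read utf8mb4.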