How to make MySQL handle UTF-8 properly
Notice: this page is a Chinese-English side-by-side translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/202205/
Asked by Ben
One of the responses to a question I asked yesterday suggested that I should make sure my database can handle UTF-8 characters correctly. How can I do this with MySQL?
Accepted answer by Owen
Update:
Short answer: you should almost always be using the utf8mb4 charset and the utf8mb4_unicode_ci collation.
To alter a database:
ALTER DATABASE dbname CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
See:
- Aaron's comment on this answer: How to make MySQL handle UTF-8 properly
- What's the difference between utf8_general_ci and utf8_unicode_ci
- Conversion guide: https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-conversion.html
Original Answer:
MySQL 4.1 and above supports UTF-8, but do not assume it is the default: before MySQL 8.0 the built-in default character set is latin1. You can verify the setting in your my.cnf file; remember to set both client and server (default-character-set and character-set-server).
If you have existing data that you wish to convert to UTF-8, dump your database, and import it back as UTF-8 making sure:
- use SET NAMES utf8 before you query/insert into the database
- use DEFAULT CHARSET=utf8 when creating new tables
- at this point your MySQL client and server should be in UTF-8 (see my.cnf). Remember that any languages you use (such as PHP) must be UTF-8 as well. Some versions of PHP will use their own MySQL client library, which may not be UTF-8 aware.
If you do want to migrate existing data, remember to back up first! Lots of weird chopping of data can happen when things don't go as planned!
Some resources:
- complete UTF-8 migration (cdbaby.com)
- article on UTF-8 readiness of PHP functions (note some of this information is outdated)
Answer by Javier
To make this 'permanent', in my.cnf:
[client]
default-character-set=utf8
[mysqld]
character-set-server = utf8
To check, go to the client and show some variables:
SHOW VARIABLES LIKE 'character_set%';
Verify that they're all utf8, except ..._filesystem, which should be binary, and ..._dir, which points somewhere in the MySQL installation.
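The check described above can also be scripted. The following is only a minimal sketch: it assumes you already have the rows of SHOW VARIABLES LIKE 'character_set%' as (name, value) pairs from whatever client library you use. The function name, the sample rows, and the charsets directory path are made up for illustration.

```python
# Sketch: validate rows as returned by SHOW VARIABLES LIKE 'character_set%'.
# The sample rows below are invented; real rows would come from your
# MySQL client library of choice.

# Variables that are expected to differ from the wanted charset.
EXPECTED_EXCEPTIONS = {
    "character_set_filesystem": "binary",
}
# character_sets_dir holds a filesystem path, not a charset name.
SKIPPED = {"character_sets_dir"}


def find_mismatches(rows, wanted="utf8"):
    """Return {variable: value} for every variable with an unexpected value."""
    mismatches = {}
    for name, value in rows:
        if name in SKIPPED:
            continue
        expected = EXPECTED_EXCEPTIONS.get(name, wanted)
        if value != expected:
            mismatches[name] = value
    return mismatches


sample_rows = [
    ("character_set_client", "utf8"),
    ("character_set_connection", "utf8"),
    ("character_set_database", "latin1"),  # deliberately wrong
    ("character_set_filesystem", "binary"),
    ("character_set_results", "utf8"),
    ("character_set_server", "utf8"),
    ("character_set_system", "utf8"),
    ("character_sets_dir", "/usr/share/mysql/charsets/"),  # hypothetical path
]

print(find_mismatches(sample_rows))
```

If you check for utf8mb4 instead (wanted="utf8mb4"), character_set_system will probably need to go into EXPECTED_EXCEPTIONS as well, since as far as I know the server fixes that one itself.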
Answer by T.W.R. Cole
MySQL 4.1 and above has a character set that it calls utf8 but which is actually only a subset of UTF-8 (it allows only characters of three bytes or fewer).
Use utf8mb4 as your charset if you want "full" UTF-8.
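To see which characters fall outside the three-byte limit, here is a small Python sketch (not part of the original answer; the helper names are my own). It relies only on the fact that UTF-8 needs four bytes for code points above U+FFFF.

```python
# Sketch: find characters that need four bytes in UTF-8 and therefore do not
# fit in MySQL's three-byte utf8 charset.


def utf8_byte_length(char):
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


def needs_utf8mb4(text):
    """True if any character lies above U+FFFF (four bytes in UTF-8)."""
    return any(ord(char) > 0xFFFF for char in text)


# "A", e-acute, a CJK character, and an emoji, written as escapes so the
# example does not depend on the encoding of this page.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

for sample in samples:
    print(repr(sample), utf8_byte_length(sample), needs_utf8mb4(sample))
```

Only the last sample (an emoji) takes four bytes, which is the kind of character that needs utf8mb4.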
Answer by Rick James
The short answer: Use utf8mb4 in 4 places:
- The bytes in your client are utf8, not latin1/cp1251/etc.
- SET NAMES utf8mb4 or something equivalent when establishing the client's connection to MySQL
- CHARACTER SET utf8mb4 on all tables/columns -- except columns that are strictly ascii/hex/country_code/zip_code/etc.
- <meta charset=UTF-8> if you are outputting to HTML. (Yes, the spelling is different here.)
The above links provide the "detailed canonical answer is required to address all the concerns". -- There is a space limit on this forum.
Edit
In addition to CHARACTER SET utf8mb4 containing "all" the world's characters, COLLATION utf8mb4_unicode_520_ci is arguably the 'best all-around' collation to use. (There are also Turkish, Spanish, etc., collations for those who want the nuances of those languages.)
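The first two items in the list (the bytes the client sends, and the charset the connection declares) have to agree. The Python sketch below only illustrates what a mismatch looks like. It does not talk to MySQL, and Python's latin-1 codec stands in for MySQL's latin1 here, which is an approximation.

```python
# Sketch: what goes wrong when the bytes a client sends and the charset the
# connection declares disagree.

text = "caf\u00e9"  # the word "cafe" with an acute accent on the e

utf8_bytes = text.encode("utf-8")      # what a UTF-8 client actually sends
latin1_bytes = text.encode("latin-1")  # what a latin1 client would send

# UTF-8 bytes interpreted as latin1: the accented character turns into
# two garbage characters ("mojibake").
misread = utf8_bytes.decode("latin-1")

# latin1 bytes interpreted as UTF-8: the byte sequence is not valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

print(utf8_bytes)
print(misread)
print(latin1_is_valid_utf8)
```

Doubled garbage characters in stored text usually point at the first kind of mismatch; errors or truncated strings on insert are more typical of the second.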
Answer by extraneon
The charset is a property of the database (default) and the table. You can have a look (MySQL commands):
show create database foo;
> CREATE DATABASE `foo` /*!40100 DEFAULT CHARACTER SET latin1 */
show create table foo.bar;
> lots of stuff ending with
> ) ENGINE=InnoDB AUTO_INCREMENT=252 DEFAULT CHARSET=latin1
In other words, it's quite easy to check your database charset or change it:
ALTER TABLE `foo`.`bar` CHARACTER SET utf8;
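If you want to do the same check from a script, a rough sketch follows. It assumes you have the text of SHOW CREATE DATABASE or SHOW CREATE TABLE as a string; the function name and the sample strings are mine, modeled on the output quoted above.

```python
import re

# Sketch: pull the default charset out of SHOW CREATE ... output.
# Handles both spellings: "DEFAULT CHARSET=x" and "DEFAULT CHARACTER SET x".
CHARSET_PATTERN = re.compile(
    r"DEFAULT\s+(?:CHARSET|CHARACTER SET)\s*=?\s*(\w+)",
    re.IGNORECASE,
)


def default_charset(create_statement):
    """Return the declared default charset, or None if none is declared."""
    match = CHARSET_PATTERN.search(create_statement)
    return match.group(1) if match else None


# Sample strings modeled on the output shown in the answer.
table_tail = ") ENGINE=InnoDB AUTO_INCREMENT=252 DEFAULT CHARSET=latin1"
database_line = "CREATE DATABASE `foo` /*!40100 DEFAULT CHARACTER SET latin1 */"

print(default_charset(table_tail))
print(default_charset(database_line))
```

Both calls should report latin1 for these samples. Individual columns can declare their own charset, which this sketch does not look at.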
Answer by Nyein Aung
To change the character set encoding to UTF-8 for the database itself, type the following ALTER DATABASE command at the mysql> prompt. Replace DBNAME with the database name:
ALTER DATABASE DBNAME CHARACTER SET utf8 COLLATE utf8_general_ci;
This is a duplicate of this question: How to convert an entire MySQL database characterset and collation to UTF-8?
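With many tables it can be handy to generate the statements rather than type them. The sketch below only builds strings and executes nothing. Two things in it are my assumptions, not part of this answer: it defaults to utf8mb4 / utf8mb4_unicode_ci as the accepted answer recommends, and for tables it uses ALTER TABLE ... CONVERT TO CHARACTER SET, which converts existing columns, whereas ALTER DATABASE only changes the default for tables created later. The database and table names are placeholders.

```python
# Sketch: generate ALTER statements for a database and its tables.
# Nothing is sent to a server; review the output before running it.


def quote_identifier(name):
    """Backtick-quote a MySQL identifier, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_conversion_statements(database, tables,
                                charset="utf8mb4",
                                collation="utf8mb4_unicode_ci"):
    """Return the ALTER statements as a list of strings."""
    db = quote_identifier(database)
    statements = [
        f"ALTER DATABASE {db} CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE {db}.{quote_identifier(table)} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements


# "foo", "bar" and "baz" are placeholder names.
for statement in build_conversion_statements("foo", ["bar", "baz"]):
    print(statement)
```

Back up first, as the accepted answer says: converting columns rewrites the stored data.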
Answer by Vlad Balan
I followed Javier's solution, but I added some different lines in my.cnf:
[mysqld]
skip-character-set-client-handshake
collation_server=utf8_unicode_ci
character_set_server=utf8
I found this idea here: http://dev.mysql.com/doc/refman/5.0/en/charset-server.html in the first/only user comment at the bottom of the page. He mentions that skip-character-set-client-handshake has some importance.
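Because option groups are matched by name, a misspelled header such as [myslqd] means mysqld never reads the lines under it. Here is a rough Python sketch for catching that kind of slip. It assumes the file is simple enough for the standard configparser module, and KNOWN_GROUPS is a short illustrative list, not MySQL's full set of group names.

```python
import configparser

# Sketch: parse a my.cnf-style fragment and flag unrecognized group names.
# KNOWN_GROUPS is deliberately short and only illustrative.
KNOWN_GROUPS = {"client", "mysql", "mysqld", "mysqldump"}

# Sample fragment with a deliberately misspelled group header.
MY_CNF = """
[client]
default-character-set=utf8

[myslqd]
skip-character-set-client-handshake
collation_server=utf8_unicode_ci
character_set_server=utf8
"""


def unknown_groups(config_text):
    """Return group names that are not in KNOWN_GROUPS."""
    # allow_no_value lets bare flags such as
    # skip-character-set-client-handshake parse without "=value".
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(config_text)
    return [name for name in parser.sections() if name not in KNOWN_GROUPS]


print(unknown_groups(MY_CNF))
```

Real option files that use !include directives or repeat an option may need extra handling before configparser accepts them.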
Answer by Edward Z. Yang
These tips on MySQL and UTF-8 may be helpful. Unfortunately, they don't constitute a full solution, just common gotchas.
Answer by Gaurav Lad
Set your database collation to UTF-8, then set the table collation to the database default.
Answer by Vipin Jain
The answer is that you can configure this through the MySQL settings. Some of my answer may go beyond the question, but it should still be helpful: it covers how to configure the Character Set and Collation.
For applications that store data using the default MySQL character set and collation (latin1, latin1_swedish_ci), no special configuration should be needed. If applications require data storage using a different character set or collation, you can configure character set information in several ways:
- Specify character settings per database. For example, applications that use one database might require utf8, whereas applications that use another database might require sjis.
- Specify character settings at server startup. This causes the server to use the given settings for all applications that do not make other arrangements.
- Specify character settings at configuration time, if you build MySQL from source. This causes the server to use the given settings for all applications, without having to specify them at server startup.
The examples shown here set the utf8 character set, as your question asks; they also set a collation (utf8_general_ci), which is more helpful.
Specify character settings per database
CREATE DATABASE new_db
DEFAULT CHARACTER SET utf8
DEFAULT COLLATE utf8_general_ci;
Specify character settings at server startup
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
Specify character settings at MySQL configuration time
shell> cmake . -DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci
To see the values of the character set and collation system variables that apply to your connection, use these statements:
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
This may be a lengthy answer, but it covers all the ways you can use. I hope my answer is helpful for you. For more information: http://dev.mysql.com/doc/refman/5.7/en/charset-applications.html