MySQL: differences between utf8 and latin1

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2708958/

Date: 2020-08-31 15:53:10  Source: igfitidea

Differences between utf8 and latin1

Tags: mysql, utf-8, installation, latin1

Asked by binbash

What is the difference between utf8 and latin1?

Answered by BalusC

UTF-8 is prepared for world domination, Latin1 isn't.

If you're trying to store non-Latin characters like Chinese, Japanese, Hebrew, Russian, etc. using Latin1 encoding, then they will end up as mojibake. You may find the introductory text of this article useful (and even more so if you know a bit of Java).
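
How mojibake arises can be sketched in a few lines of Python. This is only an illustration under assumptions: Python's `latin-1` codec (ISO-8859-1) stands in for MySQL's latin1, and the sample string is made up for the example.

```python
# Sample non-Latin text (Japanese), written with escapes: three characters.
text = "\u65e5\u672c\u8a9e"

# Latin1 has no code points for these characters, so a strict encode fails.
try:
    text.encode("latin-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False

# Mojibake: the UTF-8 bytes are misread as Latin1, one character per byte.
utf8_bytes = text.encode("utf-8")
mojibake = utf8_bytes.decode("latin-1")

# The damage can be undone only while the original bytes are still intact.
recovered = mojibake.encode("latin-1").decode("utf-8")

print(encodable, len(text), len(mojibake), recovered == text)
```

Three characters turn into nine garbage characters because each of them takes three bytes in UTF-8 and Latin1 reads every byte as its own character.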

Note that full 4-byte UTF-8 support was only introduced in MySQL 5.5. Before that version, UTF-8 went up to only 3 bytes per character, not 4. So it supported only the Basic Multilingual Plane (BMP) and not the supplementary planes, where e.g. most emoji live. If you want full 4-byte UTF-8 support, upgrade MySQL to at least 5.5 or go for another RDBMS like PostgreSQL. In MySQL 5.5+ it's called utf8mb4.
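
The 3-byte limit can be illustrated with a small Python sketch. It does not talk to MySQL; the helper names `utf8_len` and `fits_in_3_byte_utf8` are hypothetical and only model the rule that a character outside the BMP (code point above U+FFFF) needs 4 bytes in UTF-8.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_3_byte_utf8(s: str) -> bool:
    """True if every character lies in the BMP, i.e. needs at most 3 bytes."""
    return all(ord(ch) <= 0xFFFF for ch in s)


samples = {
    "A": "A",               # ASCII letter
    "e-acute": "\u00e9",    # Latin letter with accent
    "CJK": "\u4e2d",        # Chinese character in the BMP
    "emoji": "\U0001F600",  # emoji outside the BMP
}

for name, ch in samples.items():
    print(name, utf8_len(ch), fits_in_3_byte_utf8(ch))
```

Under this model, only the last sample would need the 4-byte variant, which is the case the answer names as utf8mb4.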

Answered by sepp2k

In latin1 each character is exactly one byte long. In utf8 a character can consist of more than one byte. Consequently utf8 can encode more characters than latin1 (and the characters they do have in common aren't necessarily represented by the same byte or byte sequence).
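
The point about shared characters can be checked with a short Python sketch. It assumes Python's `latin-1` codec (ISO-8859-1) as a stand-in for MySQL's latin1, and `byte_forms` is a hypothetical helper written for this example.

```python
def byte_forms(ch: str) -> tuple[bytes, bytes]:
    """Return the (latin1, utf8) byte encodings of a single character."""
    return ch.encode("latin-1"), ch.encode("utf-8")


# Both characters exist in latin1 and in utf8.
for ch in ["A", "\u00e9"]:
    as_latin1, as_utf8 = byte_forms(ch)
    print(repr(ch), as_latin1, as_utf8, as_latin1 == as_utf8)
```

The ASCII letter has the same single byte in both encodings, while the accented letter is one byte in latin1 and a two-byte sequence in utf8.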
