MySQL 如何在整个数据库中更改 CHARACTER SET（和 COLLATION）？

Question

提问by Jeg Bagus

Our previous programmer set the wrong collation in a table (Mysql). He set it up with Latin collation, when it should be UTF8, and now I have issues. Every record with Chinese and Japan character turn to ??? character.

我们以前的程序员在表（Mysql）中设置了错误的排序规则。他用拉丁语排序设置它，当它应该是 UTF8 时，现在我有问题。中日文的每张唱片都转了？？？特点。

Is possible to change collation and get back the detail of character?

是否可以更改排序规则并取回字符的详细信息？

Answer 1

回答by Timo Huovinen

change database collation:

更改数据库排序规则：

ALTER DATABASE <database_name> CHARACTER SET utf8 COLLATE utf8_unicode_ci;

change table collation:

更改表排序规则：

ALTER TABLE <table_name> CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

change column collation:

更改列排序规则：

ALTER TABLE <table_name> MODIFY <column_name> VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci;

More info:

更多信息：

Answer 2

回答by David Whittaker

Heres how to change all databases/tables/columns. Run these queries and they will output all of the subsequent queries necessary to convert your entire schema to utf8. Hope this helps!

这是更改所有数据库/表/列的方法。运行这些查询，它们将输出将整个架构转换为 utf8 所需的所有后续查询。希望这可以帮助！

-- Change DATABASE Default Collation

-- 更改数据库默认排序规则

SELECT DISTINCT concat('ALTER DATABASE `', TABLE_SCHEMA, '` CHARACTER SET utf8 COLLATE utf8_unicode_ci;')
from information_schema.tables
where TABLE_SCHEMA like  'database_name';

-- Change TABLE Collation / Char Set

-- 更改表排序规则/字符集

SELECT concat('ALTER TABLE `', TABLE_SCHEMA, '`.`', table_name, '` CHARACTER SET utf8 COLLATE utf8_unicode_ci;')
from information_schema.tables
where TABLE_SCHEMA like 'database_name';

-- Change COLUMN Collation / Char Set

-- 更改 COLUMN 排序规则/字符集

SELECT concat('ALTER TABLE `', t1.TABLE_SCHEMA, '`.`', t1.table_name, '` MODIFY `', t1.column_name, '` ', t1.data_type , '(' , t1.CHARACTER_MAXIMUM_LENGTH , ')' , ' CHARACTER SET utf8 COLLATE utf8_unicode_ci;')
from information_schema.columns t1
where t1.TABLE_SCHEMA like 'database_name' and t1.COLLATION_NAME = 'old_charset_name';

Answer 3

回答by bluecollarcoder

Beware that in Mysql, the utf8character set is only a subset of the real UTF8 character set. In order to save one byte of storage, the Mysql team decided to store only three bytes of a UTF8 characters instead of the full four-bytes. That means that some east asian language and emoji aren't fully supported. To make sure you can store all UTF8 characters, use the utf8mb4data type, and utf8mb4_binor utf8mb4_general_ciin Mysql.

请注意，在 Mysql 中，utf8字符集只是真正的 UTF8 字符集的一个子集。为了节省一个字节的存储空间，Mysql 团队决定只存储一个 UTF8 字符的三个字节，而不是完整的四个字节。这意味着不完全支持某些东亚语言和表情符号。为确保您可以存储所有 UTF8 字符，请使用Mysql 中的utf8mb4数据类型和utf8mb4_bin或utf8mb4_general_ci。

Answer 4

回答by Jacob Hundley

Adding to what David Whittaker posted, I have created a query that generates the complete table and columns alter statement that will convert each table. It may be a good idea to run

除了 David Whittaker 发布的内容之外，我还创建了一个查询，该查询生成完整的表和列的 alter 语句，该语句将转换每个表。运行可能是个好主意

SET SESSION group_concat_max_len = 100000;

设置会话 group_concat_max_len = 100000;

first to make sure your group concat doesn't go over the very small limit as seen here.

首先要确保您的组连接不会超过此处所见的非常小的限制。

     SELECT a.table_name, concat('ALTER TABLE ', a.table_schema, '.', a.table_name, ' DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_unicode_ci, ',
        group_concat(distinct(concat(' MODIFY ',  column_name, ' ', column_type, ' CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci ', if (is_nullable = 'NO', ' NOT', ''), ' NULL ',
        if (COLUMN_DEFAULT is not null, CONCAT(' DEFAULT \'', COLUMN_DEFAULT, '\''), ''), if (EXTRA != '', CONCAT(' ', EXTRA), '')))), ';') as alter_statement
    FROM information_schema.columns a
    INNER JOIN INFORMATION_SCHEMA.TABLES b ON a.TABLE_CATALOG = b.TABLE_CATALOG
        AND a.TABLE_SCHEMA = b.TABLE_SCHEMA
        AND a.TABLE_NAME = b.TABLE_NAME
        AND b.table_type != 'view'
    WHERE a.table_schema = ? and (collation_name = 'latin1_swedish_ci' or collation_name = 'utf8mb4_general_ci')
    GROUP BY table_name;

A difference here between the previous answer is it was using utf8 instead of ut8mb4 and using t1.data_type with t1.CHARACTER_MAXIMUM_LENGTH didn't work for enums. Also, my query excludes views since those will have to altered separately.

上一个答案之间的区别在于它使用 utf8 而不是 ut8mb4 并且使用 t1.data_type 和 t1.CHARACTER_MAXIMUM_LENGTH 不适用于枚举。此外，我的查询不包括视图，因为这些视图必须单独更改。

I simply used a Perl script to return all these alters as an array and iterated over them, fixed the columns that were too long (generally they were varchar(256) when the data generally only had 20 characters in them so that was an easy fix).

我只是使用 Perl 脚本将所有这些更改作为数组返回并对其进行迭代，修复太长的列（通常它们是 varchar(256) 当数据中通常只有 20 个字符时，这是一个简单的修复）。

I found some data was corrupted when altering from latin1 -> utf8mb4. It appeared to be utf8 encoded latin1 characters in columns would get goofed in the conversion. I simply held data from the columns I knew was going to be an issue in memory from before and after the alter and compared them and generated update statements to fix the data.

我发现从 latin1 -> utf8mb4 更改时，某些数据已损坏。列中的 utf8 编码 latin1 字符似乎会在转换中出错。我只是从我知道在更改前后将成为内存问题的列中保存数据，然后比较它们并生成更新语句来修复数据。

Answer 5

回答by MJB

heredescribes the process well. However, some of the characters that didn't fit in latin space are gone forever. UTF-8 is a SUPERSET of latin1. Not the reverse. Most will fit in single byte space, but any undefined ones will not (check a list of latin1 - not all 256 characters are defined, depending on mysql's latin1 definition)

这里很好地描述了这个过程。然而，一些不适合拉丁空间的字符已经一去不复返了。UTF-8 是 latin1 的 SUPERSET。不是反过来。大多数将适合单字节空间，但任何未定义的将不适合（检查 latin1 列表 - 并非所有 256 个字符都已定义，具体取决于 mysql 的 latin1 定义）

MySQL 如何在整个数据库中更改 CHARACTER SET（和 COLLATION）？

提问by Jeg Bagus

回答by Timo Huovinen

回答by David Whittaker

回答by bluecollarcoder

回答by Jacob Hundley

回答by MJB

相关推荐

最近更新

标签

MySQL 如何在整个数据库中更改 CHARACTER SET（和 COLLATION）？

提问by Jeg Bagus

回答by Timo Huovinen

回答by David Whittaker

回答by bluecollarcoder

回答by Jacob Hundley

回答by MJB

相关推荐

MySQL 在sql中设置总和的小数位

MySQL phpMyAdmin 导出/导入导致重复主键错误

MySQL 如何将参数传递给nodejs中的mysql查询回调

按字母顺序对 MySQL 结果进行排序，但数字最后

相关推荐

最近更新

标签