How to convert an entire MySQL database character set and collation to UTF-8?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/6115612/

Time: 2020-08-31 20:00:36  Source: igfitidea

Tags: mysql, character-encoding

Asked by Dean

How can I convert an entire MySQL database's character set and collation to UTF-8?

Answered by BalusC

Use the ALTER DATABASE and ALTER TABLE commands.

ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Or, if you're still on MySQL 5.5.2 or older, which doesn't support 4-byte UTF-8, use utf8 instead of utf8mb4:

ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
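
The version cutoff above can be turned into a small check. This is only a minimal sketch: it assumes the server version string has the usual major.minor.patch form (optionally followed by a suffix such as -log), it relies on utf8mb4 having been introduced in MySQL 5.5.3, and the function name charset_for_version is made up for illustration.

```shell
# Hypothetical helper: print the UTF-8 charset name to use for a given
# MySQL server version string such as "5.5.2" or "8.0.33-log".
charset_for_version() (
    version=${1%%-*}          # "5.5.2-log" -> "5.5.2"
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    case $rest in
        *.*) patch=${rest#*.} ;;
        *)   patch=0 ;;       # no patch component given
    esac
    # utf8mb4 is available from 5.5.3 onwards
    if [ "$major" -gt 5 ] ||
       { [ "$major" -eq 5 ] && [ "$minor" -gt 5 ]; } ||
       { [ "$major" -eq 5 ] && [ "$minor" -eq 5 ] && [ "$patch" -ge 3 ]; }
    then
        echo utf8mb4
    else
        echo utf8
    fi
)
```

It could be fed from the server itself, for example: charset_for_version "$(mysql -N -B -e 'SELECT VERSION()')"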

Answered by newspire

  1. Make a backup!

  2. Then you need to set the default char set on the database. This does not convert existing tables; it only sets the default for newly created tables.

    ALTER DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;
    
  3. Then you will need to convert the char set on all existing tables and their columns. This assumes that your current data is actually stored in the current char set. If your columns are set to one char set but your data is really stored in another, then you will need to check the MySQL manual on how to handle this.

    ALTER TABLE tbl_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    

Answered by Arnold Daniels

On the command-line shell

If you're on the command-line shell, you can do this very quickly. Just fill in "dbname" :D

DB="dbname"
(
    echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'
    mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names \
    | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;'
) \
| mysql "$DB"

One-liner for simple copy/paste

DB="dbname"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql "$DB"
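
If you would rather review the statements before running them, the generation step can be separated from the execution step. The following is a minimal sketch, assuming table names arrive one per line on standard input (as SHOW TABLES prints them with --batch --skip-column-names). The names gen_convert_sql and quote_ident are hypothetical, and the default charset and collation simply mirror the ones used above.

```shell
# Hypothetical helper: wrap a name in backticks, doubling any backtick
# inside it, which is how MySQL escapes quoted identifiers.
quote_ident() {
    printf '`%s`' "$(printf '%s' "$1" | sed 's/`/``/g')"
}

# Print the ALTER statements for database $1; table names are read from
# stdin, one per line. $2 and $3 optionally override charset and collation.
gen_convert_sql() (
    db=$1
    charset=${2:-utf8}
    collation=${3:-utf8_general_ci}
    printf 'ALTER DATABASE %s CHARACTER SET %s COLLATE %s;\n' \
        "$(quote_ident "$db")" "$charset" "$collation"
    while IFS= read -r table; do
        [ -n "$table" ] || continue     # skip empty lines
        printf 'ALTER TABLE %s CONVERT TO CHARACTER SET %s COLLATE %s;\n' \
            "$(quote_ident "$table")" "$charset" "$collation"
    done
)
```

A possible workflow: pipe the SHOW TABLES output into gen_convert_sql "$DB" > convert.sql, read convert.sql, and only then run mysql "$DB" < convert.sql. Note that SHOW TABLES also lists views, which would need to be removed from the file by hand.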

Answered by sdfor

You can create the SQL to update all tables with:

SELECT CONCAT("ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CHARACTER SET utf8 COLLATE utf8_general_ci;   ",
    "ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  ") 
    AS alter_sql
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database_name';  -- the schema name must be a quoted string

Capture the output and run it.

Arnold Daniels' answer above is more elegant.
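
The capture-and-run step could look roughly like this from a shell. It is a sketch under assumptions: build_generator_query is a made-up helper, the generated statements use only CONVERT TO (which also changes the table default), and views are filtered out with TABLE_TYPE = 'BASE TABLE'.

```shell
# Hypothetical helper: print a query that generates one ALTER TABLE
# statement per base table of schema $1 (views are skipped).
build_generator_query() (
    # Double single quotes so the name is safe inside an SQL string literal
    # (this assumes the default sql_mode; backslashes are not handled).
    schema=$(printf '%s' "$1" | sed "s/'/''/g")
    cat <<EOF
SELECT CONCAT('ALTER TABLE \`', TABLE_SCHEMA, '\`.\`', TABLE_NAME,
              '\` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;')
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = '$schema' AND TABLE_TYPE = 'BASE TABLE';
EOF
)
```

The output can then be captured with mysql -N -B -e "$(build_generator_query your_database_name)" > convert.sql and, after a look at the file, executed with mysql < convert.sql.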

Answered by John Yin

Before proceeding, make sure you have completed a full database backup!

Step 1: Database Level Changes

  • Identifying the Collation and Character set of your database

    SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM 
    information_schema.SCHEMATA S
    WHERE schema_name = 'your_database_name'
    AND
    (DEFAULT_CHARACTER_SET_NAME != 'utf8'
        OR
     DEFAULT_COLLATION_NAME not like 'utf8%');
    
  • Fixing the collation for the database

    ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
    

Step 2: Table Level Changes

  • Identifying Database Tables with the incorrect character set or collation

    SELECT CONCAT(
    'ALTER TABLE ',  table_name, ' CHARACTER SET utf8 COLLATE utf8_general_ci;  ', 
    'ALTER TABLE ',  table_name, ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  ')
    FROM information_schema.TABLES AS T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C
    WHERE C.collation_name = T.table_collation
    AND T.table_schema = 'your_database_name'
    AND
    (C.CHARACTER_SET_NAME != 'utf8'
        OR
     C.COLLATION_NAME not like 'utf8%')
    
  • Adjusting table columns' collation and character set

Capture the output of the query above and run it, for example:

ALTER TABLE rma CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_history CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_products CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_products CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_report_period CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_report_period CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_reservation CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_reservation CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_supplier_return CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_supplier_return_history CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  
ALTER TABLE rma_supplier_return_product CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return_product CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; 

Reference: https://confluence.atlassian.com/display/CONFKB/How+to+Fix+the+Collation+and+Character+Set+of+a+MySQL+Database
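
Once the generated statements have run, it can be worth confirming that no table was missed. The sketch below assumes rows of the form table, TAB, collation on standard input, which is what mysql -N -B prints for a two-column query; find_unconverted is a hypothetical name. The default prefix utf8 also accepts utf8mb4 collations.

```shell
# Hypothetical helper: read "table<TAB>collation" rows on stdin and print
# the tables whose collation does not start with the wanted prefix ($1).
# The exit status is 1 if any such table is found, 0 otherwise.
find_unconverted() {
    awk -F '\t' -v prefix="${1:-utf8}" '
        index($2, prefix) != 1 { print $1; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}
```

For example: mysql -N -B -e "SELECT table_name, table_collation FROM information_schema.TABLES WHERE table_schema = 'your_database_name' AND table_type = 'BASE TABLE'" | find_unconverted utf8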

Answered by Tom

Use HeidiSQL. It's free and a very good DB tool.

From the Tools menu, open the Bulk table editor.

Select the complete database or pick the tables to convert, then:

  • tick Change default collation: utf8mb4_general_ci
  • tick Convert to charset: utf8mb4

Execute

This converts the complete database from latin to utf8 in just a few seconds.

Works like a charm :)

HeidiSQL connects as utf8 by default, so when you inspect the table data, special characters should now be shown as the actual characters rather than as encoded sequences.

The real pitfall when moving from latin to utf8 is making sure PDO connects with the utf8 charset. If it doesn't, you will get rubbish data inserted into the utf8 table and question marks all over your web page, making you think the table data is not utf8...
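
The connection charset matters for any client, not only PDO. As a minimal sketch for SQL scripts that are piped into the mysql command-line client, the hypothetical with_set_names helper below prepends a SET NAMES statement so that the session character set is stated explicitly instead of being inherited from client defaults.

```shell
# Hypothetical helper: copy an SQL script from stdin to stdout with a
# SET NAMES statement in front, so the connection character set is explicit.
with_set_names() {
    printf 'SET NAMES %s;\n' "${1:-utf8}"
    cat
}
```

Usage could be with_set_names utf8 < import.sql | mysql dbname. The mysql client also accepts --default-character-set=utf8 for the same purpose, and in PDO the charset is usually given as the charset parameter of the DSN.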

Answered by Camoflame

Inspired by @sdfor's comment, here is a bash script that does the job:

#!/bin/bash

printf "### Converting MySQL character set ###\n\n"

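# The value entered here is used as the COLLATE argument below, so it must be a utf8 collation
printf "Enter the collation you want to set (e.g. utf8_general_ci): "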
read -r CHARSET

# Get the MySQL username
printf "Enter mysql username: "
read -r USERNAME

# Get the MySQL password
printf "Enter mysql password for user %s:" "$USERNAME"
read -rs PASSWORD

DBLIST=( mydatabase1 mydatabase2 )

printf "\n"


for DB in "${DBLIST[@]}"
do
(
    echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE `'"$CHARSET"'`;'
    mysql "$DB" -u"$USERNAME" -p"$PASSWORD" -e "SHOW TABLES" --batch --skip-column-names \
    | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE `'"$CHARSET"'`;'
) \
| mysql "$DB" -u"$USERNAME" -p"$PASSWORD"

echo "$DB database done..."
done

echo "### DONE ###"
exit
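
Passing the password as -p"$PASSWORD" puts it on the command line, which the mysql client itself warns can be insecure. One possible alternative, sketched here on the assumption that a temporary option file is acceptable in your environment, is to write the credentials to a private file and pass it with --defaults-extra-file, which has to be the first option on the command line. The helper name write_client_cnf is made up for illustration.

```shell
# Hypothetical helper: write user ($1) and password ($2) to a temporary
# option file readable only by the current user, and print its path.
write_client_cnf() (
    umask 077
    cnf=$(mktemp) || exit 1
    # Simple passwords only: quotes or backslashes in the password would
    # need escaping according to the option-file syntax.
    printf '[client]\nuser=%s\npassword="%s"\n' "$1" "$2" > "$cnf"
    printf '%s\n' "$cnf"
)

# Possible use inside the script above (not executed here):
#   CNF=$(write_client_cnf "$USERNAME" "$PASSWORD")
#   mysql --defaults-extra-file="$CNF" "$DB" -e "SHOW TABLES"
#   rm -f "$CNF"
```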

Answered by u9966042

DELIMITER $$  

CREATE PROCEDURE `databasename`.`update_char_set`()  

BEGIN  
 DECLARE done INT DEFAULT 0;  
 DECLARE tableName VARCHAR(128);  
 -- Only base tables: ALTER TABLE ... CONVERT TO does not apply to views
 DECLARE lists CURSOR FOR SELECT table_name FROM `information_schema`.`TABLES` WHERE table_schema = 'databasename' AND table_type = 'BASE TABLE';  
 DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done = 1;  
 OPEN lists;  
 FETCH lists INTO tableName;  
 -- WHILE (rather than REPEAT) so that nothing runs if the schema has no tables
 WHILE NOT done DO  
    -- Build and run one ALTER TABLE statement per table; backticks protect unusual names
    SET @t_sql = CONCAT('ALTER TABLE `', tableName, '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci');  
    PREPARE stmt FROM @t_sql;  
    EXECUTE stmt;  
    DEALLOCATE PREPARE stmt;  
    FETCH lists INTO tableName;  
 END WHILE;  
 CLOSE lists;  
END$$  

DELIMITER ;  

CALL databasename.update_char_set();

Answered by Dan Lucas

For databases with a large number of tables, you can use a simple PHP script to update the charset of the database and all of its tables:

$conn = mysqli_connect($host, $username, $password, $database);

// mysqli_connect() may return false on failure, so check with the procedural error function
if (!$conn) {
  die("Connection failed: " . mysqli_connect_error());
}

$alter_database_charset_sql = "ALTER DATABASE ".$database." CHARACTER SET utf8 COLLATE utf8_unicode_ci";
mysqli_query($conn, $alter_database_charset_sql);

$show_tables_result = mysqli_query($conn, "SHOW TABLES");
$tables  = mysqli_fetch_all($show_tables_result);

foreach ($tables as $index => $table) {
  $alter_table_sql = "ALTER TABLE ".$table[0]." CONVERT TO CHARACTER SET utf8  COLLATE utf8_unicode_ci";
  $alter_table_result = mysqli_query($conn, $alter_table_sql);
  echo "<pre>";
  var_dump($alter_table_result);
  echo "</pre>";
}

Answered by Frank Forte

In case the data is not in the same character set, you might consider this snippet from http://dev.mysql.com/doc/refman/5.0/en/charset-conversion.html:

If the column has a nonbinary data type (CHAR, VARCHAR, TEXT), its contents should be encoded in the column character set, not some other character set. If the contents are encoded in a different character set, you can convert the column to use a binary data type first, and then to a nonbinary column with the desired character set.

Here is an example:

 ALTER TABLE t1 CHANGE c1 c1 BLOB;
 ALTER TABLE t1 CHANGE c1 c1 VARCHAR(100) CHARACTER SET utf8;
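
When several columns need this treatment, the pair of statements can be generated rather than typed. This is a minimal sketch with a made-up helper name, gen_binary_roundtrip; it assumes plain table and column names (no backticks inside them) and uses BLOB as the intermediate type, as in the example above.

```shell
# Hypothetical helper: print the two-step "via binary" conversion for one
# column. Arguments: table, column, final column type, target charset.
gen_binary_roundtrip() {
    printf 'ALTER TABLE `%s` CHANGE `%s` `%s` BLOB;\n' "$1" "$2" "$2"
    printf 'ALTER TABLE `%s` CHANGE `%s` `%s` %s CHARACTER SET %s;\n' \
        "$1" "$2" "$2" "$3" "${4:-utf8}"
}
```

Keep in mind that CHANGE takes a full column definition, so attributes such as NOT NULL or DEFAULT have to be included in the final type argument or they are lost.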

Make sure to choose the right collation, or you might get unique key conflicts. e.g. éleanore and Eleanore might be considered the same in some collations.
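
One way to spot such conflicts before converting is to look for values that compare as equal under the target collation. The sketch below only prints a query; gen_duplicate_check is a hypothetical name, the table and column names are assumed to be plain identifiers, and the data is assumed to be stored correctly in the column's declared charset.

```shell
# Hypothetical helper: print a query that lists values of column $2 in
# table $1 that compare as equal under collation $4 of charset $3.
gen_duplicate_check() {
    printf 'SELECT CONVERT(`%s` USING %s) COLLATE %s AS v, COUNT(*) AS n\n' \
        "$2" "${3:-utf8}" "${4:-utf8_general_ci}"
    printf 'FROM `%s` GROUP BY v HAVING n > 1;\n' "$1"
}
```

Any rows returned by the printed query point at values that would collide in a unique key on that column.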

Aside:

I had a situation where certain characters "broke" in emails even though they were stored as UTF-8 in the database. If you are sending emails containing utf8 data, you might also want to send the emails themselves as UTF-8.

In PHPMailer, just update this line: public $CharSet = 'utf-8';
