Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation '='

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/11770074/

Date: 2020-08-31 14:23:02  Source: igfitidea

Tags: mysql, stored-procedures

Asked by Manatax

Error message on MySQL:

Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation '='

I have gone through several other posts and was not able to solve this problem. The part affected is something similar to this:

CREATE TABLE users (
    userID INT UNSIGNED NOT NULL AUTO_INCREMENT,
    firstName VARCHAR(24) NOT NULL,
    lastName VARCHAR(24) NOT NULL,
    username VARCHAR(24) NOT NULL,
    password VARCHAR(40) NOT NULL,
    PRIMARY KEY (userID)
) ENGINE = INNODB CHARACTER SET utf8 COLLATE utf8_unicode_ci;

CREATE TABLE products (
    productID INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(104) NOT NULL,
    picturePath VARCHAR(104) NULL,
    pictureThumb VARCHAR(104) NULL,
    creationDate DATE NOT NULL,
    closeDate DATE NULL,
    deleteDate DATE NULL,
    varPath VARCHAR(104) NULL,
    isPublic TINYINT(1) UNSIGNED NOT NULL DEFAULT '1',
    PRIMARY KEY (productID)
) ENGINE = INNODB CHARACTER SET utf8 COLLATE utf8_unicode_ci;

CREATE TABLE productUsers (
    productID INT UNSIGNED NOT NULL,
    userID INT UNSIGNED NOT NULL,
    permission VARCHAR(16) NOT NULL,
    PRIMARY KEY (productID,userID),
    FOREIGN KEY (productID) REFERENCES products (productID) ON DELETE RESTRICT ON UPDATE NO ACTION,
    FOREIGN KEY (userID) REFERENCES users (userID) ON DELETE RESTRICT ON UPDATE NO ACTION
) ENGINE = INNODB CHARACTER SET utf8 COLLATE utf8_unicode_ci;

The stored procedure I'm using is this:

CREATE PROCEDURE updateProductUsers (IN rUsername VARCHAR(24),IN rProductID INT UNSIGNED,IN rPerm VARCHAR(16))
BEGIN
    UPDATE productUsers
        INNER JOIN users
        ON productUsers.userID = users.userID
        SET productUsers.permission = rPerm
        WHERE users.username = rUsername
        AND productUsers.productID = rProductID;
END

I was testing with PHP, but SQLyog gives the same error. I have also tried recreating the entire DB, to no avail.

Any help will be much appreciated.

Answered by Ross Smith II

The default collation for stored procedure parameters is utf8_general_ci, and you can't mix collations, so you have four options:

Option 1: add COLLATE to your input variable:

SET @rUsername = 'aname' COLLATE utf8_unicode_ci; -- COLLATE added
CALL updateProductUsers(@rUsername, @rProductID, @rPerm);

Option 2: add COLLATE to the WHERE clause:

CREATE PROCEDURE updateProductUsers(
    IN rUsername VARCHAR(24),
    IN rProductID INT UNSIGNED,
    IN rPerm VARCHAR(16))
BEGIN
    UPDATE productUsers
        INNER JOIN users
        ON productUsers.userID = users.userID
        SET productUsers.permission = rPerm
        WHERE users.username = rUsername COLLATE utf8_unicode_ci -- COLLATE added
        AND productUsers.productID = rProductID;
END

Option 3: add it to the IN parameter definition:

CREATE PROCEDURE updateProductUsers(
    IN rUsername VARCHAR(24) COLLATE utf8_unicode_ci, -- COLLATE added
    IN rProductID INT UNSIGNED,
    IN rPerm VARCHAR(16))
BEGIN
    UPDATE productUsers
        INNER JOIN users
        ON productUsers.userID = users.userID
        SET productUsers.permission = rPerm
        WHERE users.username = rUsername
        AND productUsers.productID = rProductID;
END

Option 4: alter the field itself:

ALTER TABLE users CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; -- CONVERT TO also changes existing columns, not only the table default

Unless you need to sort data in Unicode order, I would suggest altering all your tables to use the utf8_general_ci collation, as it requires no code changes and will speed up sorts slightly.

UPDATE: utf8mb4/utf8mb4_unicode_ci is now the preferred character set and collation. utf8_general_ci is advised against, as the performance improvement is negligible. See https://stackoverflow.com/a/766996/1432614
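
Before picking one of the four options, it can help to see which side of the comparison carries which collation. The statements below are a minimal sketch against the table and procedure names from the question; what they report depends on your server and database defaults.

```sql
-- The Collation column shows the collation of each character column,
-- e.g. users.username.
SHOW FULL COLUMNS FROM users;

-- The result includes collation_connection and Database Collation columns,
-- as recorded when the procedure was created.
SHOW CREATE PROCEDURE updateProductUsers;
```

If the column collation and the collation reported for the procedure differ, that mismatch is what the options above are meant to remove.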

Answered by Nate Vaughan

I spent half a day searching for answers to an identical "Illegal mix of collations" error with conflicts between utf8_unicode_ci and utf8_general_ci.

I found that some columns in my database were not explicitly collated utf8_unicode_ci. It seems MySQL implicitly collated these columns as utf8_general_ci.

Specifically, running a 'SHOW CREATE TABLE table1' query produced output like the following:

| table1 | CREATE TABLE `table1` (
`id` int(11) NOT NULL,
`col1` varchar(4) CHARACTER SET utf8 NOT NULL,
`col2` int(11) NOT NULL,
PRIMARY KEY (`col1`,`col2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |

Note that the line `col1` varchar(4) CHARACTER SET utf8 NOT NULL does not have a collation specified. I then ran the following query:

ALTER TABLE table1 CHANGE col1 col1 VARCHAR(4) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;

This solved my "Illegal mix of collations" error. I hope this helps someone else out there.
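
Rather than running SHOW CREATE TABLE on each table, the same information can be read from information_schema in one pass. This is only a sketch: 'your_db' is a placeholder schema name, and utf8_unicode_ci is assumed to be the collation you expect everywhere.

```sql
-- List every character column whose collation differs from the expected one.
-- 'your_db' is a placeholder; replace it with your schema name.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'your_db'
  AND COLLATION_NAME IS NOT NULL            -- skip non-character columns
  AND COLLATION_NAME <> 'utf8_unicode_ci'   -- the collation you expect
ORDER BY TABLE_NAME, ORDINAL_POSITION;
```

Each row returned is a candidate for an ALTER TABLE ... CHANGE statement like the one above.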

Answered by bpile

I had a similar problem, but it occurred inside a procedure, when my query parameter was set using a variable, e.g. SET @value='foo'.

The cause was a mismatch between collation_connection and the database collation. I changed collation_connection to match collation_database and the problem went away. I think this is a more elegant approach than adding COLLATE after each param/value.

To sum up: all collations must match. Use SHOW VARIABLES and make sure collation_connection and collation_database match (also check the table collation using SHOW TABLE STATUS LIKE 'table_name').
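
As a rough sketch of that check, the statements below inspect the collation variables and align the connection for the current session. The utf8 / utf8_unicode_ci pair and the users table are assumptions taken from the question's schema, not from this answer.

```sql
-- List the collation-related variables for this session.
SHOW VARIABLES LIKE 'collation%';

-- Align the connection with the database, for the current session only.
-- SET NAMES also sets the client and result character sets.
SET NAMES utf8 COLLATE utf8_unicode_ci;

-- The table's default collation appears in the Collation column.
SHOW TABLE STATUS LIKE 'users';
```

To make the change permanent, the connection collation would have to be set by the client or in the server configuration instead.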

Answered by Sebas

Somewhat similar to @bpile's answer, my case was a my.cnf entry setting collation-server = utf8_general_ci. After I realized that (and after trying everything above), I forcibly switched my database to utf8_general_ci instead of utf8_unicode_ci, and that was it:

ALTER DATABASE `db` CHARACTER SET utf8 COLLATE utf8_general_ci;
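
Note that ALTER DATABASE changes only the default used for tables created afterwards; existing tables and columns keep the collation they already have. A minimal sketch for checking where things stand, reusing the `db` name from the statement above:

```sql
-- Server, current-database, and connection collations side by side.
SELECT @@collation_server, @@collation_database, @@collation_connection;

-- Default collation recorded for the schema.
SELECT SCHEMA_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'db';
```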

Answered by Teejaygenius

In my own case I had the following error:

Illegal mix of collations (utf8_general_ci,IMPLICIT) and (utf8_unicode_ci,IMPLICIT) for operation '='

$this->db->select("users.username as matric_no, CONCAT(users.surname, ' ', users.first_name, ' ', users.last_name) as fullname")
    ->join('users', 'users.username=classroom_students.matric_no', 'left')
    ->where('classroom_students.session_id', $session)
    ->where('classroom_students.level_id', $level)
    ->where('classroom_students.dept_id', $dept);

After weeks of Google searching I noticed that the two fields I was comparing had different collations. The first one, username, was utf8_general_ci while the second one was utf8_unicode_ci, so I went back to the structure of the second table and changed the second field (matric_no) to utf8_general_ci, and it worked like a charm.

Answered by Raffaele

Despite finding an enormous number of questions about the same problem (1, 2, 3, 4), I have never found an answer that took performance into consideration, even here.

Although multiple working solutions have already been given, I would like to add a performance consideration.

EDIT: Thanks to Manatax for pointing out that option 1 does not suffer from performance issues.

Using Option 1 and 2, aka the COLLATE cast approach, can lead to a potential bottleneck, because any index defined on the column will not be used, causing a full scan.

Even though I did not try out Option 3, my hunch is that it will suffer the same consequences as Option 1 and 2.

Lastly, Option 4 is the best option for very large tables when it is viable, meaning that nothing else relies on the original collation.

Consider this simplified query:

SELECT 
    *
FROM
    schema1.table1 AS T1
        LEFT JOIN
    schema2.table2 AS T2 ON T2.CUI = T1.CUI
WHERE
    T1.cui IN ('C0271662' , 'C2919021')
;

In my original example, I had many more joins. Of course, table1 and table2 have different collations. Using the COLLATE operator to cast will lead to indexes not being used.

See the SQL explanation in the picture below.

Visual Query Explanation when using the COLLATE cast

On the other hand, Option 4 can take advantage of any available index and lead to fast queries.

In the picture below, you can see the same query being run after applying Option 4, aka altering the schema/table/column collation.

Visual Query Explanation after the collation has been changed, and therefore without the collate cast

In conclusion, if performance is important and you can alter the collation of the table, go for Option 4. If you have to act on a single column, you can use something like this:

ALTER TABLE schema1.table1 MODIFY `field` VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
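
To check the effect on your own data, EXPLAIN can be run on the simplified query with the cast in place. The sketch below assumes T1.CUI is utf8_general_ci and T2.CUI is utf8_unicode_ci, which the answer does not state; adjust the COLLATE clause to your actual collations.

```sql
-- Plan with the COLLATE cast on the joined column.
-- Look at the `key` and `rows` columns of the output for table T2.
EXPLAIN
SELECT *
FROM schema1.table1 AS T1
LEFT JOIN schema2.table2 AS T2
    ON T2.CUI COLLATE utf8_general_ci = T1.CUI
WHERE T1.CUI IN ('C0271662', 'C2919021');
```

Running the same EXPLAIN without the COLLATE clause, once the column collations have been aligned, gives the plan to compare against.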

Answered by raam86

This happens when a column is explicitly set to a different collation, or when the default collation of the queried table is different.

If you have many tables whose collation you want to change, run this query:

SELECT CONCAT('ALTER TABLE ', t.table_name, ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;')
FROM (SELECT table_name FROM information_schema.tables
      WHERE table_schema = 'SCHEMA') t; -- replace SCHEMA with your schema name

This will output the statements needed to convert all the tables, and each of their columns, to the correct collation.
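
A slightly more defensive variant of the same idea is sketched below. It skips views and quotes the identifiers; 'your_db' is a placeholder schema name, and the target character set and collation are assumptions you should adjust.

```sql
-- Generate one ALTER TABLE statement per base table in the schema.
-- 'your_db' is a placeholder; replace it with your schema name.
SELECT CONCAT('ALTER TABLE `', TABLE_SCHEMA, '`.`', TABLE_NAME,
              '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;') AS stmt
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_db'
  AND TABLE_TYPE = 'BASE TABLE';   -- skip views
```

The query only prints the statements; review them and take a backup before running them, since CONVERT TO rewrites the table data.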