MySQL utf8mb4, Errors when saving Emojis
Original question: http://stackoverflow.com/questions/35125933/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Loki
I am trying to save user names that come from a service into my MySQL database. Those names can contain emojis.
After searching a little, I found this Stack Overflow question linking to this tutorial. I followed the steps and it looks like everything is configured properly.
I have a Database (charset and collation set to utf8mb4 (_unicode_ci)), a Table called TestTable, also configured this way, as well as a "Text" column, configured this way (VARCHAR(191) utf8mb4_unicode_ci).
When I try to save emojis I get an error:
Example of the error for shortcake (🍰):
Warning: #1300 Invalid utf8 character string: 'F09F8D'
Warning: #1366 Incorrect string value: '\xF0\x9F\x8D\xB0' for column 'Text' at row 1
The only emoji that I was able to save properly was the sun (☀), though to be honest I didn't try all of them.
Is there something I'm missing in the configuration?
Please note: none of the save tests involved a client side. I use phpMyAdmin to change the values manually and save the data. Configuring the client side properly is something I will take care of after the server saves emojis correctly.
Another side note: currently, when saving emojis I either get an error like the one above, or I get no error and a user name followed by an emoji is stored as Username ????. Which of the two happens depends on how I save. When creating or saving via an SQL statement, or when editing inline, the value is saved with question marks; when editing with the edit button, I get the error.
Thank you.
EDIT 1: Alright, I think I found the problem, but not the solution. It looks like the database-specific variables didn't change properly.
When I'm logged in as root on my server and read out the (global) variables:
Query used: SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+
10 rows in set (0.00 sec)
For my Database (in phpmyadmin, the same query) it looks like the following:
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8               |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8               |
| character_set_server     | utf8               |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+
How can I adjust these settings for a specific database? Also, even though the first set of values shown above is my default, a newly created database gets the second set.
EDIT 2:
Here is my my.cnf file:
[client]
port=3306
socket=/var/run/mysqld/mysqld.sock
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld_safe]
socket=/var/run/mysqld/mysqld.sock
[mysqld]
user=mysql
pid-file=/var/run/mysqld/mysqld.pid
socket=/var/run/mysqld/mysqld.sock
port=3306
basedir=/usr
datadir=/var/lib/mysql
tmpdir=/tmp
lc-messages-dir=/usr/share/mysql
log_error=/var/log/mysql/error.log
max_connections=200
max_user_connections=30
wait_timeout=30
interactive_timeout=50
long_query_time=5
innodb_file_per_table
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
!includedir /etc/mysql/conf.d/
Answer by Rick James
character_set_client, character_set_connection, and character_set_results must all be utf8mb4 for that shortcake to be edible.
Something, somewhere, is setting a subset of those individually. Rummage through my.cnf and phpmyadmin's settings -- something is not setting all three.
If SET NAMES utf8mb4 is executed, all three are set correctly.
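One way to spot which of the three is off is to feed the output of the SHOW VARIABLES query from the question into a small check. The following Python sketch is only an illustration: the function name non_utf8mb4_settings and the hard-coded sample rows (copied from the phpMyAdmin output in the question) are assumptions made here, not part of MySQL or any driver.

```python
# The three session variables that SET NAMES utf8mb4 sets together.
REQUIRED = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)


def non_utf8mb4_settings(rows):
    """Return {variable: value} for each required variable that is not utf8mb4.

    `rows` is an iterable of (Variable_name, Value) pairs, for example the
    rows returned by the SHOW VARIABLES query shown in the question.
    """
    values = dict(rows)
    return {
        name: values.get(name)
        for name in REQUIRED
        if values.get(name) != "utf8mb4"
    }


# Sample rows copied from the phpMyAdmin session in the question.
phpmyadmin_rows = [
    ("character_set_client", "utf8"),
    ("character_set_connection", "utf8mb4"),
    ("character_set_results", "utf8"),
]

if __name__ == "__main__":
    # Lists the variables that still need to become utf8mb4.
    print(non_utf8mb4_settings(phpmyadmin_rows))
```

An empty result means all three variables already agree, which is the state SET NAMES utf8mb4 produces.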
The sun shone because it is only 3 bytes (E2 98 80); utf8 is sufficient for 3-byte UTF-8 encodings of Unicode characters.
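The byte counts can be checked outside MySQL. This short Python sketch encodes the two emojis from the question and tests whether each character fits in 3 bytes, the limit of MySQL's utf8 character set. The helper names utf8_hex and fits_in_mysql_utf8 are made up for this illustration.

```python
SUN = "\u2600"            # the sun from the question
SHORTCAKE = "\U0001F370"  # the shortcake from the question


def utf8_hex(text):
    """Return the UTF-8 encoding of `text` as space-separated hex bytes."""
    return " ".join(f"{byte:02X}" for byte in text.encode("utf-8"))


def fits_in_mysql_utf8(text):
    """True if every character needs at most 3 bytes in UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


if __name__ == "__main__":
    for name, char in (("sun", SUN), ("shortcake", SHORTCAKE)):
        print(name, utf8_hex(char), fits_in_mysql_utf8(char))
```

The shortcake encodes to the same four bytes that appear in the #1366 warning in the question, which is why it needs utf8mb4 on the whole path from client to column.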
Answer by Pierce
It is likely that your service/application is connecting with "utf8" instead of "utf8mb4" for the client character set. That's up to the client application.
For a PHP application see http://php.net/manual/en/function.mysql-set-charset.php or http://php.net/manual/en/mysqli.set-charset.php
For a Python application see https://github.com/PyMySQL/PyMySQL#example or http://docs.sqlalchemy.org/en/latest/dialects/mysql.html#mysql-unicode
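As a rough sketch of what that looks like with PyMySQL, the client character set can be requested through the charset argument of pymysql.connect(). The wrapper functions below (utf8mb4_connect_kwargs and connect_utf8mb4) and all credentials passed to them are hypothetical and exist only for this example; PyMySQL is a third-party package and is imported lazily so the argument builder can be tried without it.

```python
def utf8mb4_connect_kwargs(host, user, password, database):
    """Build keyword arguments for pymysql.connect() that request utf8mb4."""
    return {
        "host": host,
        "user": user,
        "password": password,
        "database": database,
        # Ask the driver to use utf8mb4 as the connection character set.
        "charset": "utf8mb4",
    }


def connect_utf8mb4(host, user, password, database):
    """Open a PyMySQL connection that uses utf8mb4 (needs a reachable server)."""
    import pymysql  # third-party: pip install PyMySQL

    return pymysql.connect(
        **utf8mb4_connect_kwargs(host, user, password, database)
    )
```

After connecting, running the SHOW VARIABLES query from the question on that connection is a quick way to confirm that the client, connection, and results character sets all report utf8mb4.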
Also, check that your columns really are utf8mb4. One direct way is like this:
mysql> SELECT character_set_name FROM information_schema.`COLUMNS` WHERE table_name = "user" AND column_name = "displayname";
+--------------------+
| character_set_name |
+--------------------+
| utf8mb4            |
+--------------------+
1 row in set (0.00 sec)
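The same check can be run from application code. The sketch below assumes a DB-API cursor that uses %s placeholders and returns rows as tuples (as PyMySQL does by default); the function name column_charset and the extra table_schema filter are additions made for this example and are not part of the original query.

```python
COLUMN_CHARSET_SQL = (
    "SELECT character_set_name FROM information_schema.COLUMNS "
    "WHERE table_schema = %s AND table_name = %s AND column_name = %s"
)


def column_charset(cursor, schema, table, column):
    """Return the character set of one column, or None if it is not found.

    `cursor` is any DB-API cursor with %s placeholders and tuple rows.
    """
    cursor.execute(COLUMN_CHARSET_SQL, (schema, table, column))
    row = cursor.fetchone()
    return row[0] if row else None
```

Filtering on table_schema as well avoids picking up a column with the same table and column name from another database on the same server.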
Answer by user3624198
For me, it turned out that the problem lay in the mysql client.
The mysql client overrode the character set configured in the server's my.cnf, which resulted in an unintended character set.
So all I needed to do was add character-set-client-handshake = FALSE. It stops the client's settings from interfering with my character set configuration.
my.cnf would look like this:
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
...
Hope it helps.
Answer by Saurabh Mistry
ALTER TABLE table_name CHANGE column_name column_name VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL DEFAULT NULL;
Example query:
ALTER TABLE `reactions` CHANGE `emoji` `emoji` VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL DEFAULT NULL;
After that, I was able to store emoji in the table successfully.
Answer by druid62
Consider adding
init_connect = 'SET NAMES utf8mb4'
to the my.cnf of each of your database servers.
(Still, clients can, and so will, override it.)
Answer by Nicolas Giszpenc
I'm not proud of this answer, because it uses brute force to clean the input. It's brutal, but it works.
function cleanWord($string, $debug = false) {
    // Walks the input byte by byte and replaces every non-ASCII byte with an
    // HTML entity. The hard-coded byte values assume single-byte
    // (Windows-1252 style) input.
    $new_string = "";
    for ($i = 0; $i < strlen($string); $i++) {
        $letter = substr($string, $i, 1);
        $code = ord($letter);
        if ($debug) {
            echo "Letter: " . $letter . "<BR>";
            echo "Code: " . $code . "<BR><BR>";
        }
        if ($code == 146) {
            $letter = "&acute;";
        } elseif ($code == 233) {
            $letter = "&eacute;";
        } elseif ($code == 147 || $code == 148) {
            $letter = "&quot;";
        } elseif ($code == 151) {
            $letter = "&ndash;";
        } elseif ($code > 127) {
            // any other non-ASCII byte becomes a numeric entity
            $letter = "&#" . $code . ";";
        }
        $new_string .= $letter;
    }
    if ($new_string != "") {
        $string = $new_string;
    }
    // optional: turn line breaks into <BR>
    $string = str_replace("\r\n", "<BR>", $string);
    return $string;
}
// clean up the input
$message = cleanWord($message);
// now you can insert it as part of an SQL statement
// (a prepared statement would be safer than addslashes())
$sql = "INSERT INTO tbl_message (`message`)
VALUES ('" . addslashes($message) . "')";