PHP: Inserting UTF-8 encoded string into UTF-8 encoded MySQL table fails with "Incorrect string value"

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/11936950/

Tags: php, mysql, drupal

Asked by Letharion

Inserting UTF-8 encoded string into UTF-8 encoded table gives incorrect string value.

PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9D\x84\x8E i...' for column 'body_value' at row 1: INSERT INTO

I have a character in a string that mb_detect_encoding claims is UTF-8 encoded. I try to insert this string into a MySQL table, which is defined with (among other things) DEFAULT CHARSET=utf8.

Edit: Drupal always does SET NAMES utf8 with an optional COLLATE (at least when talking to MySQL).

Edit 2: Some more details that appear to be relevant. I grab some text from a PostgreSQL database. I stick it onto an object, use mb_detect_encoding to verify that it's UTF-8, and persist the object to the database using node_save. So while there is an HTTP request that triggers the import, the data does not come from the browser.

Edit 3: Data is denormalized over two tables:

SELECT character_set_name FROM information_schema.COLUMNS C WHERE table_schema = "[database]" AND table_name IN ("field_data_body", "field_revision_body") AND column_name = "body_value";

+--------------------+
| character_set_name |
+--------------------+
| utf8               |
| utf8               |
+--------------------+

Edit 4: Is it possible that the character is "too new"? I'm more than a little fuzzy on the relationship between Unicode and UTF-8, but this Wikipedia article implies that the character was standardized very recently.

I don't understand how that can fail with "Incorrect string value".

Answered by prosfilaes

U+1D10E is a Unicode character outside the BMP (Basic Multilingual Plane), that is, above U+FFFF, so it can't be represented in UTF-8 in 3 bytes; it needs 4. The MySQL charset utf8 only accepts UTF-8 characters that can be represented in 3 bytes. If you need to store this character in MySQL, you'll need to use the MySQL charset utf8mb4, which requires MySQL 5.5.3 or later. You can use ALTER TABLE to change the character set without much problem; since more space is needed to store the characters, a couple of issues show up that may require you to reduce string sizes. See http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html.
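
As a rough illustration of the point above, the following PHP sketch takes the four bytes quoted in the error message and checks whether a string contains any code point above U+FFFF. The helper name has_supplementary_chars is hypothetical (it is not part of PHP, Drupal, or MySQL); it only shows one way such input could be detected before an insert into a utf8 column.

```php
<?php
// The bytes from the error message: U+1D10E encoded as UTF-8.
$char = "\xF0\x9D\x84\x8E";

// Hypothetical helper (not a built-in): returns true if the string contains
// any code point above U+FFFF, i.e. one that needs 4 bytes in UTF-8 and
// that a MySQL utf8 column will not accept.
function has_supplementary_chars(string $s): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $s) === 1;
}

var_dump(strlen($char));                                   // int(4): four bytes
var_dump(has_supplementary_chars($char));                  // bool(true)
var_dump(has_supplementary_chars('plain BMP text: é 中')); // bool(false)
```

A check like this can help find the offending rows during an import, but it does not replace switching the column and the connection to utf8mb4.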

Answered by ytdm

To solve this issue, first change your database field to the utf8mb4 charset. For example:

ALTER TABLE `tb_name` CHANGE `field_name` `field_name` VARCHAR(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL; 

Then set the charset of your database connection to utf8mb4 as well, through the DSN or driver_options. For example, if you use PDO:

$db = new PDO('mysql:host=localhost;dbname=testdb;charset=utf8mb4', 'username', 'password');

Or in Zend Framework 1.2:

$dbParam = array(
    'host'     => 'localhost',
    'username' => 'db_user_name',
    'password' => 'password',
    'dbname'   => 'db_name',
    'driver_options' => array(
        '1002' => "SET NAMES 'utf8mb4'", // 1002 is the value of PDO::MYSQL_ATTR_INIT_COMMAND
        '12'   => 0                      // 12 is the value of PDO::ATTR_PERSISTENT
    )
);

Answered by wesside

In your PDO connection, set the charset.

new PDO('mysql:host=localhost;dbname=the_db;charset=utf8mb4', $user, $password);
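
To make the connection step a little more concrete, here is a minimal sketch that builds the DSN in one place. The helper build_mysql_dsn and the variable names are assumptions for illustration, not part of PDO. The actual connection and check are left as comments because they need a reachable MySQL server.

```php
<?php
// Hypothetical helper (not part of PDO): builds a MySQL DSN that requests
// utf8mb4 as the connection character set.
function build_mysql_dsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = build_mysql_dsn('localhost', 'the_db');
echo $dsn, PHP_EOL; // mysql:host=localhost;dbname=the_db;charset=utf8mb4

// Connecting requires a running server, so it is only outlined here:
// $pdo = new PDO($dsn, $user, $password, [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// $row = $pdo->query("SHOW VARIABLES LIKE 'character_set_connection'")->fetch(PDO::FETCH_ASSOC);
// On a correctly configured setup, $row['Value'] would be expected to read utf8mb4.
```

Setting the charset on the connection matters as much as the column charset: if the connection still uses utf8, 4-byte characters can be rejected or mangled before they reach a utf8mb4 column.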

Answered by huy

I fixed the error: SQLSTATE[HY000]: General error: 1366 Incorrect string value ...... with this method:

1. Use utf8mb4_unicode_ci for the database.
2. Set utf8mb4_unicode_ci for all tables.
3. Set the longblob datatype for the affected columns (not text, longtext, ...; you need a big datatype to store the 4-byte content).

It works now. If you use Laravel, also edit config/database.php:

'charset' => 'utf8mb4',
'collation' => 'utf8mb4_unicode_ci',

If you use the strtolower function, replace it with mb_strtolower. Note: you also have to put <meta charset="utf-8"> in your <head> tag.
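
A small sketch of the mb_strtolower advice follows. It assumes the mbstring extension is loaded, and the sample string is made up for illustration. strtolower works on single bytes, and depending on PHP version and locale it either leaves non-ASCII characters untouched or may alter individual bytes, whereas mb_strtolower lowercases whole characters in the given encoding.

```php
<?php
// Sample text with non-ASCII uppercase letters (illustrative only).
$title = 'ÉCOLE Überraschung';

// mb_strtolower is encoding-aware: it lowercases whole UTF-8 characters.
$lower = mb_strtolower($title, 'UTF-8');

echo $lower, PHP_EOL;                          // école überraschung
var_dump(mb_check_encoding($lower, 'UTF-8'));  // bool(true): still valid UTF-8
```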
