Java: How to handle SQL state [HY000]; error code [1366]; Incorrect string value?

Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/11057463/

Time: 2020-10-31 03:37:22  Source: igfitidea

Tags: java, mysql, utf-8

Asked by Bozho

I'm aware this error means a MySQL column doesn't accept the value, but this is strange, since the value fits in a Java UTF-8 encoded string and the MySQL column is utf8_general_ci. Also, all UTF-8 characters have worked properly so far, apart from these.

The use case is: I am importing tweets. The tweet in question is https://twitter.com/bakervin/status/210054214951518212 where you can see the two "strange" characters (and two strange whitespaces between them). The question is how to handle this:

  • trim these characters (how? which characters are they, and how does Java's UTF-8 differ from MySQL's?)
  • make the column capable of accepting this value (how? is there anything more "utf-y" than utf8_general_ci?)
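
One way to answer "which are they" is to list the code points in the text that need four bytes in UTF-8, since MySQL's utf8 character set stores at most three bytes per character. The following is only a minimal sketch: the class name `FourByteFinder`, the method name `findFourByteCodePoints`, and the sample string are hypothetical and not part of the original question.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class FourByteFinder {
    // Returns the code points in text that lie above U+FFFF (and therefore
    // take four bytes in UTF-8), formatted as "U+XXXXX".
    static List<String> findFourByteCodePoints(String text) {
        List<String> found = new ArrayList<>();
        text.codePoints()
            .filter(Character::isSupplementaryCodePoint)
            .forEach(cp -> found.add(String.format("U+%X", cp)));
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical sample: one supplementary code point, stored by Java
        // as a high/low surrogate pair.
        String sample = "hi \uD83D\uDE00 there";
        System.out.println(findFourByteCodePoints(sample));
        // UTF-16 code units vs. actual code points vs. UTF-8 bytes.
        System.out.println(sample.length());
        System.out.println(sample.codePointCount(0, sample.length()));
        System.out.println(sample.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

If the list comes back non-empty for a rejected value, the text contains characters outside the Basic Multilingual Plane, which is consistent with the "Incorrect string value" error on a utf8 column.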

Answered by Bozho

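These appear to be Unicode surrogate characters: the pairs of UTF-16 code units that Java uses for code points above U+FFFF. MySQL's utf8 character set stores at most three bytes per character, so it cannot hold such code points. If losing them is acceptable, it is safe to trim them: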

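// Returns text with every UTF-16 surrogate code unit removed.
static String stripSurrogates(String text) {
    StringBuilder sb = new StringBuilder(text.length());
    for (int i = 0; i < text.length(); i++) {
        char ch = text.charAt(i);
        // Surrogates encode code points above U+FFFF; skip both halves.
        if (!Character.isHighSurrogate(ch) && !Character.isLowSurrogate(ch)) {
            sb.append(ch);
        }
    }
    return sb.toString();
}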
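
The same trimming can also be expressed in terms of code points rather than individual `char` values, which may make the intent (keep only what fits in a three-byte utf8 column) easier to read. This is a sketch under assumptions, not the answer's own code: the class name `SupplementaryStripper`, the method name `stripNonBmp`, and the sample text are hypothetical.

```java
class SupplementaryStripper {
    // Keeps only Basic Multilingual Plane code points and drops the rest:
    // supplementary code points (surrogate pairs) and any unpaired surrogate.
    static String stripNonBmp(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints()
            .filter(cp -> Character.isBmpCodePoint(cp)
                    && !(cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE))
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample text containing one supplementary code point.
        String tweet = "caf\u00E9 \uD83D\uDE00 ok";
        System.out.println(stripNonBmp(tweet));
    }
}
```

For any input this should produce the same result as the char-based loop above, because removing every surrogate code unit and removing every supplementary code point plus stray surrogates leave the same characters behind.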