Original question: http://stackoverflow.com/questions/8647080/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Accent insensitive search query in MySQL
Asked by Okan Kocyigit
Is there any way to make a search query accent insensitive?
The column's and table's collation is utf8_polish_ci, and I don't want to change it.
Example word: toruń
select * from pages where title like '%torun%'
This query doesn't find "toruń". How can I make it match?
Answered by goat
You can change the collation at runtime in the SQL query,
...where title like '%torun%' collate utf8_general_ci
but beware that changing the collation on the fly forgoes the possibility of MySQL using an index, so performance on large tables may be terrible.
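To see what the collation override does, a minimal sketch along these lines can be run in a MySQL client. It assumes the client sends UTF-8 bytes and that the server still accepts the utf8_polish_ci and utf8_general_ci collation names. The `_utf8` introducer is used so the literals do not depend on the connection character set, and the result column names are made up for illustration.

```sql
-- Compare the accented and unaccented spellings under each collation.
-- utf8_polish_ci treats ń as a letter distinct from n;
-- utf8_general_ci compares accented letters as equal to their base letters.
SELECT
  _utf8'toruń' = _utf8'torun' COLLATE utf8_polish_ci  AS polish_match,
  _utf8'toruń' = _utf8'torun' COLLATE utf8_general_ci AS general_match;

-- The same override applied to the query from the question.
SELECT *
FROM pages
WHERE title LIKE _utf8'%torun%' COLLATE utf8_general_ci;
```

The first statement should report a mismatch under the Polish collation and a match under the general one, which is why the second statement can find "toruń".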
Or, you can copy the column to another column, such as searchable_title, but change the collation on it. It's actually common to do this kind of thing, where you copy data but keep it in a slightly different form that's optimized for a specific workload or purpose. You can use triggers as a nice way to keep the duplicated columns in sync. This method has the potential to perform well, if indexed.
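The duplicated-column approach could look roughly like the sketch below. It is only an illustration: the VARCHAR(255) type, the trigger names, and the index name are assumptions, since the question does not show the table definition.

```sql
-- Assumption: title is VARCHAR(255) in the utf8 character set.
ALTER TABLE pages
  ADD COLUMN searchable_title VARCHAR(255)
    CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Backfill existing rows.
UPDATE pages SET searchable_title = title;

-- Keep the copy in sync on insert and update (hypothetical trigger names).
CREATE TRIGGER pages_searchable_title_bi
BEFORE INSERT ON pages
FOR EACH ROW
  SET NEW.searchable_title = NEW.title;

CREATE TRIGGER pages_searchable_title_bu
BEFORE UPDATE ON pages
FOR EACH ROW
  SET NEW.searchable_title = NEW.title;

-- Index the copy so lookups on it can avoid a full table scan.
CREATE INDEX idx_pages_searchable_title ON pages (searchable_title);

-- Prefix search against the accent-insensitive copy.
SELECT * FROM pages WHERE searchable_title LIKE 'torun%';
```

Note that a B-tree index helps equality and prefix patterns such as 'torun%'. A pattern with a leading wildcard, like '%torun%', still has to scan every row, whichever column it runs against.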
Note: Make sure that your database really contains those characters and not HTML entities.
Also, the character set of your connection matters. The above assumes it's set to utf8, for example via set names, like set names utf8
If not, you need an introducer for the literal value
...where title like _utf8'%torun%' collate utf8_general_ci
and of course, the value in the single quotes must actually be utf8 encoded, even if the rest of the SQL query isn't.
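The checks mentioned in the note can be sketched as follows. This is a rough illustration rather than a complete diagnostic. In particular, the entity check is a hypothetical heuristic that only looks for an ampersand followed later by a semicolon.

```sql
-- Rough check for HTML entities (e.g. &#324; instead of ń) stored in titles.
SELECT COUNT(*) AS suspicious_titles
FROM pages
WHERE title LIKE '%&%;%';

-- Inspect the character sets the current connection is using.
SHOW VARIABLES LIKE 'character_set_%';

-- Declare that the client sends and expects utf8.
SET NAMES utf8;

-- With the connection set to utf8, no introducer is needed on the literal.
SELECT * FROM pages WHERE title LIKE '%torun%' COLLATE utf8_general_ci;
```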
Answered by Kieran
This won't work in extreme circumstances, but try changing the column collation to utf8_unicode_ci. Then accented characters will be equal to their non-accented counterparts.
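A sketch of that column change is below. The VARCHAR(255) NOT NULL definition is an assumption. MODIFY replaces the whole column definition, so the real type, nullability, and default would have to be restated.

```sql
-- Assumption: title is VARCHAR(255) NOT NULL; restate the real definition here.
ALTER TABLE pages
  MODIFY title VARCHAR(255)
    CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;

-- After the change, the original query no longer needs a COLLATE clause.
SELECT * FROM pages WHERE title LIKE '%torun%';
```

Keep in mind that this also changes how the column sorts: ORDER BY title would no longer follow the Polish-specific ordering of utf8_polish_ci.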
Answered by Remy
You could try SOUNDEX:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_soundex
This compares two strings by how they sound, but it will obviously return many more results.
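A minimal sketch of the SOUNDEX idea follows. Treat it with caution: the MySQL manual states that SOUNDEX is designed for English and is not guaranteed to give consistent results with multibyte character sets such as utf-8, so it is worth inspecting the generated codes before relying on it. It also compares whole strings, so it does not behave like a '%...%' substring search.

```sql
-- Inspect the codes first to see whether the two spellings agree.
SELECT SOUNDEX('toruń') AS accented_code, SOUNDEX('torun') AS plain_code;

-- Match whole titles whose code equals that of the search term.
SELECT * FROM pages WHERE SOUNDEX(title) = SOUNDEX('torun');

-- SOUNDS LIKE is shorthand for the same comparison.
SELECT * FROM pages WHERE title SOUNDS LIKE 'torun';
```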