How can I make SQL case sensitive string comparison on MySQL?
Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/5629111/
Asked by StevenB
I have a function that returns five characters with mixed case. If I do a query on this string it will return the value regardless of case.
How can I make MySQL string queries case sensitive?
Accepted answer by drudge
http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
The default character set and collation are latin1 and latin1_swedish_ci, so nonbinary string comparisons are case insensitive by default. This means that if you search with col_name LIKE 'a%', you get all column values that start with A or a. To make this search case sensitive, make sure that one of the operands has a case sensitive or binary collation. For example, if you are comparing a column and a string that both have the latin1 character set, you can use the COLLATE operator to cause either operand to have the latin1_general_cs or latin1_bin collation:
col_name COLLATE latin1_general_cs LIKE 'a%'
col_name LIKE 'a%' COLLATE latin1_general_cs
col_name COLLATE latin1_bin LIKE 'a%'
col_name LIKE 'a%' COLLATE latin1_bin
If you want a column always to be treated in case-sensitive fashion, declare it with a case sensitive or binary collation.
Answer by Craig White
The good news is that if you need to make a case-sensitive query, it is very easy to do:
SELECT * FROM `table` WHERE BINARY `column` = 'value'
Answer by Nitesh
The answer posted by Craig White has a big performance penalty:
SELECT * FROM `table` WHERE BINARY `column` = 'value'
because it doesn't use indexes. So either you need to change the table collation, as mentioned here: https://dev.mysql.com/doc/refman/5.7/en/case-sensitivity.html
OR
The easiest fix: apply BINARY to the value instead of the column.
SELECT * FROM `table` WHERE `column` = BINARY 'value'
E.g.:
mysql> EXPLAIN SELECT * FROM temp1 WHERE BINARY col1 = "ABC" AND col2 = "DEF" ;
+----+-------------+--------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | temp1 | ALL | NULL | NULL | NULL | NULL | 190543 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+--------+-------------+
VS
mysql> EXPLAIN SELECT * FROM temp1 WHERE col1 = BINARY "ABC" AND col2 = "DEF" ;
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
| 1 | SIMPLE | temp1 | range | col1_2e9e898e | col1_2e9e898e | 93 | NULL | 2 | Using index condition; Using where |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+------------------------------------+
1 row in set (0.00 sec)
Answer by insoftservice
Instead of using the = operator, you may want to use LIKE or LIKE BINARY:
-- this returns 1 (true)
select 'A' like 'a';
-- this returns 0 (false)
select 'A' like binary 'a';
select * from user where username like binary 'a';
The last query matches 'a' but not 'A'.
Answer by Eric
If you have large tables, you can make use of an index before applying BINARY by doing something like this:
SELECT
*
FROM
(SELECT * FROM `table` WHERE `column` = 'value') as firstresult
WHERE
BINARY `column` = 'value'
The subquery would result in a really small case-insensitive subset of which you then select the only case-sensitive match.
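The two-step idea (narrow the rows with a cheap case-insensitive lookup, then apply the exact comparison to the few survivors) can be illustrated outside the database. The following is only a conceptual sketch in Python: the `rows` list and the dictionary `index` are hypothetical stand-ins for a table and its case-insensitive index, and say nothing about how MySQL actually plans the query.

```python
from collections import defaultdict

# Hypothetical stand-in for a table column with a case-insensitive index.
rows = ["abc", "ABC", "Abc", "xyz"]

# "Index": maps the case-folded value to every row value that shares it.
index = defaultdict(list)
for value in rows:
    index[value.casefold()].append(value)


def find_exact(needle):
    # Step 1: cheap case-insensitive lookup through the index.
    candidates = index.get(needle.casefold(), [])
    # Step 2: exact, case-sensitive comparison on the small candidate set.
    return [value for value in candidates if value == needle]


matches = find_exact("ABC")
print(matches)
```

Only the three case variants of "abc" reach the second step; "xyz" is never compared at all, which is the point of filtering through the index first.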
Answer by Paul Wheeler
The most correct way to perform a case sensitive string comparison without changing the collation of the column being queried is to explicitly specify a character set and collation for the value that the column is being compared to.
select * from `table` where `column` = convert('value' using utf8mb4) collate utf8mb4_bin;
Why not use binary?
Using the binary operator is inadvisable because it compares the actual bytes of the encoded strings. If you compare the actual bytes of two strings encoded using different character sets, two strings that should be considered the same may not be equal. For example, if you have a column that uses the latin1 character set, and your server/session character set is utf8mb4, then when you compare the column with a string containing an accent such as 'café' it will not match rows containing that same string! This is because in latin1 é is encoded as the byte 0xE9 but in utf8 it is two bytes: 0xC3A9.
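The byte values quoted above can be checked without a database. This is a small Python sketch, not MySQL code: it encodes the same text with Python's `latin-1` and `utf-8` codecs (assumed here to correspond to MySQL's latin1 and utf8mb4 for this particular character) and compares the results.

```python
# The same text, written with an escape so the source encoding does not matter.
word = "caf\u00e9"  # 'café'

latin1_bytes = word.encode("latin-1")
utf8_bytes = word.encode("utf-8")

# Hex dumps of the two encodings.
print(latin1_bytes.hex())
print(utf8_bytes.hex())

# The text is identical, but a byte-for-byte comparison says otherwise.
same_text = latin1_bytes.decode("latin-1") == utf8_bytes.decode("utf-8")
same_bytes = latin1_bytes == utf8_bytes
print(same_text, same_bytes)
```

The trailing é is the single byte e9 in the first dump and the pair c3 a9 in the second, which is why comparing raw bytes across character sets fails even though the strings are the same text.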
Why use convert as well as collate?
Collations must match the character set. So if your server or session is set to use the latin1 character set you must use collate latin1_bin, but if your character set is utf8mb4 you must use collate utf8mb4_bin. Therefore the most robust solution is to always convert the value into the most flexible character set, and use the binary collation for that character set.
Why apply the convert and collate to the value and not the column?
When you apply any transforming function to a column before making a comparison it prevents the query engine from using an index if one exists for the column, which could dramatically slow down your query. Therefore it is always better to transform the value instead where possible. When a comparison is performed between two string values and one of them has an explicitly specified collation, the query engine will use the explicit collation, regardless of which value it is applied to.
Accent Sensitivity
It is important to note that MySQL is not only case insensitive for columns using a _ci collation (which is typically the default), but also accent insensitive. This means that 'é' = 'e'. Using a binary collation (or the binary operator) will make string comparisons accent sensitive as well as case sensitive.
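To make the difference concrete, here is a rough Python approximation of the two behaviours. It is not MySQL's collation algorithm: `loose_equal` and `exact_equal` are hypothetical helpers that merely imitate an accent- and case-insensitive comparison (by stripping combining marks and folding case) and a binary comparison (by comparing encoded bytes).

```python
import unicodedata


def loose_key(text):
    # Split accented letters into base letter + combining mark,
    # drop the combining marks, then fold case.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def loose_equal(a, b):
    # Imitates a case- and accent-insensitive comparison.
    return loose_key(a) == loose_key(b)


def exact_equal(a, b):
    # Imitates a binary comparison: the encoded bytes must be identical.
    return a.encode("utf-8") == b.encode("utf-8")


print(loose_equal("\u00e9", "e"), exact_equal("\u00e9", "e"))
print(loose_equal("A", "a"), exact_equal("A", "a"))
```

Both pairs compare equal under the loose rule and unequal under the exact rule, mirroring how a binary collation tightens case and accent handling at the same time.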
What is utf8mb4?
The utf8 character set in MySQL is an alias for utf8mb3, which has been deprecated in recent versions because it does not support 4 byte characters (needed for characters outside the Basic Multilingual Plane, such as emoji). If you wish to use the UTF-8 character encoding with MySQL then you should be using the utf8mb4 charset.
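The 3-byte limit is easy to see by measuring how many bytes UTF-8 needs for different characters. A short Python sketch follows; the sample characters and the `fits_utf8mb3` name are chosen here for illustration and do not come from the answer.

```python
# Sample characters, written as escapes so the source encoding does not matter.
samples = {
    "e": "e",                # ASCII letter
    "e-acute": "\u00e9",     # é
    "euro": "\u20ac",        # euro sign
    "cat": "\U0001F408",     # an emoji outside the Basic Multilingual Plane
}

# Number of bytes each character occupies when encoded as UTF-8.
byte_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}

# A character fits in a 3-byte-per-character charset only if it needs <= 3 bytes.
fits_utf8mb3 = {name: length <= 3 for name, length in byte_lengths.items()}

print(byte_lengths)
print(fits_utf8mb3)
```

Only the emoji needs four bytes, so it is the one character in this sample that a 3-byte charset cannot store.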
Answer by fritzthecat
The following is for MySQL versions 5.5 and higher.
Add to /etc/mysql/my.cnf:
[mysqld]
...
character-set-server=utf8
collation-server=utf8_bin
...
All other collations I tried seemed to be case-insensitive; only "utf8_bin" worked.
Do not forget to restart mysql after this:
sudo service mysql restart
According to http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html there is also a "latin1_bin".
The "utf8_general_cs" was not accepted by mysql startup. (I read "_cs" as "case-sensitive" - ???).
Answer by xiezefan
You can use BINARY to make the comparison case sensitive, like this:
select * from tb_app where BINARY android_package='com.Mtime';
Unfortunately, this SQL can't use the index, so queries that rely on that index will suffer a performance hit:
mysql> explain select * from tb_app where BINARY android_package='com.Mtime';
+----+-------------+--------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | tb_app | NULL | ALL | NULL | NULL | NULL | NULL | 1590351 | 100.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+---------+----------+-------------+
Fortunately, I have a trick to solve this problem:
mysql> explain select * from tb_app where android_package='com.Mtime' and BINARY android_package='com.Mtime';
+----+-------------+--------+------------+------+---------------------------+---------------------------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------------------+---------------------------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | tb_app | NULL | ref | idx_android_pkg | idx_android_pkg | 771 | const | 1 | 100.00 | Using index condition |
+----+-------------+--------+------------+------+---------------------------+---------------------------+---------+-------+------+----------+-----------------------+
Answer by Pappu Mehta
There is no need to change anything at the DB level; you only have to change the SQL query and it will work.
Example:
"SELECT * FROM <TABLE> where userId = '" + iv_userId + "' AND password = BINARY '" + iv_password + "'";
The BINARY keyword makes the comparison case sensitive.
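Because the example above builds the statement by concatenating user input, it may be worth sketching the same case-sensitive comparison with placeholders instead. The Python sketch below makes several assumptions that are not in the answer: the table name `users`, the `find_user` helper, and a DB-API style cursor using the `%s` placeholder style that common MySQL drivers for Python accept. `RecordingCursor` is a hypothetical stand-in so the snippet runs without a database.

```python
# Hypothetical table name; the original answer leaves it as <TABLE>.
CASE_SENSITIVE_LOGIN = (
    "SELECT * FROM users "
    "WHERE userId = %s AND password = BINARY %s"
)


def find_user(cursor, user_id, password):
    # Values travel separately from the SQL text, so the driver handles quoting.
    cursor.execute(CASE_SENSITIVE_LOGIN, (user_id, password))
    return cursor.fetchall()


class RecordingCursor:
    """Stand-in cursor that only records what it was asked to execute."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params=None):
        self.calls.append((sql, params))

    def fetchall(self):
        return []


cursor = RecordingCursor()
find_user(cursor, "alice", "S3cret")
print(cursor.calls[0])
```

With a real driver, the cursor would come from an open connection; the query text itself keeps BINARY on the value side, as in the answer.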
Answer by Victor Enrique
Excellent!
Here is code from a function that compares passwords:
SET pSignal =
    (SELECT DECODE(r.usignal, 'YOURSTRINGKEY')
     FROM rsw_uds r
     WHERE r.uname = in_usdname AND r.uvige = 1);
SET pSuccess = (SELECT in_usdsignal LIKE BINARY pSignal);
IF pSuccess = 1 THEN
    /* Your code if they match */
ELSE
    /* Your code if they don't match */
END IF;