SQL Server ignore case in a where expression
Original question: http://stackoverflow.com/questions/1224364/
Warning: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Raul Agrait
How do I construct a SQL query (MS SQL Server) where the "where" clause is case-insensitive?
SELECT * FROM myTable WHERE myField = 'sOmeVal'
I want the results to come back ignoring case.
Answered by Adam Robinson
In the default configuration of a SQL Server database, string comparisons are case-insensitive. If your database overrides this setting (through the use of an alternate collation), then you'll need to specify what sort of collation to use in your query.
SELECT * FROM myTable WHERE myField = 'sOmeVal' COLLATE SQL_Latin1_General_CP1_CI_AS
Note that the collation I provided is just an example (though it will more than likely function just fine for you). A more thorough outline of SQL Server collations can be found here.
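For anyone who wants to see the effect without touching a real table, here is a minimal sketch. The table variable @myTable and its rows are made up for illustration, and the column is deliberately declared with a case-sensitive collation so that the COLLATE override has something to do.

```sql
-- Hypothetical sample data; the column is explicitly case-sensitive.
DECLARE @myTable TABLE (myField VARCHAR(20) COLLATE SQL_Latin1_General_CP1_CS_AS);
INSERT INTO @myTable (myField) VALUES ('someval'), ('sOmeVal'), ('SOMEVAL');

-- No COLLATE: the column's case-sensitive collation is used,
-- so only the exact spelling 'sOmeVal' is returned.
SELECT myField FROM @myTable WHERE myField = 'sOmeVal';

-- Explicit COLLATE: the comparison becomes case-insensitive,
-- so all three rows are returned.
SELECT myField FROM @myTable
WHERE myField = 'sOmeVal' COLLATE SQL_Latin1_General_CP1_CI_AS;
```

The explicit COLLATE wins over the column's own collation for that one comparison only; nothing about the table changes.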
Answered by Andrejs Cainikovs
Usually, string comparisons are case-insensitive. If your database is configured with a case-sensitive collation, you need to force the use of a case-insensitive one:
SELECT balance FROM people WHERE email = '[email protected]'
COLLATE SQL_Latin1_General_CP1_CI_AS
Answered by Danny
I found another solution elsewhere; that is, to use
upper(@yourString)
but everyone here is saying that, in SQL Server, it doesn't matter because it's ignoring case anyway? I'm pretty sure our database is case-sensitive.
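If you are not sure how your instance and database are set up, a quick check along these lines may help. This is only a sketch: it reports the defaults, and comparing two string literals uses the default collation of the current database, so the result says nothing about columns that were created with their own collation.

```sql
SELECT
    SERVERPROPERTY('Collation')                AS instance_collation,
    DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS database_default_collation,
    -- Literal-to-literal comparison follows the database default collation.
    CASE WHEN 'a' = 'A' THEN 'case-insensitive'
         ELSE 'case-sensitive'
    END                                        AS literal_comparison;
```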
Answered by Solomon Rutzky
The top 2 answers (from Adam Robinson and Andrejs Cainikovs) are kinda, sorta correct, in that they do technically work, but their explanations are wrong and so could be misleading in many cases. For example, while the SQL_Latin1_General_CP1_CI_AS collation will work in many cases, it should not be assumed to be the appropriate case-insensitive collation. In fact, given that the O.P. is working in a database with a case-sensitive (or possibly binary) collation, we know that the O.P. isn't using the collation that is the default for so many installations (especially any installed on an OS using US English as the language): SQL_Latin1_General_CP1_CI_AS. Sure, the O.P. could be using SQL_Latin1_General_CP1_CS_AS, but when working with VARCHAR data, it is important to not change the code page as it could lead to data loss, and that is controlled by the locale / culture of the collation (i.e. Latin1_General vs French vs Hebrew etc). Please see the last point below.
The other four answers are wrong to varying degrees.
I will clarify all of the misunderstandings here so that readers can hopefully make the most appropriate / efficient choices.
- Do not use UPPER(). That is completely unnecessary extra work. Use a COLLATE clause. A string comparison needs to be done in either case, but using UPPER() also has to check, character by character, to see if there is an upper-case mapping, and then change it. And you need to do this on both sides. Adding COLLATE simply directs the processing to generate the sort keys using a different set of rules than it was going to by default. Using COLLATE is definitely more efficient (or "performant", if you like that word :) than using UPPER(), as proven in this test script (on PasteBin).
  There is also the issue noted by @Ceisc on @Danny's answer:
    In some languages case conversions do not round-trip. i.e. LOWER(x) != LOWER(UPPER(x)).
    The Turkish upper-case "İ" is the common example.
- No, collation is not a database-wide setting, at least not in this context. There is a database-level default collation, and it is used as the default for altered and newly created columns that do not specify the COLLATE clause (which is likely where this common misconception comes from), but it does not impact queries directly unless you are comparing string literals and variables to other string literals and variables, or you are referencing database-level meta-data.
- No, collation is not per query. Collations are per predicate (i.e. something operand something) or expression, not per query. And this is true for the entire query, not just the WHERE clause. This covers JOINs, GROUP BY, ORDER BY, PARTITION BY, etc.
- No, do not convert to VARBINARY (e.g. convert(varbinary, myField) = convert(varbinary, 'sOmeVal')) for the following reasons:
  - that is a binary comparison, which is not case-insensitive (which is what this question is asking for)
  - if you do want a binary comparison, use a binary collation. Use one that ends with _BIN2 if you are using SQL Server 2008 or newer, else you have no choice but to use one that ends with _BIN. If the data is NVARCHAR then it doesn't matter which locale you use as they are all the same in that case, hence Latin1_General_100_BIN2 always works. If the data is VARCHAR, you must use the same locale that the data is currently in (e.g. Latin1_General, French, Japanese_XJIS, etc) because the locale determines the code page that is used, and changing code pages can alter the data (i.e. data loss).
  - using a variable-length datatype without specifying the size will rely on the default size, and there are two different defaults depending on the context where the datatype is being used. It is either 1 or 30 for string types. When used with CONVERT() it will use the 30 default value. The danger is, if the string can be over 30 bytes, it will get silently truncated and you will likely get incorrect results from this predicate.
  - Even if you want a case-sensitive comparison, binary collations are not case-sensitive (another very common misconception).
- No, LIKE is not always case-sensitive. It uses the collation of the column being referenced, or the collation of the database if a variable is compared to a string literal, or the collation specified via the optional COLLATE clause.
- LCASE is not a SQL Server function. It appears to be either Oracle or MySQL. Or possibly Visual Basic?
- Since the context of the question is comparing a column to a string literal, neither the collation of the instance (often referred to as "server") nor the collation of the database have any direct impact here. Collations are stored per each column, and each column can have a different collation, and those collations don't need to be the same as the database's default collation or the instance's collation. Sure, the instance collation is the default for what a newly created database will use as its default collation if the COLLATE clause wasn't specified when creating the database. And likewise, the database's default collation is what an altered or newly created column will use if the COLLATE clause wasn't specified.
- You should use the case-insensitive collation that is otherwise the same as the collation of the column. Use the following query to find the column's collation (change the table's name and schema name):
  SELECT col.* FROM sys.columns col WHERE col.[object_id] = OBJECT_ID(N'dbo.TableName') AND col.[collation_name] IS NOT NULL;
  Then just change the _CS to be _CI. So, Latin1_General_100_CS_AS would become Latin1_General_100_CI_AS.
  If the column is using a binary collation (ending in _BIN or _BIN2), then find a similar collation using the following query:
  SELECT * FROM sys.fn_helpcollations() col WHERE col.[name] LIKE N'{CurrentCollationMinus"_BIN"}[_]CI[_]%';
  For example, assuming the column is using Japanese_XJIS_100_BIN2, do this:
  SELECT * FROM sys.fn_helpcollations() col WHERE col.[name] LIKE N'Japanese_XJIS_100[_]CI[_]%';
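The lookup-then-swap approach for a column's collation can be tried end to end with a throwaway temp table. This is a sketch only: #CollationDemo and its contents are hypothetical, and the Latin1_General_100 collations used here assume SQL Server 2008 or newer.

```sql
-- Hypothetical temp table with a case-sensitive column.
CREATE TABLE #CollationDemo (myField VARCHAR(20) COLLATE Latin1_General_100_CS_AS);
INSERT INTO #CollationDemo (myField) VALUES ('someval'), ('sOmeVal');

-- Step 1: find the column's collation (temp tables live in tempdb).
SELECT col.[name], col.[collation_name]
FROM tempdb.sys.columns col
WHERE col.[object_id] = OBJECT_ID(N'tempdb..#CollationDemo')
  AND col.[collation_name] IS NOT NULL;

-- Step 2: use the same collation with _CS swapped for _CI,
-- which keeps the locale (and therefore the code page) unchanged.
SELECT myField
FROM #CollationDemo
WHERE myField = 'SOMEVAL' COLLATE Latin1_General_100_CI_AS;

DROP TABLE #CollationDemo;
```

For a real table you would query sys.columns in your own database instead of tempdb.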
For more info on collations, encodings, etc, please visit: Collations Info
Answered by David Hermanns
No, only using LIKE will not work. LIKE searches for values matching exactly your given pattern. In this case LIKE would find only the text 'sOmeVal' and not 'someval'.
A practicable solution is using the LCASE() function. LCASE('sOmeVal') gets the lowercase string of your text: 'someval'. If you use this function for both sides of your comparison, it works:
SELECT * FROM myTable WHERE LCASE(myField) LIKE LCASE('sOmeVal')
The statement compares two lowercase strings, so that your 'sOmeVal' will match every other notation of 'someval' (e.g. 'Someval', 'sOMEVAl' etc.).
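As noted elsewhere on this page, LCASE() is not a T-SQL function; the SQL Server spelling is LOWER(). A rough sketch of both the lower-casing approach and a collation-based alternative follows. The table variable and its rows are hypothetical, and the multi-row VALUES syntax and the Latin1_General_100 collations assume SQL Server 2008 or newer.

```sql
-- Hypothetical sample data; the column is deliberately case-sensitive.
DECLARE @myTable TABLE (myField VARCHAR(20) COLLATE Latin1_General_100_CS_AS);
INSERT INTO @myTable (myField)
VALUES ('someval'), ('Someval'), ('sOMEVAl'), ('other');

-- LOWER() on both sides: returns the three spellings of 'someval'.
SELECT myField FROM @myTable
WHERE LOWER(myField) LIKE LOWER('sOmeVal');

-- LIKE follows the collation in effect, so a case-insensitive COLLATE
-- returns the same three rows without calling LOWER() at all.
SELECT myField FROM @myTable
WHERE myField LIKE 'sOmeVal' COLLATE Latin1_General_100_CI_AS;
```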
Answered by David Hermanns
You can force case sensitivity by casting to varbinary, like this:
SELECT * FROM myTable
WHERE convert(varbinary, myField) = convert(varbinary, 'sOmeVal')
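One thing to watch with this form: CONVERT(varbinary, ...) without a length uses a default of 30 bytes, so longer values are silently cut off before the comparison. A small sketch with made-up 31-character values that differ only in the last character:

```sql
-- Two hypothetical strings, identical for the first 30 characters.
DECLARE @a VARCHAR(50) = REPLICATE('x', 30) + 'A';
DECLARE @b VARCHAR(50) = REPLICATE('x', 30) + 'B';

SELECT
    -- Length after the default conversion: 30 bytes, not 31.
    DATALENGTH(CONVERT(VARBINARY, @a)) AS default_length_bytes,
    -- Default length: the differing 31st byte is truncated, so this reports 'equal'.
    CASE WHEN CONVERT(VARBINARY, @a) = CONVERT(VARBINARY, @b)
         THEN 'equal' ELSE 'different' END AS default_length_result,
    -- Explicit length: the full values are compared, so this reports 'different'.
    CASE WHEN CONVERT(VARBINARY(50), @a) = CONVERT(VARBINARY(50), @b)
         THEN 'equal' ELSE 'different' END AS explicit_length_result;
```

If you do go the varbinary route, specifying the length explicitly avoids this; a binary collation via COLLATE sidesteps the issue entirely.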
Answered by Chase Seibert
What database are you on? With MS SQL Server, it's a database-wide setting, or you can over-ride it per-query with the COLLATE keyword.
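Worth knowing when using the COLLATE keyword: it attaches to an individual expression rather than to the statement as a whole, so different parts of one query can use different collations. A minimal sketch, with a hypothetical table variable and data (the collations assume SQL Server 2008 or newer):

```sql
-- Hypothetical sample data in a case-sensitive column.
DECLARE @people TABLE (name VARCHAR(20) COLLATE Latin1_General_100_CS_AS);
INSERT INTO @people (name) VALUES ('bob'), ('Bob'), ('BOB'), ('alice');

SELECT name
FROM @people
-- Case-insensitive comparison for this predicate only: matches bob, Bob and BOB.
WHERE name = 'bob' COLLATE Latin1_General_100_CI_AS
-- A different (binary) collation for the sort, independent of the WHERE clause.
ORDER BY name COLLATE Latin1_General_100_BIN2;
```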