IN vs OR in the SQL WHERE Clause
Original question: http://stackoverflow.com/questions/3074713/
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license, cite the original address, and attribute it to the original authors (not me) on StackOverflow.
Asked by felix
When dealing with big databases, which performs better, IN or OR in the SQL WHERE clause?
Is there any difference in the way they are executed?
Answer by Mark Byers
I assume you want to know the performance difference between the following:
WHERE foo IN ('a', 'b', 'c')
WHERE foo = 'a' OR foo = 'b' OR foo = 'c'
According to the manual for MySQL, if the values are constant, IN sorts the list and then uses a binary search. I would imagine that OR evaluates them one by one in no particular order. So IN is faster in some circumstances.
The best way to know is to profile both on your database with your specific data to see which is faster.
I tried both on a MySQL table with 1,000,000 rows. When the column is indexed there is no discernible difference in performance: both are nearly instant. When the column is not indexed I got these results:
SELECT COUNT(*) FROM t_inner WHERE val IN (1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000);
1 row fetched in 0.0032 (1.2679 seconds)
SELECT COUNT(*) FROM t_inner WHERE val = 1000 OR val = 2000 OR val = 3000 OR val = 4000 OR val = 5000 OR val = 6000 OR val = 7000 OR val = 8000 OR val = 9000;
1 row fetched in 0.0026 (1.7385 seconds)
So in this case the method using OR is about 30% slower. Adding more terms makes the difference larger. Results may vary on other databases and on other data.
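The advice to profile both forms on your own data can be tried out with a small harness. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory table instead of MySQL, the row count is arbitrary, and the helper names build_table and time_query are made up for illustration. The table and column names follow the queries above. Absolute timings will differ from the MySQL figures and may not even show the same ordering.

```python
import sqlite3
import time


def build_table(n_rows=200_000):
    """Create an in-memory table with an unindexed integer column (assumed size)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t_inner (val INTEGER)")
    conn.executemany(
        "INSERT INTO t_inner (val) VALUES (?)",
        ((i,) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def time_query(conn, sql, repeats=3):
    """Run the query a few times; return (count, best wall-clock seconds)."""
    best = float("inf")
    count = None
    for _ in range(repeats):
        start = time.perf_counter()
        count = conn.execute(sql).fetchone()[0]
        best = min(best, time.perf_counter() - start)
    return count, best


values = list(range(1000, 10000, 1000))  # 1000, 2000, ..., 9000

# Same predicate written two ways: one IN list, one chain of OR terms.
in_sql = "SELECT COUNT(*) FROM t_inner WHERE val IN ({})".format(
    ", ".join(str(v) for v in values)
)
or_sql = "SELECT COUNT(*) FROM t_inner WHERE " + " OR ".join(
    "val = {}".format(v) for v in values
)

conn = build_table()
in_count, in_secs = time_query(conn, in_sql)
or_count, or_secs = time_query(conn, or_sql)
print("IN: count={} best={:.4f}s".format(in_count, in_secs))
print("OR: count={} best={:.4f}s".format(or_count, or_secs))
```

To approximate the indexed case, add a CREATE INDEX on val before timing. To see how the gap changes with more terms, lengthen the values list.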
Answer by Peter Lang
The best way to find out is to look at the execution plan.
I tried it with Oracle, and it was exactly the same.
CREATE TABLE performance_test AS ( SELECT * FROM dba_objects );
SELECT * FROM performance_test
WHERE object_name IN ('DBMS_STANDARD', 'DBMS_REGISTRY', 'DBMS_LOB' );
Even though the query uses IN, the execution plan says that it uses OR:
--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |     8 |  1416 |   163   (2)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| PERFORMANCE_TEST |     8 |  1416 |   163   (2)| 00:00:02 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_NAME"='DBMS_LOB' OR "OBJECT_NAME"='DBMS_REGISTRY' OR
              "OBJECT_NAME"='DBMS_STANDARD')
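The same kind of plan comparison can be sketched without an Oracle instance. The example below is an assumption-laden stand-in: it uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, a tiny hand-made performance_test table in place of dba_objects, and a hypothetical helper named query_plan. SQLite's plan output looks nothing like Oracle's, so the point is only the technique of putting the two plans side by side.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE performance_test (object_name TEXT)")
conn.executemany(
    "INSERT INTO performance_test (object_name) VALUES (?)",
    [("DBMS_STANDARD",), ("DBMS_REGISTRY",), ("DBMS_LOB",), ("OTHER",)],
)

in_sql = (
    "SELECT * FROM performance_test "
    "WHERE object_name IN ('DBMS_STANDARD', 'DBMS_REGISTRY', 'DBMS_LOB')"
)
or_sql = (
    "SELECT * FROM performance_test "
    "WHERE object_name = 'DBMS_STANDARD' "
    "OR object_name = 'DBMS_REGISTRY' "
    "OR object_name = 'DBMS_LOB'"
)

in_plan = query_plan(conn, in_sql)
or_plan = query_plan(conn, or_sql)
print("IN plan:", in_plan)
print("OR plan:", or_plan)
```

If the two printed plans match, the engine treats both spellings alike for that query. If they differ, that is the place to start looking for a performance gap.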
Answer by Alessandro Rossi
The OR operator needs a much more complex evaluation process than the IN construct because it allows many kinds of conditions, not only equality comparisons as IN does.
Here is a list of what you can use with OR but not with IN: greater than, greater than or equal, less than, less than or equal, LIKE, and others such as Oracle's REGEXP_LIKE. In addition, consider that the conditions may not always compare the same value.
For the query optimizer, the IN operator is easier to manage because it is only a construct that defines an OR over multiple conditions that all apply the = operator to the same value. If you use the OR operator, the optimizer may not recognize that you are always using the = operator on the same value. Unless it performs a deeper and much more complex analysis, it may fail to establish that every condition involved is an = comparison on the same value, which precludes optimized search methods such as the binary search already mentioned.
[EDIT] An optimizer may not currently implement an optimized IN evaluation process, but that does not rule out that it will at some point (for example after a database version upgrade). If you use the OR operator, that optimized evaluation will not be used in your case.
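The difference between the two evaluation strategies discussed in this thread can be modeled in a few lines. This is a toy illustration, not the code of any database engine: make_in_matcher and make_or_matcher are hypothetical names, and the constants are arbitrary. The first sorts the constant list once and tests each row value by binary search, the second compares against every term in turn.

```python
from bisect import bisect_left


def make_in_matcher(constants):
    """Sort the constant list once, then test membership by binary search."""
    ordered = sorted(constants)

    def matches(value):
        i = bisect_left(ordered, value)
        return i < len(ordered) and ordered[i] == value

    return matches


def make_or_matcher(constants):
    """Compare against each constant in turn, like a chain of equality tests."""
    terms = list(constants)

    def matches(value):
        for term in terms:
            if value == term:
                return True
        return False

    return matches


constants = [7000, 1000, 9000, 3000, 5000]
in_match = make_in_matcher(constants)
or_match = make_or_matcher(constants)

rows = [1000, 1500, 9000, 42]
print([in_match(v) for v in rows])
print([or_match(v) for v in rows])
```

Both matchers return the same answers. The binary search needs on the order of log n comparisons per row, while the chain needs up to n, which is consistent with the observation that the gap grows as more terms are added.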
Answer by soulmerge
I think Oracle is smart enough to convert the less efficient one (whichever that is) into the other. So I think the answer should rather depend on the readability of each (where I think that IN clearly wins).
Answer by shahkalpesh
OR makes sense (from a readability point of view) when there are fewer values to be compared.
IN is useful especially when you have a dynamic source against which you want values to be compared.
Another alternative is to use a JOIN with a temporary table.
I don't think performance should be a problem, provided you have the necessary indexes.
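The two options for a dynamic list of values, a generated IN list and a JOIN against a temporary table, might look like this. It is a minimal sketch assuming SQLite via Python's sqlite3 module. The wanted list, the wanted_vals temp table, and the index name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_inner (val INTEGER)")
conn.executemany(
    "INSERT INTO t_inner (val) VALUES (?)",
    ((i,) for i in range(10_000)),
)
conn.execute("CREATE INDEX idx_t_inner_val ON t_inner (val)")

# Hypothetical dynamic source of values; 123456 is deliberately absent.
wanted = [1000, 2000, 3000, 123456]

# Option 1: an IN list with one bound placeholder per value.
placeholders = ", ".join("?" for _ in wanted)
in_count = conn.execute(
    "SELECT COUNT(*) FROM t_inner WHERE val IN ({})".format(placeholders),
    wanted,
).fetchone()[0]

# Option 2: load the values into a temporary table and JOIN against it.
conn.execute("CREATE TEMP TABLE wanted_vals (val INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO wanted_vals (val) VALUES (?)",
    ((v,) for v in wanted),
)
join_count = conn.execute(
    "SELECT COUNT(*) FROM t_inner "
    "JOIN wanted_vals ON wanted_vals.val = t_inner.val"
).fetchone()[0]

print(in_count, join_count)
```

Binding the values through placeholders avoids pasting them into the SQL text. The temporary table tends to be the more comfortable choice once the list gets long, since very long placeholder lists can run into engine limits on the number of bound parameters.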
Answer by user3003962
I ran a SQL query with a large number of OR terms (350). Postgres took 437.80 ms.
The same query using IN:
23.18 ms