IN vs OR in the SQL WHERE Clause

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/3074713/

Time: 2020-09-01 06:36:34  Source: igfitidea

sql database

Asked by felix

When dealing with big databases, which performs better, IN or OR in the SQL WHERE clause?

Is there any difference in the way they are executed?

Answered by Mark Byers

I assume you want to know the performance difference between the following:

WHERE foo IN ('a', 'b', 'c')
WHERE foo = 'a' OR foo = 'b' OR foo = 'c'

According to the manual for MySQL, if the values are constant, IN sorts the list and then uses a binary search. I would imagine that OR evaluates them one by one in no particular order. So IN is faster in some circumstances.
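
The difference between the two evaluation strategies can be sketched in a few lines of Python. This is only an illustration of the idea (a sorted constant list searched with binary search versus comparisons made one by one), not MySQL's actual implementation; the function names matches_in and matches_or are made up for this example.

```python
from bisect import bisect_left


def matches_in(value, sorted_constants):
    """IN-style check: binary search over a pre-sorted constant list."""
    i = bisect_left(sorted_constants, value)
    return i < len(sorted_constants) and sorted_constants[i] == value


def matches_or(value, constants):
    """OR-style check: compare against each constant in turn."""
    for c in constants:
        if value == c:
            return True
    return False


constants = ['c', 'a', 'b']
sorted_constants = sorted(constants)  # sorted once, before any row is examined

rows = ['a', 'x', 'c', 'z']  # hypothetical values of column foo
in_hits = [r for r in rows if matches_in(r, sorted_constants)]
or_hits = [r for r in rows if matches_or(r, constants)]
print(in_hits, or_hits)
```

Both checks select the same rows; with N constants the binary search needs on the order of log N comparisons per row, while the sequential check may need up to N.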

The best way to know is to profile both on your database with your specific data to see which is faster.

I tried both on a MySQL table with 1,000,000 rows. When the column is indexed there is no discernible difference in performance: both are nearly instant. When the column is not indexed I got these results:

SELECT COUNT(*) FROM t_inner WHERE val IN (1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000);
1 row fetched in 0.0032 (1.2679 seconds)

SELECT COUNT(*) FROM t_inner WHERE val = 1000 OR val = 2000 OR val = 3000 OR val = 4000 OR val = 5000 OR val = 6000 OR val = 7000 OR val = 8000 OR val = 9000;
1 row fetched in 0.0026 (1.7385 seconds)

So in this case the method using OR is about 30% slower. Adding more terms makes the difference larger. Results may vary on other databases and on other data.
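
A comparison like the one above can be reproduced with a small timing harness. The following is a minimal sketch that uses Python's built-in sqlite3 module with an in-memory table, so it does not measure MySQL and its timings will not match the figures quoted here. The table name t_inner and the value list come from the queries above; the row count and the helper names build_table and time_query are assumptions made for the example.

```python
import sqlite3
import time


def build_table(n_rows=100_000):
    """Create an in-memory table with one unindexed integer column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t_inner (val INTEGER)")  # deliberately no index
    conn.executemany("INSERT INTO t_inner (val) VALUES (?)",
                     ((i,) for i in range(n_rows)))
    conn.commit()
    return conn


def time_query(conn, sql, params, repeats=5):
    """Run the query several times; return its result and the best wall time."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = conn.execute(sql, params).fetchone()[0]
        best = min(best, time.perf_counter() - start)
    return result, best


values = list(range(1000, 10000, 1000))  # 1000, 2000, ..., 9000
placeholders = ", ".join("?" for _ in values)
in_sql = f"SELECT COUNT(*) FROM t_inner WHERE val IN ({placeholders})"
or_sql = ("SELECT COUNT(*) FROM t_inner WHERE "
          + " OR ".join("val = ?" for _ in values))

conn = build_table()
in_count, in_time = time_query(conn, in_sql, values)
or_count, or_time = time_query(conn, or_sql, values)
print(f"IN: count={in_count}, best of 5 = {in_time:.4f}s")
print(f"OR: count={or_count}, best of 5 = {or_time:.4f}s")
```

To test a real workload, point the same two queries at your own database and data; a different engine may well rewrite both forms into the same plan.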

Answered by Peter Lang

The best way to find out is to look at the execution plan.

I tried it with Oracle, and it was exactly the same.

CREATE TABLE performance_test AS ( SELECT * FROM dba_objects );

SELECT * FROM performance_test
WHERE object_name IN ('DBMS_STANDARD', 'DBMS_REGISTRY', 'DBMS_LOB' );

Even though the query uses IN, the Execution Plan says that it uses OR:

--------------------------------------------------------------------------------------    
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |    
--------------------------------------------------------------------------------------    
|   0 | SELECT STATEMENT  |                  |     8 |  1416 |   163   (2)| 00:00:02 |    
|*  1 |  TABLE ACCESS FULL| PERFORMANCE_TEST |     8 |  1416 |   163   (2)| 00:00:02 |    
--------------------------------------------------------------------------------------    

Predicate Information (identified by operation id):                                       
---------------------------------------------------                                       

   1 - filter("OBJECT_NAME"='DBMS_LOB' OR "OBJECT_NAME"='DBMS_REGISTRY' OR                
              "OBJECT_NAME"='DBMS_STANDARD')                                              

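The same kind of check can be scripted on other engines. As a rough sketch, the snippet below asks SQLite (through Python's built-in sqlite3 module) for its query plan for both forms. SQLite's EXPLAIN QUERY PLAN output differs in format from the Oracle plan above and varies between SQLite versions, so the plans are only printed, not interpreted. The single-column table definition and the sample rows are assumptions for the example, standing in for the copy of dba_objects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the Oracle test table: only the filtered column.
conn.execute("CREATE TABLE performance_test (object_name TEXT)")
conn.executemany("INSERT INTO performance_test VALUES (?)",
                 [("DBMS_STANDARD",), ("DBMS_LOB",), ("OTHER_OBJECT",)])

in_sql = ("SELECT * FROM performance_test WHERE object_name "
          "IN ('DBMS_STANDARD', 'DBMS_REGISTRY', 'DBMS_LOB')")
or_sql = ("SELECT * FROM performance_test WHERE object_name = 'DBMS_STANDARD' "
          "OR object_name = 'DBMS_REGISTRY' OR object_name = 'DBMS_LOB'")


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


in_plan = plan(in_sql)
or_plan = plan(or_sql)
print("IN plan:", in_plan)
print("OR plan:", or_plan)

# Whatever the plans look like, both forms must return the same rows.
in_rows = sorted(conn.execute(in_sql).fetchall())
or_rows = sorted(conn.execute(or_sql).fetchall())
```
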
Answered by Alessandro Rossi

The OR operator needs a much more complex evaluation process than the IN construct, because it allows many kinds of conditions, not only equality comparisons as IN does.

Here is a list of what you can use with OR but that is not compatible with IN: greater than, greater than or equal, less than, less than or equal, LIKE, and some others such as Oracle's REGEXP_LIKE. In addition, consider that the conditions may not always compare the same value.
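
To make the distinction concrete, here is a small sketch using Python's built-in sqlite3 module. The items table and its contents are invented for the example. The first two queries are interchangeable, while the third combines a range test and a LIKE on different columns, which has no IN equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, price INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [("apple", 5), ("apricot", 50), ("banana", 20), ("cherry", 100)])


def names(sql):
    """Run a query that selects a single column and return it as a list."""
    return [row[0] for row in conn.execute(sql)]


# IN is shorthand for OR-ed equality tests on one and the same column.
in_names = names("SELECT name FROM items WHERE price IN (5, 20) ORDER BY name")
or_names = names("SELECT name FROM items WHERE price = 5 OR price = 20 ORDER BY name")

# OR can also combine different operators on different columns.
mixed_names = names(
    "SELECT name FROM items WHERE price >= 100 OR name LIKE 'ap%' ORDER BY name")

print(in_names, or_names, mixed_names)
```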

For the query optimizer it is easier to handle the IN operator, because it is only a construct that defines an OR over multiple conditions using the = operator on the same value. If you use the OR operator, the optimizer may not recognize that you are always using the = operator on the same value. Unless it performs a deeper and much more complex analysis, it may fail to establish that all the conditions involved use only = on the same value, which precludes optimized search methods such as the binary search already mentioned.

[EDIT] A given optimizer may not currently implement an optimized IN evaluation, but that does not rule out it being added later (for example, with a database version upgrade). If you use the OR operator, that optimized evaluation will not be applied in your case.

Answered by soulmerge

I think Oracle is smart enough to convert the less efficient one (whichever that is) into the other. So I think the answer should rather depend on the readability of each (where I think that IN clearly wins).

Answered by shahkalpesh

OR makes sense (from a readability point of view) when there are fewer values to be compared. IN is useful especially when you have a dynamic source against which you want values to be compared.

Another alternative is to use a JOIN with a temporary table.
I don't think performance should be a problem, provided you have the necessary indexes.
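
Both ideas can be sketched with Python's built-in sqlite3 module: building an IN list from a dynamic source with bound parameters, and loading the same values into a temporary table to JOIN against. The table name t_inner is borrowed from an earlier answer; the wanted list, the wanted_vals temporary table, and the index name are assumptions made for this example, and other databases use different placeholder and temporary-table syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_inner (val INTEGER)")
conn.executemany("INSERT INTO t_inner VALUES (?)", ((i,) for i in range(10_000)))
conn.execute("CREATE INDEX idx_t_inner_val ON t_inner (val)")

# Values arriving at runtime, e.g. from user input (hypothetical).
wanted = [1000, 2000, 3000, 123456]

# Option 1: generate one placeholder per value and bind the list to IN.
placeholders = ", ".join("?" for _ in wanted)
in_count = conn.execute(
    f"SELECT COUNT(*) FROM t_inner WHERE val IN ({placeholders})", wanted
).fetchone()[0]

# Option 2: load the values into a temporary table and JOIN against it.
conn.execute("CREATE TEMP TABLE wanted_vals (val INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_vals VALUES (?)", ((v,) for v in wanted))
join_count = conn.execute(
    "SELECT COUNT(*) FROM t_inner JOIN wanted_vals ON t_inner.val = wanted_vals.val"
).fetchone()[0]

print(in_count, join_count)
```

Only the placeholders are formatted into the SQL string; the values themselves are always passed as bound parameters.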

Answered by user3003962

I ran a SQL query with a large number of OR conditions (350). Postgres executed it in 437.80ms.

Use OR

Now use IN:

Use IN

23.18ms