Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/37828398/

Time: 2020-10-21 02:18:35  Source: igfitidea

Performance differences between equal (=) and IN with one literal value

mysql sql postgresql equals-operator in-operator

Asked by Somnath Muluk

How do SQL engines differ when the equals sign and the IN operator are used with the same single value? Does the execution time change?

The first uses the equality operator:

WHERE column_value = 'All'

The second uses the IN operator with a single value:

WHERE column_value IN ('All')

Does the SQL engine change IN to = if there is only one value?

Is there any difference in this respect between MySQL and PostgreSQL?

Answer by sagi

There is no difference between those two statements, and the optimiser will transform the IN into an = when the IN has just one element in it.

That said, when you have a question like this, just run both statements, look at their execution plans, and compare them. Here, you won't find any differences.
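A minimal sketch of that advice, for anyone who wants to try it without a database server. It uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN only because they run anywhere; the table `items`, the index `idx_column_value`, and the helper `plan_details` are made-up names for illustration. SQLite is not MySQL or PostgreSQL, so treat this as a pattern and run the equivalent EXPLAIN on your own engine.

```python
import sqlite3

# Hypothetical in-memory table with an index on the filtered column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, column_value TEXT)")
conn.execute("CREATE INDEX idx_column_value ON items (column_value)")
conn.executemany(
    "INSERT INTO items (column_value) VALUES (?)",
    [("All",), ("Some",), ("None",), ("All",)],
)


def plan_details(sql):
    """Return the human-readable 'detail' column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


eq_sql = "SELECT * FROM items WHERE column_value = 'All'"
in_sql = "SELECT * FROM items WHERE column_value IN ('All')"

# Collect the plan and the result set for each form of the predicate.
eq_plan = plan_details(eq_sql)
in_plan = plan_details(in_sql)
eq_rows = sorted(conn.execute(eq_sql).fetchall())
in_rows = sorted(conn.execute(in_sql).fetchall())

print("=  plan:", eq_plan)
print("IN plan:", in_plan)
print("same rows:", eq_rows == in_rows)
```

Compare the two printed plans by eye; if they match on your engine too, the choice between = and IN is purely a matter of style.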

After a long search online, I found a document on SQL that supports this (I assume it applies to all DBMSs):

If there is only one value inside the parentheses, this command is equivalent to

WHERE "column_name" = 'value1'

Here is the link to the document.

Here is the execution plan of both queries in Oracle (most DBMSs will process this the same way):

EXPLAIN PLAN FOR
select * from dim_employees t
where t.identity_number = '123456789';

Plan hash value: 2312174735
-----------------------------------------------------
| Id  | Operation                   | Name          |
-----------------------------------------------------
|   0 | SELECT STATEMENT            |               |
|   1 |  TABLE ACCESS BY INDEX ROWID| DIM_EMPLOYEES |
|   2 |   INDEX UNIQUE SCAN         | SYS_C0029838  |
-----------------------------------------------------

And for IN():

EXPLAIN PLAN FOR
select * from dim_employees t
where t.identity_number in('123456789');

Plan hash value: 2312174735
-----------------------------------------------------
| Id  | Operation                   | Name          |
-----------------------------------------------------
|   0 | SELECT STATEMENT            |               |
|   1 |  TABLE ACCESS BY INDEX ROWID| DIM_EMPLOYEES |
|   2 |   INDEX UNIQUE SCAN         | SYS_C0029838  |
-----------------------------------------------------

As you can see, both are identical. This is on an indexed column. The same goes for an unindexed column (just a full table scan).

Answer by Rahul Tripathi

There is no difference when you use it with a single value. If you check the table scan, index scan, or index seek for the two queries above, you will find no difference between them.

Is there any difference in this respect between MySQL and PostgreSQL?

No, there would be no difference between the two engines (in fact, it would be the same for most databases, including SQL Server, Oracle, etc.). Both engines will convert IN to =.

Answer by John

There are no big differences really, but if your column_value is indexed, the IN operator may not use the index.

I encountered this problem once, so be careful.

Answer by alexis

Teach a man to fish, etc. Here's how to see for yourself what variations on your queries will do:

mysql> EXPLAIN SELECT * FROM sentence WHERE sentence_lang_id = "AMH"\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sentence
         type: ref
possible_keys: sentence_lang_id
          key: sentence_lang_id
      key_len: 153
          ref: const
         rows: 442
        Extra: Using where

And let's try it the other way:

mysql> EXPLAIN SELECT * FROM sentence WHERE sentence_lang_id in ("AMH")\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sentence
         type: ref
possible_keys: sentence_lang_id
          key: sentence_lang_id
      key_len: 153
          ref: const
         rows: 442
        Extra: Using where

You can read here about how to interpret the results of a MySQL EXPLAIN request. For now, note that we got identical output for both queries: exactly the same "execution plan" is generated. The type row tells us that the query uses a non-unique index (a foreign key, in this case), and the ref row tells us that the query is executed by comparing a constant value against this index.

Answer by TheGameiswar

For an IN clause with a single value, there is no difference. Below is a demo using an EMPS table I have:

select * from emps where empid in (1)
select * from emps where empid=1

Predicate for the first query in the execution plan:

[PerformanceV3].[dbo].[Emps].[empID]=CONVERT_IMPLICIT(int,[@1],0)

Predicate for the second query in the execution plan:

[PerformanceV3].[dbo].[Emps].[empID]=CONVERT_IMPLICIT(int,[@1],0)

If you have multiple values in the IN clause, it's better to convert them to joins.
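For reference, here is a rough sketch of what that rewrite looks like. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, and the temp table `wanted_ids` and the sample data are invented for the example. It only shows that the two forms return the same rows; whether the join is actually faster depends on the engine and the data, so check the plans yourself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emps (empid INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO emps VALUES (?, ?)",
    [(i, f"emp{i}") for i in range(1, 11)],
)

# Hypothetical lookup table holding the wanted ids. The primary key keeps the
# ids unique, so the join cannot return an employee more than once.
conn.execute("CREATE TEMP TABLE wanted_ids (empid INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_ids VALUES (?)", [(1,), (3,), (7,)])

# Form 1: a literal list inside IN.
in_rows = conn.execute(
    "SELECT empid, name FROM emps WHERE empid IN (1, 3, 7) ORDER BY empid"
).fetchall()

# Form 2: the same filter expressed as a join against the lookup table.
join_rows = conn.execute(
    "SELECT e.empid, e.name FROM emps AS e "
    "JOIN wanted_ids AS w ON w.empid = e.empid ORDER BY e.empid"
).fetchall()

print(in_rows)
print(join_rows)
```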

Answer by Adam Martin

Just to add a different perspective, one of the main points of RDBMS systems is that they will rewrite your query for you and pick the best execution plan for that query and all equivalent ones. This means that as long as two queries are logically identical, they should always generate the same execution plan on a given RDBMS.

That being said, many queries are equivalent (same result set) only because of constraints the database itself is unaware of, so be careful about those cases (e.g. for a flag field with numbers 1-6, the DB doesn't know that <3 is the same as IN (1,2)). But at the end of the day, if you're just thinking about the legibility of AND and OR statements, it won't make a difference for performance which way you write them.
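To make the flag example concrete, here is a small sketch using Python's sqlite3 module. The table `records` and the helper `ids` are hypothetical. The two predicates agree only while the data happens to stay in the 1-6 range, which is something the database cannot assume without a constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No CHECK constraint: the database has no idea flag is meant to stay in 1-6.
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, flag INTEGER)")
conn.executemany(
    "INSERT INTO records (flag) VALUES (?)",
    [(f,) for f in range(1, 7)],
)


def ids(where):
    """Return the ids matching a WHERE clause, in id order."""
    sql = "SELECT id FROM records WHERE " + where + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# With only flags 1-6 present, both predicates select the same rows.
same_on_valid_data = ids("flag < 3") == ids("flag IN (1, 2)")

# One stray row outside the assumed range and the predicates diverge.
conn.execute("INSERT INTO records (flag) VALUES (0)")
same_after_stray_row = ids("flag < 3") == ids("flag IN (1, 2)")

print(same_on_valid_data, same_after_stray_row)
```

Since the optimizer cannot rely on the first situation holding, it has to treat the two predicates as different queries.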

Answer by Yossi Orsen

You will need to run the execution plan on both and see the results.

I believe they will have the same execution plan, as IN() will be performed the same as a normal = sign when only one value is placed inside it.

There is no reason for the optimizer to behave any differently on a query like this.
