SQL 为什么临时表和子选择之间存在巨大的性能差异
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/16767645/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Why is there a HUGE performance difference between temp table and subselect
提问by Ward
This is a question about SQL Server 2008 R2
这是一个关于 SQL Server 2008 R2 的问题
I'm not a DBA, by far. I'm a java developer, who has to write SQL from time to time. (mostly embedded in code). I want to know if I did something wrong here, and if so, what I can do to avoid it to happen again.
到目前为止,我不是 DBA。我是一名java开发人员,时不时要写SQL。(主要嵌入在代码中)。我想知道我是否在这里做错了什么,如果是,我可以做些什么来避免它再次发生。
Q1:
问题 1:
SELECT something FROM (SELECT * FROM T1 WHERE condition1) JOIN ...
Q1 features 14 joins
Q1 有 14 个连接
Q2 is the same as Q1, with one exception. (SELECT * FROM T1 WHERE condition1) is executed before, and stored in a temp table.
Q2 与 Q1 相同,只有一个例外。(SELECT * FROM T1 WHERE condition1) 之前执行,并存储在临时表中。
This is not a correlated sub-query.
这不是相关的子查询。
Q2:
问题 2:
SELECT * INTO #tempTable FROM T1 WHERE condition1
SELECT something FROM #tempTable JOIN ...
again, 14 joins.
再次,14 加入。
The thing that puzzles me now is that Q1 took > 2min, (tried it a few times, to avoid caching to play a role) while Q2 (both queries combined) took 2sec!!! What gives?
现在让我感到困惑的是 Q1 花费了 > 2 分钟,(尝试了几次,以避免缓存起作用)而 Q2(两个查询相结合)花费了 2 秒!!!是什么赋予了?
回答by Karthik AMR
Why it's not recommended to use subqueries?
为什么不建议使用子查询?
Database Optimizer (regardless of what database you are using) can not always properly optimize such query (with subqueries). In this case, the problem to the optimizer is to choose the right way to join result sets. There are several algorithms for joining two result sets. The choice of the algorithm depends on the number of records which are contained in one and in the other result set. In case if you join two physical tables (subquery is not a physical table), the database can easily determine the amount of data in two result sets by the available statistics. If one of result sets is a subquery then to understand how many records it returns is very difficult. In this case the database can choose wrong query plan of join, so that will lead to a dramatic reduction in the performance of the query.
数据库优化器(无论您使用什么数据库)并不总是能正确优化此类查询(使用子查询)。在这种情况下,优化器的问题是选择正确的方式来连接结果集。有几种算法可以连接两个结果集。算法的选择取决于一个结果集中和另一个结果集中包含的记录数。如果连接两个物理表(子查询不是物理表),数据库可以通过可用的统计数据轻松确定两个结果集中的数据量。如果结果集之一是子查询,那么了解它返回多少条记录是非常困难的。在这种情况下,数据库可以选择错误的join查询计划,从而导致查询性能的急剧下降。
Rewriting the query with using temporary tables is intended to simplify the database optimizer. In the rewritten query all result sets participating in joins will be physical tables and the database will easily determine the length of each result set. This will allow the database to choose the guaranteed fastest of all possible query plans. Moreover, the database will make the right choice no matter what are the conditions. The rewritten query with temporary tables would work well on any database, this is especially important in the development of portable solutions. In addition, the rewritten query is easier to read, easier to understand and to debug.
使用临时表重写查询旨在简化数据库优化器。在重写的查询中,所有参与连接的结果集都将是物理表,数据库将很容易确定每个结果集的长度。这将允许数据库选择所有可能的查询计划中保证最快的。而且,无论什么条件,数据库都会做出正确的选择。使用临时表重写的查询在任何数据库上都可以很好地工作,这在可移植解决方案的开发中尤其重要。此外,重写的查询更易于阅读、更易于理解和调试。
It is understood that rewriting the query with temporary tables can lead to some slowdown due to additional expenses: creation of temporary tables. If the database will not be mistaken with the choice of the query plan, it will perform the old query faster than a new one. However, this slowdown will always be negligible. Typically the creation of a temporary table takes a few milliseconds. That is, the delay can not have a significant impact on system performance, and usually can be ignored.
据了解,使用临时表重写查询会导致一些速度变慢,因为额外开销:临时表的创建。如果数据库不会错误地选择查询计划,它将比新查询更快地执行旧查询。但是,这种放缓将始终可以忽略不计。通常,临时表的创建需要几毫秒。即延迟不能对系统性能产生显着影响,通常可以忽略。
Important! Do not forget to create indexes for temporary tables. The index fields should include all fields that are used in join conditions.
重要的!不要忘记为临时表创建索引。索引字段应包括连接条件中使用的所有字段。
回答by Yaroslav
There are lot of things to tackle here, indexes, execution plans, etc. Testing and comparing results is the way to go.
这里有很多事情要处理,索引、执行计划等。测试和比较结果是要走的路。
You could take a look to the usual suspects, indexes. Take a look into the execution plan and compare them. Make sure the WHERE
clause is using the correct ones. Ensure you are using the indexes on your JOINs
.
你可以看看通常的嫌疑人,索引。查看执行计划并比较它们。确保该WHERE
条款使用正确的条款。确保您正在使用JOINs
.
These answers sure will help you a lot.
这些答案肯定会对你有很大帮助。