SQL 哪个性能更好:派生表或临时表
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/2326395/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Which one have better performance : Derived Tables or Temporary Tables
提问by masoud ramezani
Sometimes we can write a query with both derived table and temporary table. my question is that which one is better? why?
有时我们可以同时使用派生表和临时表编写查询。我的问题是哪个更好?为什么?
采纳答案by Quassnoi
Derived table is a logical construct.
派生表是一个逻辑结构。
It may be stored in the tempdb
, built at runtime by reevaluating the underlying statement each time it is accessed, or even optimized out at all.
它可以存储在 中tempdb
,在运行时通过在每次访问时重新评估底层语句来构建,甚至完全优化。
Temporary table is a physical construct. It is a table in tempdb
that is created and populated with the values.
临时表是一种物理结构。它是一个在tempdb
其中创建并填充值的表。
Which one is better depends on the query they are used in, the statement that is used to derive a table, and many other factors.
哪个更好取决于它们所使用的查询、用于派生表的语句以及许多其他因素。
For instance, CTE
(common table expressions) in SQL Server
can (and most probably will) be reevaluated each time they are used. This query:
例如,CTE
(公共表表达式) inSQL Server
可以(并且很可能会)在每次使用时重新评估。这个查询:
WITH q (uuid) AS
(
SELECT NEWID()
)
SELECT *
FROM q
UNION ALL
SELECT *
FROM q
will most probablyyield two different NEWID()
's.
将最有可能产生两种不同NEWID()
的。
In this case, a temporary table should be used since it guarantees that its values persist.
在这种情况下,应使用临时表,因为它可以保证其值保持不变。
On the other hand, this query:
另一方面,这个查询:
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
FROM master
) q
WHERE rn BETWEEN 80 AND 100
is better with a derived table, because using a temporary table will require fetching all values from master
, while this solution will just scan the first 100
records using the index on id
.
使用派生表更好,因为使用临时表将需要从 获取所有值master
,而此解决方案将仅100
使用 上的索引扫描第一条记录id
。
回答by onupdatecascade
It depends on the circumstances.
这取决于具体情况。
Advantages of derived tables:
派生表的优点:
A derived table is part of a larger, single query, and will be optimized in the context of the rest of the query. This can be an advantage, if the query optimization helps performance (it usually does, with some exceptions). Example: if you populate a temp table, then consume the results in a second query, you are in effect tying the database engine to one execution method (run the first query in its entirety, save the whole result, run the second query) where with a derived table the optimizer might be able to find a faster execution method or access path.
A derived table only "exists" in terms of the query execution plan - it's purely a logical construct. There really is no table.
派生表是更大的单个查询的一部分,将在查询的其余部分的上下文中进行优化。这可能是一个优势,如果查询优化有助于提高性能(通常会,但有一些例外)。示例:如果您填充临时表,然后在第二个查询中使用结果,您实际上将数据库引擎绑定到一个执行方法(完整运行第一个查询,保存整个结果,运行第二个查询)其中使用派生表,优化器可能能够找到更快的执行方法或访问路径。
派生表仅在查询执行计划方面“存在” - 它纯粹是一个逻辑构造。真的没有桌子。
Advantages of temp tables
临时表的优点
The table "exists" - that is, it's materialized as a table, at least in memory, which contains the result set and can be reused.
In some cases, performance can be improved or blocking reduced when you have to perform some elaborate transformation on the data - for example, if you want to fetch a 'snapshot' set of rows out of a base table that is busy, and then do some complicated calculation on that set, there can be less contention if you get the rows out of the base table and unlock it as quickly as possible, then do the work independently. In some cases the overhead of a real temp table is small relative to the advantage in concurrency.
表“存在”——也就是说,它被具体化为一个表,至少在内存中,它包含结果集并且可以重用。
在某些情况下,当您必须对数据执行一些复杂的转换时,可以提高性能或减少阻塞 - 例如,如果您想从繁忙的基表中获取一组“快照”行,然后执行在该集合上进行一些复杂的计算,如果您从基表中取出行并尽快解锁,然后独立完成工作,则争用可能会减少。在某些情况下,相对于并发优势而言,真实临时表的开销很小。
回答by Jim Clark
I want to add an anecdote here as it leads me to advise the opposite of the accepted answer. I agree with the thinking presented in the accepted answer but it is mostly theoretical. My experience has lead me to recommend temp tables over derived tables, common table expressions and table value functions. We used derived tables and common table expressions extensively with much success based on thoughts consistent with the accepted answer until we started dealing with larger result sets and/or more complex queries. Then we found that the optimizer did not optimize well with the derived table or CTE.
我想在这里添加一个轶事,因为它使我建议与接受的答案相反。我同意接受的答案中提出的想法,但它主要是理论性的。我的经验使我推荐临时表而不是派生表、公共表表达式和表值函数。在我们开始处理更大的结果集和/或更复杂的查询之前,我们广泛使用派生表和公用表表达式,并根据与公认答案一致的想法取得了很大成功。然后我们发现优化器对派生表或者CTE优化不好。
I looked at an example today that ran for 10:15. I inserted the results from the derived table into a temp table and joined the temp table in the main query and the total time dropped to 0:03. Usually when we see a big performance problem we can quickly address it this way. For this reason I recommend temp tables unless your query is relatively simple and you are certain it will not be processing large data sets.
我今天看了一个例子,它跑了 10:15。我将派生表的结果插入到临时表中,并在主查询中加入临时表,总时间下降到 0:03。通常当我们看到一个很大的性能问题时,我们可以通过这种方式快速解决它。出于这个原因,我建议使用临时表,除非您的查询相对简单并且您确定它不会处理大型数据集。
回答by paparazzo
The big difference is that you can put constraints including a primary key on a temporary table. For big (I mean millions of records) sometime you can get better performance with temporary. I have the key query that needs 5 joins (each joins happens to be similar). Performance was OK with 2 joins and then on the third performance went bad and query plan went crazy. Even with hints I could not correct the query plan. Tried restructuring the joins as derived tables and still same performance issues. With with temporary tables can create a primary key (then when I populate first sort on PK). When SQL could join the 5 tables and use the PK performance went from minutes to seconds. I wish SQL would support constraints on derived tables and CTE (even if only a PK).
最大的区别在于您可以在临时表上放置包括主键在内的约束。对于大型(我的意思是数百万条记录),有时您可以通过临时获得更好的性能。我有需要 5 个连接的关键查询(每个连接恰好相似)。2 次连接时性能还可以,然后在第三次连接时性能变差,查询计划变得疯狂。即使有提示,我也无法更正查询计划。尝试将连接重组为派生表,但仍然存在相同的性能问题。使用临时表可以创建主键(然后当我在 PK 上填充第一次排序时)。当 SQL 可以加入 5 个表并使用 PK 时,性能从几分钟缩短到几秒钟。我希望 SQL 支持对派生表和 CTE 的约束(即使只有一个 PK)。