Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/7221373/

Time: 2020-09-01 11:54:50  Source: igfitidea

SQL Server query performance - removing need for Hash Match (Inner Join)

sql performance sql-server-2008

Asked by Tim Peel

I have the following query, which is doing very little and is an example of the kind of joins I am doing throughout the system.

select t1.PrimaryKeyId, t1.AdditionalColumnId
from TableOne t1
    join TableTwo t2 on t1.ForeignKeyId = t2.PrimaryKeyId
    join TableThree t3 on t1.PrimaryKeyId = t3.ForeignKeyId
    join TableFour t4 on t3.ForeignKeyId = t4.PrimaryKeyId
    join TableFive t5 on t4.ForeignKeyId = t5.PrimaryKeyId
where 
    t1.StatusId = 1
    and t5.TypeId = 68

There are indexes on all the join columns, but the performance is not great. Inspecting the query plan reveals a lot of Hash Match (Inner Join) operators where I would really like to see Nested Loops joins.
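To put numbers on "not great" before changing anything, one option is to turn on SQL Server's per-query I/O and timing output and run the query with the actual execution plan enabled. This is only a sketch of how the measurement could be taken; it reuses the query above unchanged:

```sql
-- Report logical/physical reads per table and CPU/elapsed time for this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

select t1.PrimaryKeyId, t1.AdditionalColumnId
from TableOne t1
    join TableTwo t2 on t1.ForeignKeyId = t2.PrimaryKeyId
    join TableThree t3 on t1.PrimaryKeyId = t3.ForeignKeyId
    join TableFour t4 on t3.ForeignKeyId = t4.PrimaryKeyId
    join TableFive t5 on t4.ForeignKeyId = t5.PrimaryKeyId
where
    t1.StatusId = 1
    and t5.TypeId = 68;

-- Switch the extra output off again.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The read counts per table appear in the Messages output and give a baseline to compare against after any index or query change.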

The number of records in each table is as follows:

select count(*) from TableOne    = 64393
select count(*) from TableTwo    = 87245
select count(*) from TableThree  = 97141
select count(*) from TableFour   = 116480
select count(*) from TableFive   = 62

What is the best way to improve the performance of this type of query?

Answered by gbn

First thoughts:

  1. Change to EXISTS (changes equi-join to semi-join)
  2. You need to have indexes on t1.StatusId and t5.TypeId, and INCLUDE t1.AdditionalColumnId

I wouldn't worry about your join method yet...

Personally, I've never used a JOIN hint. They only work for the data, indexes and statistics you have at that point in time. As these change, your JOIN hint limits the optimiser.

select t1.PrimaryKeyId, t1.AdditionalColumnId
from
    TableOne t1
where 
    t1.StatusId = 1
    AND EXISTS (SELECT *
        FROM
          TableThree t3
          join TableFour t4 on t3.ForeignKeyId = t4.PrimaryKeyId
          join TableFive t5 on t4.ForeignKeyId = t5.PrimaryKeyId
        WHERE
          t1.PrimaryKeyId = t3.ForeignKeyId
          AND
          t5.TypeId = 68)
    AND EXISTS (SELECT *
        FROM
          TableTwo t2
        WHERE
          t1.ForeignKeyId = t2.PrimaryKeyId)
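
One thing to check before swapping joins for EXISTS: the two forms only return the same rows when each TableOne row matches at most one row on the other side. A small illustration with made-up table variables (@Parent and @Child are hypothetical, not part of the question's schema):

```sql
-- Hypothetical miniature data, for illustration only.
DECLARE @Parent TABLE (PrimaryKeyId int PRIMARY KEY);
DECLARE @Child  TABLE (ChildId int PRIMARY KEY, ForeignKeyId int NOT NULL);

INSERT INTO @Parent (PrimaryKeyId) VALUES (1), (2);
INSERT INTO @Child (ChildId, ForeignKeyId) VALUES (10, 1), (11, 1);

-- Inner join: parent 1 is returned once per matching child row (two rows).
SELECT p.PrimaryKeyId
FROM @Parent p
    JOIN @Child c ON c.ForeignKeyId = p.PrimaryKeyId;

-- Semi-join: parent 1 is returned once, however many children match (one row).
SELECT p.PrimaryKeyId
FROM @Parent p
WHERE EXISTS (SELECT * FROM @Child c WHERE c.ForeignKeyId = p.PrimaryKeyId);
```

If the original query relied on those repeated rows, the EXISTS version changes the result; if the repeats were unwanted anyway, the semi-join also saves the work of producing them.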

Index for TableOne, one of:

  • (StatusId, ForeignKeyId) INCLUDE (AdditionalColumnId)
  • (ForeignKeyId, StatusId) INCLUDE (AdditionalColumnId)

Index for TableFive: probably (TypeId, PrimaryKeyId)
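
As DDL, the suggested indexes might look roughly like this. The index names are invented for the example, and the column is written as StatusId to match the question's query; only the first of the two TableOne options is shown:

```sql
-- Hypothetical index names; columns follow the question's schema.
-- Covers the StatusId filter, the join to TableTwo, and the selected column.
CREATE NONCLUSTERED INDEX IX_TableOne_StatusId_ForeignKeyId
    ON TableOne (StatusId, ForeignKeyId)
    INCLUDE (AdditionalColumnId);

-- Lets the TypeId = 68 filter seek and hand back the key used by the join.
CREATE NONCLUSTERED INDEX IX_TableFive_TypeId_PrimaryKeyId
    ON TableFive (TypeId, PrimaryKeyId);
```

If PrimaryKeyId is the clustered index key, it is carried in every nonclustered index row anyway, so listing it in the TableFive index is harmless but may be redundant. Which TableOne column order works better depends on how selective StatusId = 1 is, so it is worth testing both.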

Edit: updated JOINS and EXISTS to match question fixes

Answered by Andomar

SQL Server is pretty good at optimizing queries, but it's also conservative: it optimizes queries for the worst case. A loop join typically results in an index lookup and a bookmark lookup for every row. Because loop joins cause dramatic degradation for large sets, SQL Server is hesitant to use them unless it's sure about the number of rows.
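
Since the choice between hash and loop joins hinges on row estimates, it may be worth checking how fresh the statistics are before reaching for hints. A minimal sketch, assuming TableOne resolves in the default schema; FULLSCAN is just one sampling option and can be expensive on large tables:

```sql
-- When was each statistics object on TableOne last updated?
SELECT s.name AS stats_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats s
WHERE s.object_id = OBJECT_ID('TableOne');

-- Rebuild the statistics from every row rather than a sample.
UPDATE STATISTICS TableOne WITH FULLSCAN;
```

Comparing estimated and actual row counts in the actual execution plan afterwards shows whether the optimizer's guesses have improved.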

You can use the forceseek query hint to force an index lookup:

inner join TableTwo t2 with (FORCESEEK) on t1.ForeignKeyId = t2.PrimaryKeyId

Alternatively, you can force a loop join with the loop keyword:

inner LOOP join TableTwo t2 on t1.ForeignKeyId = t2.PrimaryKeyId

Query hints limit SQL Server's freedom, so it can no longer adapt to changed circumstances. It's best practice to avoid query hints unless there is a business need that cannot be met without them.
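
For a one-off comparison while testing, a query-level hint can show what an all-nested-loops plan would cost without editing every join. This is a sketch for diagnosis rather than a recommendation for production; it reuses the query from the question:

```sql
select t1.PrimaryKeyId, t1.AdditionalColumnId
from TableOne t1
    join TableTwo t2 on t1.ForeignKeyId = t2.PrimaryKeyId
    join TableThree t3 on t1.PrimaryKeyId = t3.ForeignKeyId
    join TableFour t4 on t3.ForeignKeyId = t4.PrimaryKeyId
    join TableFive t5 on t4.ForeignKeyId = t5.PrimaryKeyId
where
    t1.StatusId = 1
    and t5.TypeId = 68
-- Ask the optimizer to use nested loops for every join in this query.
option (loop join);
```

Unlike a join hint written in the FROM clause, which also makes SQL Server enforce the join order as written, the OPTION form leaves the join order to the optimizer. Running the hinted and unhinted versions side by side and comparing reads and elapsed time shows whether the hash joins were actually the problem.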
