Optimizing a SQL query by removing the Sort operator from the execution plan

Question

Asked by jodev

I've just started looking into optimizing my queries through indexes because SQL data is growing large and fast. I looked at how the optimizer is processing my query through the Execution plan in SSMS and noticed that a Sort operator is being used. I've heard that a Sort operator indicates a bad design in the query since the sort can be made prematurely through an index. So here is an example table and data similar to what I'm doing:

IF OBJECT_ID('dbo.Store') IS NOT NULL DROP TABLE dbo.[Store]
GO

CREATE TABLE dbo.[Store]
(
    [StoreId] int NOT NULL IDENTITY (1, 1),
    [ParentStoreId] int NULL,
    [Type] int NULL,
    [Phone] char(10) NULL,
    PRIMARY KEY ([StoreId])
)

INSERT INTO dbo.[Store] ([ParentStoreId], [Type], [Phone]) VALUES (10, 0, '2223334444')
INSERT INTO dbo.[Store] ([ParentStoreId], [Type], [Phone]) VALUES (10, 0, '3334445555')
INSERT INTO dbo.[Store] ([ParentStoreId], [Type], [Phone]) VALUES (10, 1, '0001112222')
INSERT INTO dbo.[Store] ([ParentStoreId], [Type], [Phone]) VALUES (10, 1, '1112223333')
GO

Here is an example query:

SELECT [Phone]
FROM [dbo].[Store]
WHERE [ParentStoreId] = 10
AND ([Type] = 0 OR [Type] = 1)
ORDER BY [Phone]

I created a nonclustered index to help speed up the query:

CREATE NONCLUSTERED INDEX IX_Store ON dbo.[Store]([ParentStoreId], [Type], [Phone])

To build the IX_Store index, I start with the simple predicates:

[ParentStoreId] = 10
AND ([Type] = 0 OR [Type] = 1)

Then I add the [Phone] column, both for the ORDER BY and to cover the SELECT output.

But even with the index in place, the optimizer still uses the Sort operator (and not the index order), because [Phone] is sorted AFTER [ParentStoreId] AND [Type]. If I remove the [Type] column from the index and run the query:

SELECT [Phone]
FROM [dbo].[Store]
WHERE [ParentStoreId] = 10
--AND ([Type] = 0 OR [Type] = 1)
ORDER BY [Phone]

Then of course the optimizer does not use the Sort operator, because within a single [ParentStoreId] the index is already ordered by [Phone].

So the question is: how can I create an index that covers the query (including the [Type] predicate) without the optimizer using a Sort?

EDIT:

The table I'm working with has more than 20 million rows.

Answer 1

Accepted answer by meriton

First, you should verify that the sort is actually a performance bottleneck. The duration of the sort depends on the number of rows to be sorted, and the number of stores for a particular parent store is likely to be small. (This assumes the Sort operator is applied after the WHERE clause has filtered the rows.)

"I've heard that a Sort operator indicates a bad design in the query since the sort can be made prematurely through an index"

That's an over-generalization. Often, a Sort operator can trivially be replaced by an index that stores the rows in the required order. If only the first few rows of the result set are fetched, this can substantially reduce query cost: the database no longer has to fetch all matching rows (and sort them all) to find the first ones, but can read the records in result set order and stop once enough records are found.
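
To make the "stop once enough records are found" point concrete, here is a small Python model. It is only a sketch of the idea, not of how SQL Server is implemented: `matching_phones`, `top_n_with_sort`, and `top_n_from_ordered_index` are hypothetical names, and a plain sorted list stands in for an index.

```python
from itertools import islice

# Hypothetical matching rows, in the arbitrary order an unsorted access path might return them.
matching_phones = ["3334445555", "0001112222", "2223334444", "1112223333"]

def top_n_with_sort(phones, n):
    """No usable index order: every matching row is read, then sorted, then cut to n."""
    rows_read = len(phones)
    return sorted(phones)[:n], rows_read

def top_n_from_ordered_index(ordered_phones, n):
    """Rows already arrive in result order: read n rows and stop."""
    first_rows = list(islice(ordered_phones, n))
    return first_rows, len(first_rows)

# Stand-in for an index that already stores the phones in sorted order.
index_order = sorted(matching_phones)

print(top_n_with_sort(matching_phones, 2))              # reads all 4 rows
print(top_n_from_ordered_index(iter(index_order), 2))   # reads only 2 rows
```

Both calls return the same two phone numbers; the difference is how many rows had to be touched, which is the saving that disappears when the whole result set is fetched anyway.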

In your case, you seem to be fetching the entire result set, so sorting it is unlikely to make things much worse (unless the result set is huge). Also, in your case it might not be trivial to build a useful sorted index, because the WHERE clause contains an OR.

Now, if you still want to get rid of that Sort operator, you can try:

SELECT [Phone]
FROM [dbo].[Store]
WHERE [ParentStoreId] = 10
AND [Type] in (0, 1)
ORDER BY [Phone]

Alternatively, you can try the following index:

CREATE NONCLUSTERED INDEX IX_Store ON dbo.[Store]([ParentStoreId], [Phone], [Type])

to try to get the query optimizer to do an index range scan on ParentStoreId only, then scan all matching rows in the index, outputting them if Type matches. However, this is likely to cause more disk I/O, and hence slow your query down rather than speed it up.
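
The difference between the two column orders can be illustrated with a small Python model that uses the sample rows from the question. This is a sketch under a simplifying assumption: an index is modeled as nothing more than the rows sorted by its key columns, and `scan` is a hypothetical helper, not a SQL Server API.

```python
# Rows from the question's sample data: (ParentStoreId, Type, Phone).
rows = [
    (10, 0, "2223334444"),
    (10, 0, "3334445555"),
    (10, 1, "0001112222"),
    (10, 1, "1112223333"),
]

POSITION = {"ParentStoreId": 0, "Type": 1, "Phone": 2}

def scan(index_columns, parent_store_id, types):
    """Model an index as the rows sorted by its key columns, read front to back."""
    ordered = sorted(rows, key=lambda row: tuple(row[POSITION[c]] for c in index_columns))
    return [phone for parent, type_, phone in ordered
            if parent == parent_store_id and type_ in types]

# Original index: Phone is only ordered within each Type value.
by_type_then_phone = scan(("ParentStoreId", "Type", "Phone"), 10, {0, 1})
# Alternative index: Phone is ordered across the whole ParentStoreId range.
by_phone_then_type = scan(("ParentStoreId", "Phone", "Type"), 10, {0, 1})

print(by_type_then_phone)  # ['2223334444', '3334445555', '0001112222', '1112223333']
print(by_phone_then_type)  # ['0001112222', '1112223333', '2223334444', '3334445555']
```

With (ParentStoreId, Type, Phone) the phones come out as two sorted runs, one per Type, so a sort is still needed. With (ParentStoreId, Phone, Type) they come out in Phone order, but every row of the parent store has to be read and filtered on Type.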

Edit: As a last resort, you could use

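-- Query 1: rows with Type = 0, already in Phone order in the index below
SELECT [Phone]
FROM [dbo].[Store]
WHERE [ParentStoreId] = 10
AND [Type] = 0
ORDER BY [Phone]

-- Query 2: rows with Type = 1, run as a separate statement
-- (T-SQL rejects an ORDER BY on the individual branches of a UNION ALL,
-- so the two sorted lists are fetched separately)
SELECT [Phone]
FROM [dbo].[Store]
WHERE [ParentStoreId] = 10
AND [Type] = 1
ORDER BY [Phone]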

with

CREATE NONCLUSTERED INDEX IX_Store ON dbo.[Store]([ParentStoreId], [Type], [Phone])

and combine the two lists on the application server, where you can merge the presorted lists (as in merge sort), thereby avoiding a complete sort. But that's really a micro-optimization. While it speeds up the sort itself by an order of magnitude, it is unlikely to affect the total execution time of the query much: I'd expect the bottleneck to be network and disk I/O, especially since the disk will do a lot of random access because the index is not clustered.
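
As a minimal sketch of that application-side merge, assuming the application happens to be written in Python and each Type has already been fetched as its own sorted list (the names `type0_phones` and `type1_phones` are hypothetical, and the values are the sample rows from the question), the standard library's `heapq.merge` combines presorted inputs without re-sorting them:

```python
import heapq

# Each list is assumed to arrive already sorted by Phone,
# e.g. one list per Type value, read in index order.
type0_phones = ["2223334444", "3334445555"]
type1_phones = ["0001112222", "1112223333"]

# heapq.merge walks the sorted inputs in step and yields one sorted stream.
merged = list(heapq.merge(type0_phones, type1_phones))

print(merged)  # ['0001112222', '1112223333', '2223334444', '3334445555']
```

This only gives the right order if the application's string comparison agrees with the ordering the database used; for fixed-width, digit-only phone numbers like these that is a reasonable assumption, but it is worth checking for other data.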
