提高大型表上的 SQL Server 查询性能

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/10025569/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 15:12:03  来源:igfitidea点击:

Improve SQL Server query performance on large tables

sqlsql-serversql-server-2008

提问by Lee Tickett

I have a relatively large table (currently 2 million records) and would like to know if it's possible to improve performance for ad-hoc queries. The word ad-hocbeing key here. Adding indexs is not an option (there are already indexs on the columns which are queried most commonly).

我有一个相对较大的表(目前有 200 万条记录),想知道是否可以提高临时查询的性能。这个词的ad-hoc正在此间关键。添加索引不是一个选项(最常查询的列上已经有索引)。

Running a simple query to return the 100 most recently updated records:

运行一个简单的查询来返回最近更新的 100 条记录:

select top 100 * from ER101_ACCT_ORDER_DTL order by er101_upd_date_iso desc

Takes several minutes. See execution plan below:

需要几分钟。请参阅下面的执行计划:

enter image description here

enter image description here

Additional detail from the table scan:

表扫描的其他详细信息:

enter image description here

enter image description here

SQL Server Execution Times:
  CPU time = 3945 ms,  elapsed time = 148524 ms.

The server is pretty powerful (from memory 48GB ram, 24 core processor) running sql server 2008 r2 x64.

服务器非常强大(来自内存 48GB ram,24 核处理器)运行 sql server 2008 r2 x64。

Update

更新

I found this code to create a table with 1,000,000 records. I thought i could then run SELECT TOP 100 * FROM testEnvironment ORDER BY mailAddress DESCon a few different servers to find out if my disk access speeds were poor on the server.

我发现这段代码可以创建一个包含 1,000,000 条记录的表。我想我可以SELECT TOP 100 * FROM testEnvironment ORDER BY mailAddress DESC在几台不同的服务器上运行,看看我的磁盘访问速度在服务器上是否很差。

WITH t1(N) AS (SELECT 1 UNION ALL SELECT 1),
t2(N) AS (SELECT 1 FROM t1 x, t1 y),
t3(N) AS (SELECT 1 FROM t2 x, t2 y),
Tally(N) AS (SELECT TOP 98 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM t3 x, t3 y),
Tally2(N) AS (SELECT TOP 5 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM t3 x, t3 y),
Combinations(N) AS (SELECT DISTINCT LTRIM(RTRIM(RTRIM(SUBSTRING(poss,a.N,2)) + SUBSTRING(vowels,b.N,1)))
                    FROM Tally a
                    CROSS JOIN Tally2 b
                    CROSS APPLY (SELECT 'B C D F G H J K L M N P R S T V W Z SCSKKNSNSPSTBLCLFLGLPLSLBRCRDRFRGRPRTRVRSHSMGHCHPHRHWHBWCWSWTW') d(poss)
                    CROSS APPLY (SELECT 'AEIOU') e(vowels))
SELECT IDENTITY(INT,1,1) AS ID, a.N + b.N AS N
INTO #testNames
FROM Combinations a 
CROSS JOIN Combinations b;

SELECT IDENTITY(INT,1,1) AS ID, firstName, secondName
INTO #testNames2
FROM (SELECT firstName, secondName
      FROM (SELECT TOP 1000 --1000 * 1000 = 1,000,000 rows
            N AS firstName
            FROM #testNames
            ORDER BY NEWID()) a
      CROSS JOIN (SELECT TOP 1000 --1000 * 1000 = 1,000,000 rows
                  N AS secondName
                  FROM #testNames
                  ORDER BY NEWID()) b) innerQ;

SELECT firstName, secondName,
firstName + '.' + secondName + '@fake.com' AS eMail,
CAST((ABS(CHECKSUM(NEWID())) % 250) + 1 AS VARCHAR(3)) + ' ' AS mailAddress,
(ABS(CHECKSUM(NEWID())) % 152100) + 1 AS jID,
IDENTITY(INT,1,1) AS ID
INTO #testNames3
FROM #testNames2

SELECT IDENTITY(INT,1,1) AS ID, firstName, secondName, eMail, 
mailAddress + b.N + b.N AS mailAddress
INTO testEnvironment
FROM #testNames3 a
INNER JOIN #testNames b ON a.jID = b.ID;

--CLEAN UP USELESS TABLES
DROP TABLE #testNames;
DROP TABLE #testNames2;
DROP TABLE #testNames3;

But on the three test servers the query ran almost instantaneously. Can anyone explain this?

但是在三个测试服务器上,查询几乎是立即运行的。谁能解释一下?

enter image description here

enter image description here

Update 2

更新 2

Thank you for the comments- please keep them coming... they led me to try changing the primary key index from non-clustered to clustered with rather interesting (and unexpected?) results.

感谢您的评论——请让他们继续发表……他们让我尝试将主键索引从非集群更改为集群,结果相当有趣(和意外?)。

Non-clustered:

非集群:

enter image description here

enter image description here

SQL Server Execution Times:
  CPU time = 3634 ms,  elapsed time = 154179 ms.

Clustered:

集群:

enter image description here

enter image description here

SQL Server Execution Times:
  CPU time = 2650 ms,  elapsed time = 52177 ms.

How is this possible? Without an index on the er101_upd_date_iso column how can a clustered index scan be used?

这怎么可能?如果 er101_upd_date_iso 列上没有索引,如何使用聚集索引扫描?

Update 3

更新 3

As requested- this is the create table script:

根据要求 - 这是创建表脚本:

CREATE TABLE [dbo].[ER101_ACCT_ORDER_DTL](
    [ER101_ORG_CODE] [varchar](2) NOT NULL,
    [ER101_ORD_NBR] [int] NOT NULL,
    [ER101_ORD_LINE] [int] NOT NULL,
    [ER101_EVT_ID] [int] NULL,
    [ER101_FUNC_ID] [int] NULL,
    [ER101_STATUS_CDE] [varchar](2) NULL,
    [ER101_SETUP_ID] [varchar](8) NULL,
    [ER101_DEPT] [varchar](6) NULL,
    [ER101_ORD_TYPE] [varchar](2) NULL,
    [ER101_STATUS] [char](1) NULL,
    [ER101_PRT_STS] [char](1) NULL,
    [ER101_STS_AT_PRT] [char](1) NULL,
    [ER101_CHG_COMMENT] [varchar](255) NULL,
    [ER101_ENT_DATE_ISO] [datetime] NULL,
    [ER101_ENT_USER_ID] [varchar](10) NULL,
    [ER101_UPD_DATE_ISO] [datetime] NULL,
    [ER101_UPD_USER_ID] [varchar](10) NULL,
    [ER101_LIN_NBR] [int] NULL,
    [ER101_PHASE] [char](1) NULL,
    [ER101_RES_CLASS] [char](1) NULL,
    [ER101_NEW_RES_TYPE] [varchar](6) NULL,
    [ER101_RES_CODE] [varchar](12) NULL,
    [ER101_RES_QTY] [numeric](11, 2) NULL,
    [ER101_UNIT_CHRG] [numeric](13, 4) NULL,
    [ER101_UNIT_COST] [numeric](13, 4) NULL,
    [ER101_EXT_COST] [numeric](11, 2) NULL,
    [ER101_EXT_CHRG] [numeric](11, 2) NULL,
    [ER101_UOM] [varchar](3) NULL,
    [ER101_MIN_CHRG] [numeric](11, 2) NULL,
    [ER101_PER_UOM] [varchar](3) NULL,
    [ER101_MAX_CHRG] [numeric](11, 2) NULL,
    [ER101_BILLABLE] [char](1) NULL,
    [ER101_OVERRIDE_FLAG] [char](1) NULL,
    [ER101_RES_TEXT_YN] [char](1) NULL,
    [ER101_DB_CR_FLAG] [char](1) NULL,
    [ER101_INTERNAL] [char](1) NULL,
    [ER101_REF_FIELD] [varchar](255) NULL,
    [ER101_SERIAL_NBR] [varchar](50) NULL,
    [ER101_RES_PER_UNITS] [int] NULL,
    [ER101_SETUP_BILLABLE] [char](1) NULL,
    [ER101_START_DATE_ISO] [datetime] NULL,
    [ER101_END_DATE_ISO] [datetime] NULL,
    [ER101_START_TIME_ISO] [datetime] NULL,
    [ER101_END_TIME_ISO] [datetime] NULL,
    [ER101_COMPL_STS] [char](1) NULL,
    [ER101_CANCEL_DATE_ISO] [datetime] NULL,
    [ER101_BLOCK_CODE] [varchar](6) NULL,
    [ER101_PROP_CODE] [varchar](8) NULL,
    [ER101_RM_TYPE] [varchar](12) NULL,
    [ER101_WO_COMPL_DATE] [datetime] NULL,
    [ER101_WO_BATCH_ID] [varchar](10) NULL,
    [ER101_WO_SCHED_DATE_ISO] [datetime] NULL,
    [ER101_GL_REF_TRANS] [char](1) NULL,
    [ER101_GL_COS_TRANS] [char](1) NULL,
    [ER101_INVOICE_NBR] [int] NULL,
    [ER101_RES_CLOSED] [char](1) NULL,
    [ER101_LEAD_DAYS] [int] NULL,
    [ER101_LEAD_HHMM] [int] NULL,
    [ER101_STRIKE_DAYS] [int] NULL,
    [ER101_STRIKE_HHMM] [int] NULL,
    [ER101_LEAD_FLAG] [char](1) NULL,
    [ER101_STRIKE_FLAG] [char](1) NULL,
    [ER101_RANGE_FLAG] [char](1) NULL,
    [ER101_REQ_LEAD_STDATE] [datetime] NULL,
    [ER101_REQ_LEAD_ENDATE] [datetime] NULL,
    [ER101_REQ_STRK_STDATE] [datetime] NULL,
    [ER101_REQ_STRK_ENDATE] [datetime] NULL,
    [ER101_LEAD_STDATE] [datetime] NULL,
    [ER101_LEAD_ENDATE] [datetime] NULL,
    [ER101_STRK_STDATE] [datetime] NULL,
    [ER101_STRK_ENDATE] [datetime] NULL,
    [ER101_DEL_MARK] [char](1) NULL,
    [ER101_USER_FLD1_02X] [varchar](2) NULL,
    [ER101_USER_FLD1_04X] [varchar](4) NULL,
    [ER101_USER_FLD1_06X] [varchar](6) NULL,
    [ER101_USER_NBR_060P] [int] NULL,
    [ER101_USER_NBR_092P] [numeric](9, 2) NULL,
    [ER101_PR_LIST_DTL] [numeric](11, 2) NULL,
    [ER101_EXT_ACCT_CODE] [varchar](8) NULL,
    [ER101_AO_STS_1] [char](1) NULL,
    [ER101_PLAN_PHASE] [char](1) NULL,
    [ER101_PLAN_SEQ] [int] NULL,
    [ER101_ACT_PHASE] [char](1) NULL,
    [ER101_ACT_SEQ] [int] NULL,
    [ER101_REV_PHASE] [char](1) NULL,
    [ER101_REV_SEQ] [int] NULL,
    [ER101_FORE_PHASE] [char](1) NULL,
    [ER101_FORE_SEQ] [int] NULL,
    [ER101_EXTRA1_PHASE] [char](1) NULL,
    [ER101_EXTRA1_SEQ] [int] NULL,
    [ER101_EXTRA2_PHASE] [char](1) NULL,
    [ER101_EXTRA2_SEQ] [int] NULL,
    [ER101_SETUP_MSTR_SEQ] [int] NULL,
    [ER101_SETUP_ALTERED] [char](1) NULL,
    [ER101_RES_LOCKED] [char](1) NULL,
    [ER101_PRICE_LIST] [varchar](10) NULL,
    [ER101_SO_SEARCH] [varchar](9) NULL,
    [ER101_SSB_NBR] [int] NULL,
    [ER101_MIN_QTY] [numeric](11, 2) NULL,
    [ER101_MAX_QTY] [numeric](11, 2) NULL,
    [ER101_START_SIGN] [char](1) NULL,
    [ER101_END_SIGN] [char](1) NULL,
    [ER101_START_DAYS] [int] NULL,
    [ER101_END_DAYS] [int] NULL,
    [ER101_TEMPLATE] [char](1) NULL,
    [ER101_TIME_OFFSET] [char](1) NULL,
    [ER101_ASSIGN_CODE] [varchar](10) NULL,
    [ER101_FC_UNIT_CHRG] [numeric](13, 4) NULL,
    [ER101_FC_EXT_CHRG] [numeric](11, 2) NULL,
    [ER101_CURRENCY] [varchar](3) NULL,
    [ER101_FC_RATE] [numeric](12, 5) NULL,
    [ER101_FC_DATE] [datetime] NULL,
    [ER101_FC_MIN_CHRG] [numeric](11, 2) NULL,
    [ER101_FC_MAX_CHRG] [numeric](11, 2) NULL,
    [ER101_FC_FOREIGN] [numeric](12, 5) NULL,
    [ER101_STAT_ORD_NBR] [int] NULL,
    [ER101_STAT_ORD_LINE] [int] NULL,
    [ER101_DESC] [varchar](255) NULL
) ON [PRIMARY]
SET ANSI_PADDING OFF
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PRT_SEQ_1] [varchar](12) NULL
SET ANSI_PADDING ON
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PRT_SEQ_2] [varchar](120) NULL
SET ANSI_PADDING OFF
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TAX_BASIS] [char](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_RES_CATEGORY] [char](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_DECIMALS] [char](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TAX_SEQ] [varchar](7) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MANUAL] [char](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TR_LC_RATE] [numeric](12, 5) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TR_FC_RATE] [numeric](12, 5) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TR_PL_RATE] [numeric](12, 5) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TR_DIFF] [char](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TR_UNIT_CHRG] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TR_EXT_CHRG] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TR_MIN_CHRG] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TR_MAX_CHRG] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PL_UNIT_CHRG] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PL_EXT_CHRG] [numeric](13, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PL_MIN_CHRG] [numeric](13, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PL_MAX_CHRG] [numeric](13, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TAX_RATE_TYPE] [char](1) NULL
SET ANSI_PADDING ON
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ORDER_FORM] [varchar](2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_FACTOR] [int] NULL
SET ANSI_PADDING OFF
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MGMT_RPT_CODE] [varchar](6) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ROUND_CHRG] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_WHOLE_QTY] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SET_QTY] [numeric](15, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SET_UNITS] [numeric](15, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SET_ROUNDING] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SET_SUB] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TIME_QTY] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_GL_DISTR_PCT] [numeric](7, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_REG_SEQ] [int] NULL
SET ANSI_PADDING ON
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ALT_DESC] [varchar](255) NULL
SET ANSI_PADDING OFF
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_REG_ACCT] [varchar](8) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_DAILY] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_AVG_UNIT_CHRG] [varchar](1) NULL
SET ANSI_PADDING ON
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ALT_DESC2] [varchar](255) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_CONTRACT_SEQ] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ORIG_RATE] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_DISC_PCT] [decimal](17, 10) NULL
SET ANSI_PADDING OFF
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_DTL_EXIST] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ORDERED_ONLY] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SHOW_STDATE] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SHOW_STTIME] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SHOW_ENDATE] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SHOW_ENTIME] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SHOW_RATE] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SHOW_UNITS] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_BASE_RATE] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_COMMIT_QTY] [numeric](11, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MM_QTY_USED] [varchar](2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MM_CHRG_USED] [varchar](2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_TEXT_1] [varchar](50) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_NBR_1] [numeric](13, 3) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_NBR_2] [numeric](13, 3) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_NBR_3] [numeric](13, 3) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PL_BASE_RATE] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_REV_DIST] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_COVER] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_RATE_TYPE] [varchar](2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_USE_SEASONAL] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TAX_EI] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TAXES] [numeric](13, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_FC_TAXES] [numeric](13, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PL_TAXES] [numeric](13, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_FC_QTY] [numeric](13, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_LEAD_HRS] [numeric](6, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_STRIKE_HRS] [numeric](6, 2) NULL
SET ANSI_PADDING ON
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_CANCEL_USER_ID] [varchar](10) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ST_OFFSET_HRS] [numeric](7, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_EN_OFFSET_HRS] [numeric](7, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MEMO_FLAG] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MEMO_EXT_CHRG] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MEMO_EXT_CHRG_PL] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MEMO_EXT_CHRG_TR] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MEMO_EXT_CHRG_FC] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TIME_QTY_EDIT] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SURCHARGE_PCT] [decimal](17, 10) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_INCL_EXT_CHRG] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_INCL_EXT_CHRG_FC] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_CARRIER] [varchar](6) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SETUP_ID2] [varchar](8) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SHIPPABLE] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_CHARGEABLE] [varchar](2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_NBR_ALLOW] [varchar](2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_NBR_START] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_NBR_END] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_SUPPLIER] [varchar](8) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_TRACK_ID] [varchar](40) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_REF_INV_NBR] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_NEW_ITEM_STS] [varchar](2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MSTR_REG_ACCT_CODE] [varchar](8) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ALT_DESC3] [varchar](255) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ALT_DESC4] [varchar](255) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ALT_DESC5] [varchar](255) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SETUP_ROLLUP] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MM_COST_USED] [varchar](2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_AUTO_SHIP_RCD] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_FIXED] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ITEM_EST_TBD] [varchar](3) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ROLLUP_PL_UNIT_CHRG] [numeric](13, 4) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ROLLUP_PL_EXT_CHRG] [numeric](13, 2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_GL_ORD_REV_TRANS] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_DISCOUNT_FLAG] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SETUP_RES_TYPE] [varchar](6) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SETUP_RES_CODE] [varchar](12) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PERS_SCHED_FLAG] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PRINT_STAMP] [datetime] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SHOW_EXT_CHRG] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PRINT_SEQ_NBR] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_PAY_LOCATION] [varchar](3) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_MAX_RM_NIGHTS] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_USE_TIER_COST] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_UNITS_SCHEME_CODE] [varchar](6) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_ROUND_TIME] [varchar](2) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_LEVEL] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_SETUP_PARENT_ORD_LINE] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_BADGE_PRT_STS] [varchar](1) NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_EVT_PROMO_SEQ] [int] NULL
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD [ER101_REG_TYPE] [varchar](12) NULL
/****** Object:  Index [PK__ER101_ACCT_ORDER]    Script Date: 04/15/2012 20:24:37 ******/
ALTER TABLE [dbo].[ER101_ACCT_ORDER_DTL] ADD  CONSTRAINT [PK__ER101_ACCT_ORDER] PRIMARY KEY CLUSTERED 
(
    [ER101_ORD_NBR] ASC,
    [ER101_ORD_LINE] ASC,
    [ER101_ORG_CODE] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 50) ON [PRIMARY]

The table is 2.8 GB in size with index size standing at 3.9 GB.

该表的大小为 2.8 GB,索引大小为 3.9 GB。

回答by Solomon Rutzky

Simple Answer: NO. You cannot help ad hoc queries on a 238 column table with a 50% Fill Factor on the Clustered Index.

简单的回答:不。您无法帮助对聚集索引的填充因子为 50% 的 238 列表进行临时查询。

Detailed Answer:

详细解答:

As I have stated in other answers on this topic, Index design is both Art and Science and there are so many factors to consider that there are few, if any, hard and fast rules. You need to consider: the volume of DML operations vs SELECTs, disk subsystem, other indexes / triggers on the table, distribution of data within the table, are queries using SARGable WHERE conditions, and several other things that I can't even remember right now.

正如我在关于此主题的其他答案中所述,索引设计既是艺术又是科学,需要考虑的因素太多,几乎没有硬性规则。您需要考虑:DML 操作与 SELECT 的数量、磁盘子系统、表上的其他索引/触发器、表内数据的分布、使用 SARGable WHERE 条件的查询,以及其他一些我什至记不起来的事情现在。

I can say that no help can be given for questions on this topic without an understanding of the Table itself, its indexes, triggers, etc. Now that you have posted the table definition (still waiting on the Indexes but the Table definition alone points to 99% of the issue) I can offer some suggestions.

我可以说,如果不了解表本身、其索引、触发器等,就无法对有关此主题的问题提供任何帮助。 现在您已经发布了表定义(仍在等待索引,但表定义仅指向99% 的问题)我可以提供一些建议。

First, if the table definition is accurate (238 columns, 50% Fill Factor) then you can pretty much ignore the rest of the answers / advice here ;-). Sorry to be less-than-political here, but seriously, it's a wild goose chase without knowing the specifics. And now that we see the table definition it becomes quite a bit clearer as to why a simple query would take so long, even when the test queries (Update #1) ran so quickly.

首先,如果表定义是准确的(238 列,50% 填充因子),那么您几乎可以忽略此处的其余答案/建议 ;-)。很抱歉在这里不那么化,但说真的,这是一场不知道具体细节的疯狂追逐。现在我们看到了表定义,即使测试查询(更新 #1)运行得如此之快,为什么一个简单的查询会花费这么长时间就变得非常清楚了。

The main problem here (and in many poor-performance situations) is bad data modeling. 238 columns is not prohibited just like having 999 indexes is not prohibited, but it is also generally not very wise.

这里(以及在许多性能不佳的情况下)的主要问题是糟糕的数据建模。238 列不是禁止的,就像有 999 个索引不被禁止一样,但通常也不是很明智。

Recommendations:

建议:

  1. First, this table really needs to be remodeled. If this is a data warehouse table then maybe, but if not then these fields really need to be broken up into several tables which can all have the same PK. You would have a master record table and the child tables are just dependent info based on commonly associated attributes and the PK of those tables is the same as the PK of the master table and hence also FK to the master table. There will be a 1-to-1 relationship between master and all child tables.
  2. The use of ANSI_PADDING OFFis disturbing, not to mention inconsistent within the table due to the various column additions over time. Not sure if you can fix that now, but ideally you would always have ANSI_PADDING ON, or at the very least have the same setting across all ALTER TABLEstatements.
  3. Consider creating 2 additional File Groups: Tables and Indexes. It is best not to put your stuff in PRIMARYas that is where SQL SERVER stores all of its data and meta-data about your objects. You create your Table and Clustered Index (as that is the data for the table) on [Tables]and all Non-Clustered indexes on [Indexes]
  4. Increase the Fill Factor from 50%. This low number is likely why your index space is larger than your data space. Doing an Index Rebuild will recreate the data pages with a max of 4k (out of the total 8k page size) used for your data so your table is spread out over a wide area.
  5. If most or all queries have "ER101_ORG_CODE" in the WHEREcondition, then consider moving that to the leading column of the clustered index. Assuming that it is used more often than "ER101_ORD_NBR". If "ER101_ORD_NBR" is used more often then keep it. It just seems, assuming that the field names mean "OrganizationCode" and "OrderNumber", that "OrgCode" is a better grouping that might have multiple "OrderNumbers" within it.
  6. Minor point, but if "ER101_ORG_CODE" is always 2 characters, then use CHAR(2)instead of VARCHAR(2)as it will save a byte in the row header which tracks variable width sizes and adds up over millions of rows.
  7. As others here have mentioned, using SELECT *will hurt performance. Not only due to it requiring SQL Server to return all columns and hence be more likely to do a Clustered Index Scan regardless of your other indexes, but it also takes SQL Server time to go to the table definition and translate *into all of the column names. It should be slightlyfaster to specify all 238 column names in the SELECTlist though that won't help the Scan issue. But do you ever really need all 238 columns at the same time anyway?
  1. 首先,这张桌子真的需要改造。如果这是一个数据仓库表,那么也许可以,但如果不是,那么这些字段真的需要分解成几个表,这些表都可以具有相同的 PK。您将拥有一个主记录表,而子表只是基于共同关联属性的相关信息,这些表的 PK 与主表的 PK 相同,因此也与主表的 FK 相同。主表和所有子表之间将存在一对一的关系。
  2. 的使用ANSI_PADDING OFF令人不安,更不用说由于随着时间的推移添加了各种列而导致表内不一致。不确定你现在是否可以解决这个问题,但理想情况下你总是有ANSI_PADDING ON,或者至少在所有ALTER TABLE语句中都有相同的设置。
  3. 考虑创建 2 个额外的文件组:表和索引。最好不要把你的东西放在PRIMARY那里,因为那是 SQL SERVER 存储关于你的对象的所有数据和元数据的地方。您创建表和聚集索引(因为这是表的数据)[Tables]以及所有非聚集索引[Indexes]
  4. 将填充因子从 50% 增加。这个低数字很可能是您的索引空间大于数据空间的原因。执行索引重建将重新创建最多 4k(总页面大小为 8k 页面大小)的数据页面用于您的数据,因此您的表分布在一个广阔的区域。
  5. 如果大多数或所有查询在WHERE条件中都有“ER101_ORG_CODE” ,则考虑将其移动到聚集索引的前导列。假设它比“ER101_ORD_NBR”更常用。如果更频繁地使用“ER101_ORD_NBR”,则保留它。看起来,假设字段名称的意思是“OrganizationCode”和“OrderNumber”,那么“OrgCode”是一个更好的分组,其中可能有多个“OrderNumbers”。
  6. 次要问题,但如果“ER101_ORG_CODE”始终为 2 个字符,则使用CHAR(2)而不是VARCHAR(2)因为它将在行标题中保存一个字节,该字节跟踪可变宽度大小并加起来超过数百万行。
  7. 正如这里的其他人所提到的,使用SELECT *会损害性能。不仅因为它需要 SQL Server 返回所有列,因此更有可能进行聚集索引扫描而不考虑其他索引,而且还需要 SQL Server 花费时间转到表定义并转换*为所有列名. 指定列表中的所有 238 个列名应该会稍微快一点,SELECT尽管这无助于扫描问题。但是您真的需要同时使用所有 238 列吗?

Good luck!

祝你好运!

UPDATE
For the sake of completeness to the question "how to improve performance on a large table for ad-hoc queries", it should be noted that while it will not help for this specific case, IF someone is using SQL Server 2012 (or newer when that time comes) and IF the table is not being updated, then using Columnstore Indexes is an option. For more details on that new feature, look here: http://msdn.microsoft.com/en-us/library/gg492088.aspx(I believe these were made to be updateable starting in SQL Server 2014).

更新
为了完整地解决“如何提高临时查询的大表的性能”这个问题,应该注意的是,虽然它对这种特定情况没有帮助,但如果有人使用 SQL Server 2012(或更新的当那个时间到来时)并且如果表没有被更新,那么使用列存储索引是一种选择。有关该新功能的更多详细信息,请查看此处:http: //msdn.microsoft.com/en-us/library/gg492088.aspx(我相信这些是从 SQL Server 2014 开始可更新的)。

UPDATE 2
Additional considerations are:

更新 2
其他注意事项是:

  • Enable compression on the Clustered Index. This option became available in SQL Server 2008, but as an Enterprise Edition-only feature. However, as of SQL Server 2016 SP1, Data Compression was made available in all editions! Please see the MSDN page for Data Compressionfor details on Row and Page Compression.
  • If you cannot use Data Compression, or if it won't provide much benefit for a particular table, then IF you have a column of a fixed-length type (INT, BIGINT, TINYINT, SMALLINT, CHAR, NCHAR, BINARY, DATETIME, SMALLDATETIME, MONEY, etc) and well over 50% of the rows are NULL, then consider enabling the SPARSEoption which became available in SQL Server 2008. Please see the MSDN page for Use Sparse Columnsfor details.
  • 对聚集索引启用压缩。此选项在 SQL Server 2008 中可用,但仅作为企业版功能。但是,从 SQL Server 2016 SP1 开始所有版本都提供数据压缩!有关行和页面压缩的详细信息,请参阅数据压缩的 MSDN 页面。
  • 如果不能使用数据压缩,或者如果它不特定表提供多少好处,然后如果有一个固定长度的类型的一列(INTBIGINTTINYINTSMALLINTCHARNCHARBINARYDATETIMESMALLDATETIMEMONEY,等等)和超过50 % 的行是NULL,然后考虑启用SPARSE在 SQL Server 2008 中可用的选项。有关详细信息,请参阅使用稀疏列的 MSDN 页面。

回答by Grzegorz Gierlik

There are a few issues with this query (and this apply to every query).

此查询存在一些问题(这适用于每个查询)。

Lack of index

缺乏索引

Lack of index on er101_upd_date_isocolumn is most important thing as Odedhas already mentioned.

er101_upd_date_iso正如Oded已经提到的那样,缺少列索引是最重要的事情。

Without matching index (which lack of could cause table scan) there is no chance to run fast queries on big tables.

如果没有匹配索引(缺少索引可能导致表扫描),就没有机会在大表上运行快速查询。

If you cannot add indexes (for various reasons including there is no point in creating index for just one ad-hoc query) I would suggest a few workarounds (which can be used for ad-hoc queries):

如果您无法添加索引(出于各种原因,包括为一个临时查询创建索引没有意义),我会建议一些解决方法(可用于临时查询):

1. Use temporary tables

1.使用临时表

Create temporary table on subset (rows and columns) of data you are interested in. Temporary table should be much smaller that original source table, can be indexed easily (if needed) and can cachedsubset of data which you are interested in.

在您感兴趣的数据子集(行和列)上创建临时表。临时表应该比原始源表小得多,可以轻松索引(如果需要)并且可以缓存您感兴趣的数据子集。

To create temporary table you can use code (not tested) like:

要创建临时表,您可以使用如下代码(未测试):

-- copy records from last month to temporary table
INSERT INTO
   #my_temporary_table
SELECT
    *
FROM
    er101_acct_order_dtl WITH (NOLOCK)
WHERE 
    er101_upd_date_iso > DATEADD(month, -1, GETDATE())

-- you can add any index you need on temp table
CREATE INDEX idx_er101_upd_date_iso ON #my_temporary_table(er101_upd_date_iso)

-- run other queries on temporary table (which can be indexed)
SELECT TOP 100
    * 
FROM 
    #my_temporary_table 
ORDER BY 
    er101_upd_date_iso DESC

Pros:

优点:

  • Easy to do for any subset of data.
  • Easy to manage -- it's temporaryand it's table.
  • Doesn't affect overall system performance like view.
  • Temporary table can be indexed.
  • You don't have to care about it -- it's temporary :).
  • 易于处理任何数据子集。
  • 易于管理——它是临时的,它是
  • 不会像view.
  • 临时表可以被索引。
  • 你不必关心它——这是暂时的:)。

Cons:

缺点:

  • It's snapshot of data -- but probably this is good enough for most ad-hoc queries.
  • 它是数据的快照——但这对于大多数临时查询来说可能已经足够了。

2. Common table expression -- CTE

2.公用表表达式——CTE

Personally I use CTEa lot with ad-hoc queries -- it's help a lot with building (and testing) a query piece by piece.

就我个人而言,我经常CTE用于临时查询——它对逐个构建(和测试)查询有很大帮助。

See example below (the query starting with WITH).

请参见下面的示例(以 开头的查询WITH)。

Pros:

优点:

  • Easy to build starting from big viewand then selecting and filtering what really you need.
  • Easy to test.
  • 易于从大视图开始构建,然后选择和过滤您真正需要的内容。
  • 易于测试。

Cons:

缺点:

  • Some people dislike CDE -- CDE queries seem to be long and difficult to understand.
  • 有些人不喜欢 CDE——CDE 查询似乎很长而且难以理解。

3. Create views

3. 创建视图

Similar to above, but create views instead of temporary tables (if you play often with the same queries and you have MS SQL version which supports indexed views.

与上面类似,但创建视图而不是临时表(如果您经常使用相同的查询并且您拥有支持索引视图的 MS SQL 版本。

You can create views or indexed viewson subset of data you are interested in and run queries on view -- which should contain only interesting subset of data much smaller than the whole table.

您可以在您感兴趣的数据子集上创建视图或索引视图,并在视图上运行查询——它应该只包含比整个表小得多的有趣数据子集。

Pros:

优点:

  • Easy to do.
  • It's up to date with source data.
  • 很容易做到。
  • 它是最新的源数据。

Cons:

缺点:

  • Possible only for defined subset of data.
  • Could be inefficient for large tables with high rate of updates.
  • Not so easy to manage.
  • Can affect overall system performance.
  • I am not sure indexed views are available in every version of MS SQL.
  • 仅适用于已定义的数据子集。
  • 对于更新率高的大表可能效率低下。
  • 没那么容易管理。
  • 会影响整体系统性能。
  • 我不确定每个版本的 MS SQL 都提供索引视图。

Selecting all columns

选择所有列

Running star query(SELECT * FROM) on big table is not good thing...

在大表上运行star query( SELECT * FROM) 可不是什么好事...

If you have large columns (like long strings) it takes a lot of time to read them from disk and pass by network.

如果您有大列(如长字符串),则需要花费大量时间从磁盘读取它们并通过网络传递。

I would try to replace *with column names which you really need.

我会尝试*用您真正需要的列名替换。

Or, if you need all columns try to rewrite query to something like (using common data expression):

或者,如果您需要所有列,请尝试将查询重写为(使用通用数据表达式):

;WITH recs AS (
    SELECT TOP 100 
        id as rec_id -- select primary key only
    FROM 
        er101_acct_order_dtl 
    ORDER BY 
        er101_upd_date_iso DESC
)
SELECT
    er101_acct_order_dtl.*
FROM
    recs
    JOIN
      er101_acct_order_dtl
    ON
      er101_acct_order_dtl.id = recs.rec_id
ORDER BY 
    er101_upd_date_iso DESC 

Dirty reads

脏读

Last thing which could speed up the ad-hoc query is allowing dirty readswith table hint WITH (NOLOCK).

可以加速临时查询的最后一件事是允许使用表提示进行脏读WITH (NOLOCK)

Instead of hint you can set transaction isolation levelto read uncommited:

您可以将事务隔离级别设置为未提交读取,而不是提示:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

or set proper SQL Management Studio setting.

或设置适当的 SQL Management Studio 设置。

I assume for ad-hoc queries dirty readsis good enough.

我认为对于临时查询脏读就足够了。

回答by Oded

You are getting a table scanthere, meaning that you do not have an indexdefined on er101_upd_date_iso, or if that column is part of an existing index, the index can't be used (possibly it is not the primary indexer column).

您正在那里进行表扫描,这意味着您没有在 上定义索引er101_upd_date_iso,或者如果该列是现有索引的一部分,则无法使用该索引(可能它不是主索引器列)。

Adding missing indexes will help performance no end.

添加缺失的索引将有助于提高性能。

there are already indexs on the columns which are queried most commonly

最常查询的列上已经有索引

That does not mean they are used in this query (and they probably are not).

这并不意味着在这个查询中使用了它们(它们可能没有)。

I suggest reading Finding the Causes of Poor Performance in SQL Server by Gail Shaw, part 1and part 2.

我建议阅读 Gail Shaw 的《在 SQL Server 中查找性能不佳的原因》,第 1部分第 2 部分

回答by Phil

The question specifically states the performance needs to be improved for ad-hocqueries, and that indexes can't be added. So taking that at face value, what can be done to improve performance on any table?

该问题明确指出需要改进临时查询的性能,并且不能添加索引。那么从表面上看,可以做些什么来提高任何桌子的性能?

Since we're considering ad-hoc queries, the WHERE clause and the ORDER BY clause can contain any combination of columns. This means that almost regardless of what indexes are placed on the table there will be some queries that require a table scan, as seen above in query plan of a poorly performing query.

由于我们正在考虑即席查询,因此 WHERE 子句和 ORDER BY 子句可以包含任何列组合。这意味着几乎不管在表上放置什么索引,都会有一些查询需要表扫描,如上面性能不佳查询的查询计划所示。

Taking this into account, let's assume there are no indexes at all on the table apart from a clustered index on the primary key. Now let's consider what options we have to maximize performance.

考虑到这一点,我们假设除了主键上的聚集索引外,表上根本没有索引。现在让我们考虑我们有哪些选项可以最大限度地提高性能。

  • Defragment the table

    As long as we have a clustered index then we can defragment the table using DBCC INDEXDEFRAG(deprecated) or preferably ALTER INDEX. This will minimize the number of disk reads required to scan the table and will improve speed.

  • Use the fastest disks possible. You don't say what disks you're using but if you can use SSDs.

  • Optimize tempdb. Put tempdb on the fastest disks possible, again SSDs. See this SO Articleand this RedGate article.

  • As stated in other answers, using a more selective query will return less data, and should be therefore be faster.

  • 对表进行碎片整理

    只要我们有聚集索引,我们就可以使用DBCC INDEXDEFRAG(不推荐使用)或最好使用ALTER INDEX对表进行碎片整理。这将最大限度地减少扫描表所需的磁盘读取次数并提高速度。

  • 尽可能使用最快的磁盘。您不会说您使用的是什么磁盘,而是说您是否可以使用 SSD。

  • 优化临时数据库。将 tempdb 放在尽可能快的磁盘上,同样是 SSD。请参阅此SO 文章和此RedGate 文章

  • 正如其他答案中所述,使用更具选择性的查询将返回更少的数据,因此应该更快。

Now let's consider what we can do if we are allowed to add indexes.

现在让我们考虑如果允许添加索引我们可以做什么。

If we weren'ttalking about ad-hoc queries, then we would add indexes specifically for the limited set of queries being run against the table. Since we are discussing ad-hocqueries, what can be done to improve speed mostof the time?

如果我们不是在谈论即席查询,那么我们将专门为针对表运行的有限查询集添加索引。既然我们正在讨论即席查询,那么大多数时候可以做些什么来提高速度?

  • Add a single column index to each column. This should give SQL Server at least something to work with to improve the speed for the majority of queries, but won't be optimal.
  • Add specific indexes for the most common queries so they are optimized.
  • Add additional specific indexes as required by monitoring for poorly performing queries.
  • 为每列添加一个单列索引。这应该至少为 SQL Server 提供一些可以提高大多数查询速度的东西,但不会是最佳的。
  • 为最常见的查询添加特定索引,以便对其进行优化。
  • 通过监视性能不佳的查询,根据需要添加其他特定索引。

Edit

编辑

I've run some tests on a 'large' table of 22 million rows. My table only has six columns but does contain 4GB of data. My machine is a respectable desktop with 8Gb RAM and a quad core CPU and has a single Agility 3 SSD.

我在一个包含 2200 万行的“大”表上运行了一些测试。我的表只有六列,但确实包含 4GB 的数据。我的机器是具有 8Gb RAM 和四核 CPU 的受人尊敬的台式机,并且有一个 Agility 3 SSD。

I removed all indexes apart from the primary key on the Id column.

除了 Id 列上的主键之外,我删除了所有索引。

A similar query to the problem one given in the question takes 5 seconds if SQL server is restarted first and 3 seconds subsequently. The database tuning advisor obviously recommends adding an index to improve this query, with an estimated improvement of > 99%. Adding an index results in a query time of effectively zero.

如果首先重新启动 SQL 服务器,然后重新启动 3 秒,则与问题中给出的问题类似的查询需要 5 秒。数据库调优顾问显然建议添加一个索引来改进这个查询,估计改进超过 99%。添加索引会导致查询时间实际上为零。

What's also interesting is that my query plan is identical to yours (with the clustered index scan), but the index scan accounts for 9% of the query cost and the sort the remaining 91%. I can only assume your table contains an enormous amount of data and/or your disks are very slow or located over a very slow network connection.

同样有趣的是,我的查询计划与您的相同(使用聚集索引扫描),但索引扫描占查询成本的 9%,而排序占剩余的 91%。我只能假设您的表包含大量数据和/或您的磁盘速度非常慢或位于非常慢的网络连接上。

回答by foamdino

Even if you have indexes on some columns that are used in some queries, the fact that your 'ad-hoc' query causes a table scan shows that you don't have sufficient indexes to allow this query to complete efficiently.

即使您在某些查询中使用的某些列上有索引,您的“临时”查询导致表扫描这一事实表明您没有足够的索引来允许此查询有效完成。

For date ranges in particular it is difficult to add good indexes.

特别是对于日期范围,很难添加好的索引。

Just looking at your query, the db has to sort all the records by the selected column to be able to return the first n records.

仅查看您的查询,数据库必须按所选列对所有记录进行排序才能返回前 n 条记录。

Does the db also do a full table scan without the order by clause? Does the table have a primary key - without a PK, the db will have to work harder to perform the sort?

db 是否也会在没有 order by 子句的情况下进行全表扫描?表是否有主键 - 没有 PK,数据库将不得不更加努力地执行排序?

回答by shankar_pratap

How is this possible? Without an index on the er101_upd_date_iso column how can a clustered index scan be used?

这怎么可能?如果 er101_upd_date_iso 列上没有索引,如何使用聚集索引扫描?

An index is a B-Tree where each leaf node is pointing to a 'bunch of rows'(called a 'Page' in SQL internal terminology), That is when the index is a non-clustered index.

索引是一个 B 树,其中每个叶节点都指向一组“行”(在 SQL 内部术语中称为“页面”),即索引是非聚集索引。

Clustered index is a special case, in which the leaf nodes has the 'bunch of rows' (rather than pointing to them). that is why...

聚集索引是一种特殊情况,其中叶节点具有“一堆行”(而不是指向它们)。这就是为什么...

1) There can be only one clustered index on the table.

1) 表上只能有一个聚集索引。

this also means the whole table is stored as the clustered index, that is why you started seeing index scan rather than a table scan.

这也意味着整个表被存储为聚集索引,这就是为什么你开始看到索引扫描而不是表扫描。

2) An operation that utilizes clustered index is generally faster than a non-clustered index

2) 使用聚集索引的操作通常比非聚集索引快

Read more at http://msdn.microsoft.com/en-us/library/ms177443.aspx

http://msdn.microsoft.com/en-us/library/ms177443.aspx阅读更多信息

For the problem you have, you should really consider adding this column to a index, as you said adding a new index (or a column to an existing index) increases INSERT/UPDATE costs. But it might be possible to remove some underutilized index (or a column from an existing index) to replace with 'er101_upd_date_iso'.

对于您遇到的问题,您应该真正考虑将此列添加到索引中,正如您所说,添加新索引(或添加到现有索引的列)会增加 INSERT/UPDATE 成本。但是可能可以删除一些未充分利用的索引(或现有索引中的列)以替换为“er101_upd_date_iso”。

If index changes are not possible, i recommend adding a statistics on the column, it can fasten things up when the columns have some correlation with indexed columns

如果无法更改索引,我建议在列上添加统计信息,当列与索引列有一些相关性时,它可以加快速度

http://msdn.microsoft.com/en-us/library/ms188038.aspx

http://msdn.microsoft.com/en-us/library/ms188038.aspx

BTW, You will get much more help if you can post the table schema of ER101_ACCT_ORDER_DTL. and the existing indices too..., probably the query could be re-written to use some of them.

顺便说一句,如果您可以发布 ER101_ACCT_ORDER_DTL 的表模式,您将获得更多帮助。和现有的索引也是......,可能可以重新编写查询以使用其中的一些。

回答by ninghad

One of the reasons your 1M test ran quicker is likely because the temp tables are entirely in memory and would only go to disk if your server experiences memory pressure. You can either re-craft your query to remove the order by, add a good clustered index and covering index(es) as previously mentioned, or query the DMV to check for IO pressure to see if hardware related.

您的 1M 测试运行得更快的原因之一可能是因为临时表完全在内存中,并且只有在您的服务器遇到内存压力时才会转到磁盘。您可以重新编写查询以删除订单,添加一个良好的聚集索引和覆盖索引,如前所述,或者查询 DMV 以检查 IO 压力以查看硬件是否相关。

-- From Glen Barry
-- Clear Wait Stats (consider clearing and running wait stats query again after a few minutes)
-- DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

-- Check Task Counts to get an initial idea what the problem might be

-- Avg Current Tasks Count, Avg Runnable Tasks Count, Avg Pending Disk IO Count across all schedulers
-- Run several times in quick succession
SELECT AVG(current_tasks_count) AS [Avg Task Count], 
       AVG(runnable_tasks_count) AS [Avg Runnable Task Count],
       AVG(pending_disk_io_count) AS [Avg Pending DiskIO Count]
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE scheduler_id < 255 OPTION (RECOMPILE);

-- Sustained values above 10 suggest further investigation in that area
-- High current_tasks_count is often an indication of locking/blocking problems
-- High runnable_tasks_count is a good indication of CPU pressure
-- High pending_disk_io_count is an indication of I/O pressure

回答by Diego

I know that you said that adding indexes is not an option but that would be the only option to eliminate the table scan you have. When you do a scan, SQL Server reads all 2 million rows on the table to fulfill your query.

我知道您说过添加索引不是一种选择,但这将是消除您所拥有的表扫描的唯一选择。当您进行扫描时,SQL Server 会读取表中的所有 200 万行来完成您的查询。

thisarticle provides more info but remember: Seek = good, Scan = bad.

这篇文章提供了更多信息,但请记住:寻求 = 好,扫描 = 坏。

Second, can't you eliminate the select * and select only the columns you need? Third, no "where" clause? Even if you have a index, since you are reading everything the best you will get is a index scan (which is better than a table scan, but it is not a seek, which is what you should aim for)

其次,你不能消除select *并只选择你需要的列吗?三、没有“where”子句?即使您有索引,由于您正在阅读所有内容,因此您将获得的最佳结果是索引扫描(这比表扫描更好,但它不是搜索,这是您应该瞄准的目标)

回答by darlove

I know it's been quite a time since the beginning... There is a lot of wisdom in all these answers. Good indexing is the first thing when trying to improve a query. Well, almost the first. The most-first (so to speak) is making changes to code so that it's efficient. So, after all's been said and done, if one has a query with no WHERE, or when the WHERE-condition is not selective enough, there is only one way to get the data: TABLE SCAN (INDEX SCAN). If one needs all the columns from a table, then TABLE SCAN will be used - no question about it. This might be a heap scan or clustered index scan, depending on the type of data organization. The only last way to speed things up (if at all possible), is to make sure that as many cores are used as possible to do the scan: OPTION (MAXDOP 0). I'm ignoring the subject of storage, of course, but one should make sure that one has unlimited RAM, which goes without saying :)

我知道从开始到现在已经有很长一段时间了……所有这些答案都蕴含着很多智慧。尝试改进查询时,首先要建立良好的索引。嗯,几乎是第一个。最重要的(可以这么说)是对代码进行更改以使其高效。所以,归根结底,如果有一个没有 WHERE 的查询,或者当 WHERE 条件不够选择性时,只有一种方法可以获取数据:TABLE SCAN (INDEX SCAN)。如果需要表中的所有列,则将使用 TABLE SCAN - 毫无疑问。这可能是堆扫描或聚集索引扫描,具体取决于数据组织的类型。加快速度的唯一最后一种方法(如果可能的话)是确保使用尽可能多的内核来进行扫描:OPTION (MAXDOP 0)。当然,我忽略了存储的主题,