SQL 插入大量记录而不锁定表

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/39404875/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 04:51:32  来源:igfitidea点击:

Inserting large number of records without locking the table

sqlsql-servertsqlsql-insertbatch-insert

提问by P?????

I am trying to insert 1,500,000records into a table. Am facing table lock issues during the insertion. So I came up with the below batch insert.

我正在尝试将1,500,000条记录插入到一​​个表中。我在插入过程中面临表锁定问题。所以我想出了下面的批量插入。

DECLARE @BatchSize INT = 50000

WHILE 1 = 1
  BEGIN
      INSERT INTO [dbo].[Destination] 
                  (proj_details_sid,
                   period_sid,
                   sales,
                   units)
      SELECT TOP(@BatchSize) s.proj_details_sid,
                             s.period_sid,
                             s.sales,
                             s.units
      FROM   [dbo].[SOURCE] s
      WHERE  NOT EXISTS (SELECT 1
                         FROM   dbo.Destination d
                         WHERE  d.proj_details_sid = s.proj_details_sid
                                AND d.period_sid = s.period_sid)

      IF @@ROWCOUNT < @BatchSize
        BREAK
  END 

I have a clustered Index on Destinationtable (proj_details_sid ,period_sid ). NOT EXISTSpart is just to restrict inserted records from again inserting into the table

我在Destinationtable上有一个聚集索引(proj_details_sid ,period_sid )NOT EXISTS部分只是为了限制插入的记录再次插入表中

Am I doing it right, will this avoid table lock ? or is there any better way.

我做得对吗,这会避免表锁定吗?或者有什么更好的方法。

Note :Time taken is more or less same with batch and without batch insert

注意:使用批处理和不使用批处理插入所需的时间或多或少相同

采纳答案by Martin Smith

Lock escalation is not likely to be related to the SELECTpart of your statement at all.

锁升级不太可能与SELECT您的语句部分有关。

It is a natural consequence of inserting a large number of rows

这是插入大量行自然结果

Lock escalation is triggered when lock escalation is not disabled on the table by using the ALTER TABLE SET LOCK_ESCALATION option, and when either of the following conditions exists:

  • A single Transact-SQL statement acquires at least 5,000 locks on a single nonpartitioned table or index.
  • A single Transact-SQL statement acquires at least 5,000 locks on a single partition of a partitioned table and the ALTER TABLE SET LOCK_ESCALATION option is set to AUTO.
  • The number of locks in an instance of the Database Engine exceeds memory or configuration thresholds.

If locks cannot be escalated because of lock conflicts, the Database Engine periodically triggers lock escalation at every 1,250 new locks acquired.

当未使用 ALTER TABLE SET LOCK_ESCALATION 选项在表上禁用锁升级时,并且存在以下任一条件时,会触发锁升级:

  • 单个 Transact-SQL 语句在单个非分区表或索引上获取至少 5,000 个锁。
  • 单个 Transact-SQL 语句在分区表的单个分区上获取至少 5,000 个锁,并且 ALTER TABLE SET LOCK_ESCALATION 选项设置为 AUTO。
  • 数据库引擎实例中的锁数超过了内存或配置阈值。

如果由于锁冲突而无法升级锁,数据库引擎会在每获得 1,250 个新锁时定期触发锁升级。

You can easily see this for yourself by tracing the lock escalation event in Profiler or simply trying the below with different batch sizes. For me TOP (6228)shows 6250 locks held but TOP (6229)it suddenly plummets to 1 as lock escalation kicks in. The exact numbers may vary (dependant on database settings and resources currently available). Use trial and error to find the threshold where lock escalation appears for you.

您可以通过跟踪 Profiler 中的锁升级事件或简单地尝试以下不同的批处理大小,轻松地亲眼看到这一点。对我而言,TOP (6228)显示持有 6250 个锁,但TOP (6229)随着锁升级开始,它突然暴跌至 1。确切数字可能会有所不同(取决于当前可用的数据库设置和资源)。使用试错法找到为您出现锁升级的阈值。

CREATE TABLE [dbo].[Destination]
  (
     proj_details_sid INT,
     period_sid       INT,
     sales            INT,
     units            INT
  )

BEGIN TRAN --So locks are held for us to count in the next statement
INSERT INTO [dbo].[Destination]
SELECT TOP (6229) 1,
                  1,
                  1,
                  1
FROM   master..spt_values v1,
       master..spt_values v2

SELECT COUNT(*)
FROM   sys.dm_tran_locks
WHERE  request_session_id = @@SPID;

COMMIT

DROP TABLE [dbo].[Destination] 

You are inserting 50,000 rows so almost certainly lock escalation will be attempted.

您正在插入 50,000 行,因此几乎可以肯定会尝试锁升级。

The article How to resolve blocking problems that are caused by lock escalation in SQL Serveris quite old but a lot of the suggestions are still valid.

文章如何解决由 SQL Server 中的锁升级引起的阻塞问题已经很老了,但很多建议仍然有效。

  1. Break up large batch operations into several smaller operations (i.e. use a smaller batch size)
  2. Lock escalation cannot occur if a different SPID is currently holding an incompatible table lock - The example they give is a different session executing
  1. 将大批量操作分解为几个较小的操作(即使用较小的批量大小)
  2. 如果不同的 SPID 当前持有不兼容的表锁,则不会发生锁升级 - 他们给出的示例是执行不同的会话


BEGIN TRAN
SELECT * FROM mytable (UPDLOCK, HOLDLOCK) WHERE 1=0
WAITFOR DELAY '1:00:00'
COMMIT TRAN 
  1. Disable lock escalation by enabling trace flag 1211 - However this is a global setting and can cause severe issues. There is a newer option 1224that is less problematic but this is still global.
  1. 通过启用跟踪标志 1211 禁用锁升级 - 但是,这是一个全局设置,可能会导致严重问题。有一个较新的选项1224问题较少,但这仍然是全局的。

Another option would be to ALTER TABLE blah SET (LOCK_ESCALATION = DISABLE)but this is still not very targeted as it affects all queries against the table not just your single scenario here.

另一种选择是,ALTER TABLE blah SET (LOCK_ESCALATION = DISABLE)但这仍然不是很有针对性,因为它会影响对表的所有查询,而不仅仅是这里的单个场景。

So I would opt for option 1 or possibly option 2 and discount the others.

所以我会选择选项 1 或选项 2 并打折其他选项。

回答by Prisoner

Instead of checking the data exists in Destination, it seems better to store all data in temp table first, and batch insert into Destination

与其检查数据存在于 中Destination,不如先将所有数据存储在临时表中,然后批量插入Destination

Reference: Using ROWLOCK in an INSERT statement (SQL Server)

参考:在 INSERT 语句中使用 ROWLOCK (SQL Server)

DECLARE @batch int = 100
DECLARE @curRecord int = 1
DECLARE @maxRecord int

-- remove (nolock) if you don't want to have dirty read
SELECT row_number over (order by s.proj_details_sid, s.period_sid) as rownum,
       s.proj_details_sid,
       s.period_sid,
       s.sales,
       s.units
INTO #Temp
FROM   [dbo].[SOURCE] s WITH (NOLOCK)
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.Destination d WITH (NOLOCK)
                   WHERE  d.proj_details_sid = s.proj_details_sid
                          AND d.period_sid = s.period_sid)

-- change this maxRecord if you want to limit the records to insert
SELECT @maxRecord = count(1) from #Temp

WHILE @maxRecord >= @curRecord
   BEGIN
       INSERT INTO [dbo].[Destination] 
              (proj_details_sid,
               period_sid,
               sales,
               units)
       SELECT proj_details_sid, period_sid, sales, units
       FROM #Temp
       WHERE rownum >= @curRecord and rownum < @curRecord + @batch

       SET @curRecord = @curRecord + @batch
   END

DROP TABLE #Temp

回答by Huseyin Durmus

I added (NOLOCK) your destination table -> dbo.Destination(NOLOCK). Now, You won't lock your table.

我添加了 (NOLOCK) 您的目标表 -> dbo.Destination(NOLOCK)。现在,你不会锁定你的桌子。

WHILE 1 = 1
  BEGIN
      INSERT INTO [dbo].[Destination] 
                  (proj_details_sid,
                   period_sid,
                   sales,
                   units)
      SELECT TOP(@BatchSize) s.proj_details_sid,
                             s.period_sid,
                             s.sales,
                             s.units
      FROM   [dbo].[SOURCE] s
      WHERE  NOT EXISTS (SELECT 1
                         FROM   dbo.Destination(NOLOCK) d
                         WHERE  d.proj_details_sid = s.proj_details_sid
                                AND d.period_sid = s.period_sid)

      IF @@ROWCOUNT < @BatchSize
        BREAK
  END 

回答by Mohit Dagar

To do this you can use WITH (NOLOCK) in your select statement. BUT NOLOCK is not recommended on OLTP Databases.

为此,您可以在选择语句中使用 WITH (NOLOCK)。但是不建议在 OLTP 数据库上使用 NOLOCK。