Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/3641703/

Date: 2020-09-01 07:24:37  Source: igfitidea

Using SQL Server as a DB queue with multiple clients

Tags: sql, sql-server, database, concurrency

Asked by Synesso

Given a table that is acting as a queue, how can I best configure the table/queries so that multiple clients process from the queue concurrently?

For example, the table below indicates a command that a worker must process. When the worker is done, it will set the processed value to true.

| ID | COMMAND | PROCESSED |
|  1 | ...     | true      |
|  2 | ...     | false     |
|  3 | ...     | false     |

The clients might obtain one command to work on like so:

select top 1 COMMAND
from EXAMPLE_TABLE
with (UPDLOCK, ROWLOCK)
where PROCESSED = 0;  -- a SQL Server bit column stores 0/1, not true/false

However, if there are multiple workers, each tries to get the row with ID=2. Only the first will get the pessimistic lock, the rest will wait. Then one of them will get row 3, etc.

What query/configuration would allow each worker client to get a different row each and work on them concurrently?

EDIT:

Several answers suggest variations on using the table itself to record an in-process state. I thought that this would not be possible within a single transaction. (i.e., what's the point of updating the state if no other worker will see it until the txn is committed?) Perhaps the suggestion is:

# start transaction
update to 'processing'
# end transaction
# start transaction
process the command
update to 'processed'
# end transaction

Is this the way people usually approach this problem? It seems to me that the problem would be better handled by the DB, if possible.

Answered by Remus Rusanu

I recommend you go over "Using tables as Queues". Properly implemented queues can handle thousands of concurrent users and serve as many as half a million enqueue/dequeue operations per minute. Until SQL Server 2005 the solution was cumbersome: it involved mixing a SELECT and an UPDATE in a single transaction and giving just the right mix of lock hints, as in the article linked by gbn. Luckily, since SQL Server 2005 and the advent of the OUTPUT clause, a much more elegant solution is available, and MSDN now recommends using the OUTPUT clause:

"You can use OUTPUT in applications that use tables as queues, or to hold intermediate result sets. That is, the application is constantly adding or removing rows from the table."

Basically there are 3 parts of the puzzle you need to get right in order for this to work in a highly concurrent manner:

1) You need to dequeue atomically. You have to find the row, skip any locked rows, and mark it as 'dequeued' in a single, atomic operation, and this is where the OUTPUT clause comes into play:

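-- Atomically claim one unprocessed row and return it to the caller.
-- EXAMPLE_TABLE is the queue table from the question.
WITH CTE AS (
  SELECT TOP(1) COMMAND, PROCESSED
  FROM EXAMPLE_TABLE WITH (READPAST)  -- skip rows locked by other workers
  WHERE PROCESSED = 0)
UPDATE CTE
  SET PROCESSED = 1
  OUTPUT INSERTED.*;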

2) You must structure your table with the PROCESSED column as the leftmost clustered index key. If ID was used as the primary key, move it to the second column of the clustered key. Whether to keep a non-clustered key on the ID column is open to debate, but I strongly favor not having any secondary non-clustered indexes on queues:

CREATE CLUSTERED INDEX cdxTable ON EXAMPLE_TABLE (PROCESSED, ID);

3) You must not query this table by any means other than Dequeue. Trying to do Peek operations, or trying to use the table both as a queue and as a store, will very likely lead to deadlocks and will slow down throughput dramatically.

The combination of an atomic dequeue, the READPAST hint when searching for elements to dequeue, and a leftmost clustered index key based on the processing bit ensures very high throughput under a highly concurrent load.
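
Putting the three parts together, a minimal end-to-end sketch might look like the following. The names dbo.CommandQueue, cdxCommandQueue, and dbo.usp_DequeueCommand are hypothetical, and the ROWLOCK hint and ORDER BY ID are additions made here on the assumption that rows should be handed out roughly in insertion order; they are not part of the answer above. GO is the batch separator understood by SSMS and sqlcmd, not a T-SQL statement.

```sql
-- Hypothetical queue table: PROCESSED is the leftmost clustered key, ID second.
CREATE TABLE dbo.CommandQueue (
    ID        INT IDENTITY(1,1) NOT NULL,
    COMMAND   NVARCHAR(100)     NOT NULL,
    PROCESSED BIT               NOT NULL DEFAULT 0
);
CREATE CLUSTERED INDEX cdxCommandQueue ON dbo.CommandQueue (PROCESSED, ID);
GO

-- Enqueue: a plain insert; PROCESSED defaults to 0.
INSERT INTO dbo.CommandQueue (COMMAND) VALUES (N'first'), (N'second');
GO

-- Dequeue: find, skip locked rows, mark, and return in one statement.
CREATE PROCEDURE dbo.usp_DequeueCommand
AS
BEGIN
    SET NOCOUNT ON;

    WITH CTE AS (
        SELECT TOP (1) ID, COMMAND, PROCESSED
        FROM dbo.CommandQueue WITH (ROWLOCK, READPAST)
        WHERE PROCESSED = 0
        ORDER BY ID
    )
    UPDATE CTE
    SET PROCESSED = 1
    OUTPUT INSERTED.ID, INSERTED.COMMAND;
END;
GO
```

Each worker would then call EXEC dbo.usp_DequeueCommand; and receive either one row or an empty result set when nothing is available. Note that READPAST only works under the READ COMMITTED or REPEATABLE READ isolation levels.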

Answered by gbn

My answer here shows you how to use tables as queues: "SQL Server Process Queue Race Condition".

You basically need the ROWLOCK, READPAST, and UPDLOCK hints.
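
As a rough illustration of how those three hints might be combined, here is a sketch against the question's EXAMPLE_TABLE. The variable @id and the ORDER BY ID are assumptions made for the example; the linked answer may structure this differently.

```sql
BEGIN TRANSACTION;

DECLARE @id INT;

-- UPDLOCK claims the row, ROWLOCK keeps the lock narrow,
-- READPAST skips rows that other workers have already locked.
SELECT TOP (1) @id = ID
FROM EXAMPLE_TABLE WITH (ROWLOCK, READPAST, UPDLOCK)
WHERE PROCESSED = 0
ORDER BY ID;

IF @id IS NOT NULL
BEGIN
    -- ... process the command here, while the row lock is still held ...
    UPDATE EXAMPLE_TABLE
    SET PROCESSED = 1
    WHERE ID = @id;
END;

COMMIT TRANSACTION;
```

Because the lock is held until COMMIT, a worker that dies mid-way has its transaction rolled back and the row becomes available again. The trade-off is a transaction that stays open for the whole duration of the work.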

Answered by MacGyver

If you want to serialize your operations for multiple clients, you can simply use application locks.

BEGIN TRANSACTION

EXEC  sp_getapplock @resource = 'app_token', @lockMode = 'Exclusive'

-- perform operation

EXEC  sp_releaseapplock @resource = 'app_token'

COMMIT TRANSACTION
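
A slightly more defensive variant of the same idea checks the return code of sp_getapplock, which is 0 or 1 when the lock is granted and negative on timeout, cancellation, deadlock, or error. The 5000 ms timeout and the @result variable are assumptions chosen for this sketch.

```sql
DECLARE @result INT;

BEGIN TRANSACTION;

EXEC @result = sp_getapplock
     @Resource    = 'app_token',
     @LockMode    = 'Exclusive',
     @LockOwner   = 'Transaction',  -- lock is released when the transaction ends
     @LockTimeout = 5000;           -- assumed: wait at most 5 seconds

IF @result >= 0
BEGIN
    -- perform operation
    COMMIT TRANSACTION;  -- also releases the transaction-owned lock
END
ELSE
BEGIN
    -- lock not acquired (timeout, cancel, deadlock victim, or error)
    ROLLBACK TRANSACTION;
END;
```

Keep in mind that an application lock serializes the workers, so only one of them makes progress at a time.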

Answered by Macros

Rather than using a boolean value for Processed, you could use an int to define the state of the command:

1 = not processed
2 = in progress
3 = complete

Each worker would then get the next row with Processed = 1, update Processed to 2, then begin work. When the work is complete, Processed is updated to 3. This approach also leaves room for other outcomes: for example, rather than just recording that a worker is complete, you could add new statuses for 'Completed Successfully' and 'Completed with Errors'.
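
One way the claim-then-complete cycle could be written is sketched below, using the question's EXAMPLE_TABLE with PROCESSED as the int status. The @claimed table variable is hypothetical, and the READPAST hint and OUTPUT clause are borrowed from the other answers as an assumption about how to keep two workers from claiming the same row.

```sql
DECLARE @claimed TABLE (ID INT, COMMAND NVARCHAR(100));

-- Step 1: claim one 'not processed' row (1) and mark it 'in progress' (2).
WITH NextRow AS (
    SELECT TOP (1) ID, COMMAND, PROCESSED
    FROM EXAMPLE_TABLE WITH (ROWLOCK, READPAST)
    WHERE PROCESSED = 1
    ORDER BY ID
)
UPDATE NextRow
SET PROCESSED = 2
OUTPUT INSERTED.ID, INSERTED.COMMAND INTO @claimed (ID, COMMAND);

-- Step 2: ... do the work for the row recorded in @claimed ...

-- Step 3: mark the claimed row 'complete' (3).
UPDATE t
SET PROCESSED = 3
FROM EXAMPLE_TABLE AS t
JOIN @claimed AS c ON c.ID = t.ID;
```

If @claimed is empty after step 1, there was nothing to pick up and the worker can sleep and retry.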

Answered by no_one

Probably the better option is to use a tri-state Processed column along with a version/timestamp column. The three values in the Processed column then indicate whether the row is unprocessed, under processing, or processed.

For example:

    CREATE TABLE Queue (
    ID INT NOT NULL PRIMARY KEY,
    Command NVARCHAR(100),
    Processed INT NOT NULL CHECK (Processed IN (0, 1, 2)),
    Version timestamp)

You grab the top 1 unprocessed row, set its status to under processing, and set it to processed when things are done. Base your status update on the Version and primary key columns. If the update fails, then someone has already been there.

You might want to add a client identifier as well, so that if the client dies while processing, it can restart, look at the last row it was working on, and resume from where it was.
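
The optimistic claim described above might be sketched like this against the Queue table. The mapping 0 = unprocessed, 1 = under processing, 2 = processed is an assumption (the answer does not say which value means what), and the @id and @version variables are hypothetical.

```sql
DECLARE @id INT, @version BINARY(8);

-- Read a candidate row together with its current row version.
SELECT TOP (1) @id = ID, @version = Version
FROM Queue
WHERE Processed = 0
ORDER BY ID;

-- Claim it only if nobody has touched the row since we read it.
UPDATE Queue
SET Processed = 1
WHERE ID = @id
  AND Version = @version
  AND Processed = 0;

IF @@ROWCOUNT = 0
    PRINT 'Nothing claimed: the queue was empty or another worker got there first.';
```

A worker that sees zero rows updated would simply read the next candidate and try again.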

Answered by ZippyV

I would stay away from messing with locks in a table. Just create two extra columns like IsProcessing (bit/boolean) and ProcessingStarted (datetime). When a worker crashes or doesn't update his row after a timeout you can have another worker try to process the data.
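
For instance, a periodic clean-up that hands timed-out rows back to the queue could look roughly like this. It assumes the two columns were added to the question's EXAMPLE_TABLE, that ProcessingStarted is stored in UTC, and that ten minutes is an acceptable timeout; all three are illustrative choices.

```sql
-- Assumption: a row stuck 'in processing' for more than 10 minutes was abandoned.
UPDATE EXAMPLE_TABLE
SET    IsProcessing      = 0,
       ProcessingStarted = NULL
WHERE  PROCESSED = 0
  AND  IsProcessing = 1
  AND  ProcessingStarted < DATEADD(MINUTE, -10, GETUTCDATE());
```

After this runs, the released rows look like fresh work to whichever worker asks next.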

Answered by Andomar

One way is to mark the row with a single update statement. If you read the status in the where clause and change it in the set clause, no other process can come in between, because the row will be locked. For example:

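declare @pickup_id int
set @pickup_id = -1  -- sentinel: stays -1 if no row is updated

set rowcount 1  -- limit the update to a single row

update  YourTable
set     status = 'picked up'
,       @pickup_id = id
where   status = 'new'

set rowcount 0  -- restore the default (no limit)

return @pickup_id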

This uses rowcount to update one row at most. If no row was found, @pickup_id will be -1.
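
Microsoft's documentation notes that SET ROWCOUNT is slated to stop affecting UPDATE, INSERT, and DELETE statements in a future release, so the same idea can also be sketched with UPDATE TOP (1). This is an alternative formulation, not the answer's original code, and it assumes SQL Server 2008 or later for the inline DECLARE initializer.

```sql
DECLARE @pickup_id INT = -1;  -- sentinel: stays -1 if no row matches

-- Read the status in WHERE and change it in SET, in one statement.
UPDATE TOP (1) YourTable
SET    status = 'picked up',
       @pickup_id = id
WHERE  status = 'new';

SELECT @pickup_id AS pickup_id;
```

As with the original, which matching row gets picked is not guaranteed, since UPDATE TOP has no ORDER BY.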
