在 Postgresql 中选择未锁定的行
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/389541/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Select unlocked row in Postgresql
提问by alanc10n
Is there a way to select rows in Postgresql that aren't locked? I have a multi-threaded app that will do:
有没有办法在 Postgresql 中选择未锁定的行?我有一个多线程应用程序可以执行以下操作:
Select... order by id desc limit 1 for update
on a table.
在一张桌子上。
If multiple threads run this query, they both try to pull back the same row.
如果多个线程运行此查询,它们都会尝试拉回同一行。
One gets the row lock, the other blocks and then fails after the first one updates the row. What I'd really like is for the second thread to get the first row that matches the WHERE
clause and isn't already locked.
一个获取行锁,另一个阻塞,然后在第一个更新行后失败。我真正想要的是让第二个线程获取与WHERE
子句匹配且尚未锁定的第一行。
To clarify, I want each thread to immediately update the first available row after doing the select.
为了澄清,我希望每个线程在执行选择后立即更新第一个可用行。
So if there are rows with ID: 1,2,3,4
, the first thread would come in, select the row with ID=4
and immediately update it.
所以如果有行ID: 1,2,3,4
,第一个线程会进来,选择行ID=4
并立即更新它。
If during that transaction a second thread comes it, I'd like it to get row with ID=3
and immediately update that row.
如果在该事务期间出现第二个线程,我希望它获取行ID=3
并立即更新该行。
For Share won't accomplish this nor with nowait
as the WHERE
clause will match the locked row (ID=4 in my example)
. Basically what I'd like is something like "AND NOT LOCKED" in the WHERE
clause.
因为 Share 不会完成此操作,也不会 withnowait
因为该WHERE
子句将与锁定的行匹配(ID=4 in my example)
。基本上我想要的是WHERE
条款中的“AND NOT LOCKED” 。
Users
-----------------------------------------
ID | Name | flags
-----------------------------------------
1 | bob | 0
2 | fred | 1
3 | tom | 0
4 | ed | 0
If the query is "Select ID from users where flags = 0 order by ID desc limit 1
" and when a row is returned the next thing is "Update Users set flags = 1 where ID = 0
" then I'd like the first thread in to grab the row with ID 4
and the next one in to grab the row with ID 3
.
如果查询是“ Select ID from users where flags = 0 order by ID desc limit 1
”,当返回一行时,下一个是“ Update Users set flags = 1 where ID = 0
”,那么我希望第一个线程ID 4
使用ID 3
.
If I append "For Update
" to the select then the first thread gets the row, the second one blocks and then returns nothing because once the first transaction commits the WHERE
clause is no longer satisfied.
如果我将“ For Update
”附加到选择然后第一个线程获取行,第二个阻塞然后不返回任何内容,因为一旦第一个事务提交该WHERE
子句就不再满足。
If I don't use "For Update
" then I need to add a WHERE clause on the subsequent update (WHERE flags = 0) so only one thread can update the row.
如果我不使用“ For Update
”,那么我需要在后续更新(WHERE 标志 = 0)中添加一个 WHERE 子句,以便只有一个线程可以更新该行。
The second thread will select the same row as the first but the second thread's update will fail.
第二个线程将选择与第一个相同的行,但第二个线程的更新将失败。
Either way the second thread fails to get a row and update because I can't get the database to give row 4 to the first thread and row 3 to the second thread the the transactions overlap.
无论哪种方式,第二个线程都无法获取一行并更新,因为我无法让数据库将第 4 行提供给第一个线程,将第 3 行提供给第二个线程,事务重叠。
回答by Gavin Wahl
This feature, SELECT ... SKIP LOCKED
is being implemented in Postgres 9.5. http://www.depesz.com/2014/10/10/waiting-for-9-5-implement-skip-locked-for-row-level-locks/
此功能SELECT ... SKIP LOCKED
正在 Postgres 9.5 中实现。http://www.depesz.com/2014/10/10/waiting-for-9-5-implement-skip-locked-for-row-level-locks/
回答by Marek
No No NOOO :-)
不不NOOO :-)
I know what the author means. I have a similar situation and i came up with a nice solution. First i will start from describing my situation. I have a table i which i store messages that have to be sent at a specific time. PG doesn't support timing execution of functions so we have to use daemons (or cron). I use a custom written script that opens several parallel processes. Every process selects a set of messages that have to be sent with the precision of +1 sec / -1 sec. The table itself is dynamically updated with new messages.
我知道作者的意思。我有类似的情况,我想出了一个很好的解决方案。首先我将从描述我的情况开始。我有一张表,我存储必须在特定时间发送的消息。PG 不支持函数的定时执行,所以我们必须使用守护进程(或 cron)。我使用自定义编写的脚本打开多个并行进程。每个进程都选择一组必须以 +1 秒/-1 秒的精度发送的消息。表格本身会随着新消息动态更新。
So every process needs to download a set of rows. This set of rows cannot be downloaded by the other process because it will make a lot of mess (some people would receive couple messages when they should receive only one). That is why we need to lock the rows. The query to download a set of messages with the lock:
所以每个进程都需要下载一组行。这组行不能被其他进程下载,因为它会造成很多混乱(有些人在他们应该只收到一条消息时会收到几条消息)。这就是为什么我们需要锁定行。使用锁下载一组消息的查询:
FOR messages in select * from public.messages where sendTime >= CURRENT_TIMESTAMP - '1 SECOND'::INTERVAL AND sendTime <= CURRENT_TIMESTAMP + '1 SECOND'::INTERVAL AND sent is FALSE FOR UPDATE LOOP
-- DO SMTH
END LOOP;
a process with this query is started every 0.5 sec. So this will result in the next query waiting for the first lock to unlock the rows. This approach creates enormous delays. Even when we use NOWAIT the query will result in a Exception which we don't want because there might be new messages in the table that have to be sent. If use simply FOR SHARE the query will execute properly but still it will take a lot of time creating huge delays.
此查询的进程每 0.5 秒启动一次。所以这将导致下一个查询等待第一个锁来解锁行。这种方法造成了巨大的延迟。即使我们使用 NOWAIT,查询也会导致我们不想要的异常,因为表中可能有必须发送的新消息。如果仅使用 FOR SHARE 查询将正确执行,但仍会花费大量时间造成巨大延迟。
In order to make it work we do a little magic:
为了让它工作,我们做了一个小魔术:
changing the query:
FOR messages in select * from public.messages where sendTime >= CURRENT_TIMESTAMP - '1 SECOND'::INTERVAL AND sendTime <= CURRENT_TIMESTAMP + '1 SECOND'::INTERVAL AND sent is FALSE AND is_locked(msg_id) IS FALSE FOR SHARE LOOP -- DO SMTH END LOOP;
the mysterious function 'is_locked(msg_id)' looks like this:
CREATE OR REPLACE FUNCTION is_locked(integer) RETURNS BOOLEAN AS $$ DECLARE id integer; checkout_id integer; is_it boolean; BEGIN checkout_id := ; is_it := FALSE; BEGIN -- we use FOR UPDATE to attempt a lock and NOWAIT to get the error immediately id := msg_id FROM public.messages WHERE msg_id = checkout_id FOR UPDATE NOWAIT; EXCEPTION WHEN lock_not_available THEN is_it := TRUE; END; RETURN is_it; END; $$ LANGUAGE 'plpgsql' VOLATILE COST 100;
更改查询:
FOR messages in select * from public.messages where sendTime >= CURRENT_TIMESTAMP - '1 SECOND'::INTERVAL AND sendTime <= CURRENT_TIMESTAMP + '1 SECOND'::INTERVAL AND sent is FALSE AND is_locked(msg_id) IS FALSE FOR SHARE LOOP -- DO SMTH END LOOP;
神秘的函数“is_locked(msg_id)”看起来像这样:
CREATE OR REPLACE FUNCTION is_locked(integer) RETURNS BOOLEAN AS $$ DECLARE id integer; checkout_id integer; is_it boolean; BEGIN checkout_id := ; is_it := FALSE; BEGIN -- we use FOR UPDATE to attempt a lock and NOWAIT to get the error immediately id := msg_id FROM public.messages WHERE msg_id = checkout_id FOR UPDATE NOWAIT; EXCEPTION WHEN lock_not_available THEN is_it := TRUE; END; RETURN is_it; END; $$ LANGUAGE 'plpgsql' VOLATILE COST 100;
Of course we can customize this function to work on any table you have in your database. In my opinion it is better to create one check function for one table. Adding more things to this function can make it only slower. I takes longer to check this clause anyways so there is no need to make it even slower. For me this the complete solution and it works perfectly.
当然,我们可以自定义此函数以处理您数据库中的任何表。在我看来,最好为一张表创建一个检查功能。向这个函数添加更多的东西只会让它变慢。反正我需要更长的时间来检查这个子句,所以没有必要让它更慢。对我来说,这是完整的解决方案,而且效果很好。
Now when i have my 50 processes running in parallel every process has a unique set of fresh messages to send. Once the are sent i just update the row with sent = TRUE and never go back to it again.
现在,当我有 50 个并行运行的进程时,每个进程都有一组独特的新消息要发送。发送后,我只需使用 sent = TRUE 更新该行,然后再不返回。
I hope this solution will also work for you (author). If you have any question just let me know :-)
我希望这个解决方案也适用于你(作者)。如果您有任何问题,请告诉我 :-)
Oh, and let me know if this worked for you as-well.
哦,让我知道这是否也适合你。
回答by Timon
I use something like this:
我使用这样的东西:
select *
into l_sms
from sms
where prefix_id = l_prefix_id
and invoice_id is null
and pg_try_advisory_lock(sms_id)
order by suffix
limit 1;
and don't forget to call pg_advisory_unlock
并且不要忘记调用 pg_advisory_unlock
回答by Peter Eisentraut
If you are trying to implement a queue, take a look at PGQ, which has solved this and other problems already. http://wiki.postgresql.org/wiki/PGQ_Tutorial
如果您正在尝试实现队列,请查看 PGQ,它已经解决了这个问题和其他问题。http://wiki.postgresql.org/wiki/PGQ_Tutorial
回答by Grant Johnson
It appears that you are trying to do something like grab the highest priority item in a queue that is not already being taken care of by another process.
看起来您正在尝试执行一些操作,例如抓取队列中尚未由另一个进程处理的最高优先级项目。
A likely solution is to add a where clause limiting it to unhandled requests:
一个可能的解决方案是添加一个 where 子句,将其限制为未处理的请求:
select * from queue where flag=0 order by id desc for update;
update queue set flag=1 where id=:id;
--if you really want the lock:
select * from queue where id=:id for update;
...
Hopefully, the second transaction will block while the update to the flag happens, then it will be able to continue, but the flag will limit it to the next in line.
希望第二个事务在更新标志时会阻塞,然后它可以继续,但标志会将其限制为下一个。
It is also likely that using the serializable isolation level, you can get the result you want without all of this insanity.
也有可能使用可序列化的隔离级别,您可以获得想要的结果,而不会出现所有这些疯狂的情况。
Depending on the nature of your application, there may be better ways of implementing this than in the database, such as a FIFO or LIFO pipe. Additionally, it may be possible to reverse the order that you need them in, and use a sequence to ensure that they are processed sequentially.
根据应用程序的性质,可能有比在数据库中更好的实现方式,例如 FIFO 或 LIFO 管道。此外,可以颠倒您需要它们的顺序,并使用一个序列来确保它们按顺序处理。
回答by Grant Johnson
回答by HUAGHAGUAH
What are you trying to accomplish? Can you better explain why neither unlocked row updates nor full transactions will do what you want?
你想达到什么目的?你能更好地解释为什么解锁的行更新和完整的交易都不会做你想要的吗?
Better yet, can you prevent contention and simply have each thread use a different offset? This won't work well if the relevant portion of the table is being updated frequently; you'll still have collisions but only during heavy insert load.
更好的是,您能否防止争用并简单地让每个线程使用不同的偏移量?如果表格的相关部分经常更新,这将无法正常工作;您仍然会发生碰撞,但仅在插入负载较重时才会发生。
Select... order by id desc offset THREAD_NUMBER limit 1 for update
回答by alanc10n
Since I haven't found a better answer yet, I've decided to use locking within my app to synchronize access to the code that does this query.
由于我还没有找到更好的答案,我决定在我的应用程序中使用锁定来同步对执行此查询的代码的访问。
回答by HUAGHAGUAH
How about the following? It mightbe treated more atomically than the other examples but should stillbe tested to make sure my assumptions aren't wrong.
以下情况如何?它可能更原子处理比其他的例子,但应该仍然进行测试,以确保我的假设是没有错的。
UPDATE users SET flags = 1 WHERE id = ( SELECT id FROM users WHERE flags = 0 ORDER BY id DESC LIMIT 1 ) RETURNING ...;
You'll probably still be stuck with whatever locking scheme postgres uses internally to supply consistent SELECT results in the face of a simultaneous UPDATEs.
您可能仍然会被 postgres 在内部使用的任何锁定方案所困扰,以在面对同步更新时提供一致的 SELECT 结果。
回答by HUAGHAGUAH
I faced the same problem in our application and came up with a solution that is very similar to Grant Johnson's approach. A FIFO or LIFO pipe was not an option because we have a cluster of application servers accessing one DB. What we do is a
我在我们的申请中遇到了同样的问题,并提出了一个与 Grant Johnson 的方法非常相似的解决方案。FIFO 或 LIFO 管道不是一种选择,因为我们有一组应用服务器访问一个数据库。我们所做的是
SELECT ... WHERE FLAG=0 ... FOR UPDATE
紧接着是一个 UPDATE ... SET FLAG=1 WHERE ID=:id
尽快以保持锁定时间尽可能短。根据表列数和大小,仅在第一个选择中获取 ID 并在标记行以获取剩余数据后可能会有所帮助。存储过程可以进一步减少往返次数。