Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/18789586/

Postgres - How to debug/trace 'Idle in transaction' connection

postgresql transactions

Asked by Amit

I am using Postgres for one of my applications, and sometimes (not very frequently) one of the connections goes into the <IDLE> in transaction state. It keeps holding the locks it acquired, which makes other connections wait on those locks and ultimately causes my application to hang.

The following is the output from the pg_stat_activity view for that process:

select * from pg_stat_activity

24081 | db     |     798 |    16384 | db     |                  | 10.112.61.218 |                 |       59034 | 2013-09-12 23:46:05.132267+00 | 2013-09-12 23:47:31.763084+00 | 2013-09-12 23:47:31.763534+00 | f       | <IDLE> in transaction

This indicates that PID=798 is in the <IDLE> in transaction state. Using the client_port (59034) from the output above, the client process on the web server can be found as follows:

sudo netstat -apl | grep 59034

tcp        0      0 ip-10-112-61-218.:59034 db-server:postgresql    ESTABLISHED 23843/pgbouncer

I know that something in my application code is causing the connection to hang (I killed one of the running application cron jobs and that freed the locks), but I am not able to trace it.

This does not happen often, and I can't find any definite reproduction steps either, since it only occurs on the production server.

I would like input on how to trace such an idle connection, e.g. getting the last executed query or some kind of traceback to identify which part of the code is causing this issue.

Answered by jjanes

If you upgrade to 9.2 or higher, the pg_stat_activity view will show you the most recent query executed by idle in transaction connections.

select * from pg_stat_activity  \x\g\x

...
waiting          | f
state            | idle in transaction
query            | select count(*) from pg_class ;
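
To narrow that output down to the problem sessions only, a filtered query along these lines may help. This is a minimal sketch, assuming PostgreSQL 9.2 or later (where pg_stat_activity has the pid, state, state_change and query columns); the idle_for alias is just an illustrative name.

```sql
-- Assumes PostgreSQL 9.2+; older releases use procpid / current_query instead.
SELECT pid,
       client_addr,
       client_port,
       xact_start,                          -- when the open transaction began
       now() - state_change AS idle_for,    -- time since the session went idle
       query                                -- last statement the session ran
FROM   pg_stat_activity
WHERE  state = 'idle in transaction'
ORDER  BY state_change;                     -- longest-idle sessions first
```

Running this periodically (for example from a monitoring job) and logging the result is one way to catch the offending statement when the problem cannot be reproduced on demand.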

You can also (even in 9.1) look in pg_locks to see what locks are being held by the idle in transaction process. If it only holds locks on very commonly used objects, this might not narrow things down much, but a peculiar lock could tell you exactly where in your code to look.
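
As a rough sketch of such a pg_locks lookup, the query below lists the locks held by one backend and resolves relation names where possible. It assumes you are connected to the same database as the idle session (relation OIDs are only meaningful there), and it uses 798, the PID from the question, as a placeholder.

```sql
-- 798 is the backend PID from the question; substitute your own.
SELECT l.locktype,
       l.mode,
       l.granted,
       c.relname                            -- NULL for non-relation locks
FROM   pg_locks l
LEFT   JOIN pg_class c
       ON  c.oid = l.relation
       -- only resolve names for locks that belong to the current database
       AND l.database = (SELECT oid FROM pg_database
                         WHERE  datname = current_database())
WHERE  l.pid = 798
ORDER  BY c.relname;
```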

If you are stuck with 9.1, you can perhaps use a debugger to get all but the first 22 characters of the query (the first 22 are overwritten by the <IDLE> in transaction\0 message). For example:

(gdb) printf "%s\n", ((MyBEEntry->st_activity)+22)