How to set timeout detection on a RabbitMQ server?
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/1345239/
Asked by Unknown
I am trying out RabbitMQ with this Python binding.
One thing I noticed is that if I kill a consumer uncleanly (emulating a crashed program), the server will think that this consumer is still there for a long time. The result of this is that every other message will be ignored.
For example if you kill a consumer 1 time and reconnect, then 1/2 messages will be ignored. If you kill another consumer, then 2/3 messages will be ignored. If you kill a 3rd, then 3/4 messages will be ignored and so on.
I've tried turning on acknowledgments, but it doesn't seem to be helping. The only solution I have found is to manually stop the server and reset it.
Is there a better way?
How to recreate this scenario
Run rabbitmq.
Unarchive this library.
Download the consumer and publisher here. Run amqp_consumer.py twice. Run amqp_publisher.py, feeding in some data and observe that it works as expected. Messages are received round robin style.
Kill one of the consumer processes with kill -9 or task manager.
Now when you publish a message, 50% of the messages will be lost.
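The fractions reported above (1/2, 2/3, 3/4) are what one would expect if the server keeps dispatching round-robin to consumers it still believes are alive. The following is a minimal sketch of that arithmetic only; `lost_fraction` and its parameters are hypothetical names for illustration and do not talk to a real broker.

```python
from itertools import cycle


def lost_fraction(live, dead, messages=1200):
    """Dispatch `messages` round-robin over live and dead consumers.

    Returns the fraction of messages handed to dead consumers,
    i.e. the messages that appear to be "ignored".
    """
    consumers = ["live"] * live + ["dead"] * dead
    targets = cycle(consumers)
    lost = sum(1 for _ in range(messages) if next(targets) == "dead")
    return lost / messages


if __name__ == "__main__":
    # One live consumer, plus 1, 2, then 3 killed consumers still registered.
    for dead in (1, 2, 3):
        print(dead, "dead consumer(s):", lost_fraction(live=1, dead=dead))
```

With one live consumer and k stale ones, k out of every k+1 messages go to a consumer that no longer exists, which matches the pattern described in the question.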
Answered by Tony Garnock-Jones
I don't see amqp_consumer.py or amqp_producer.py in the tarball, so reproducing the fault is tricky.
RabbitMQ terminates connections, releasing their unacknowledged messages for redelivery to other clients, whenever it is told by the operating system that a socket has closed. Your symptoms are very strange, in that even a kill -9 ought to cause the TCP socket to be cleaned up properly.
Some people have noticed problems with sockets surviving longer than they should when running with a firewall or NAT device between the AMQP clients and the server. Could that be an issue here, or are you running everything on localhost? Also, what operating system are you running the various components of the system on?
ETA: From your comment below, I am guessing that while you are running the server on Linux, you may be running the clients on Windows. If this is the case, then it could be that the Windows TCP driver is not closing the sockets correctly, which is different from the kill -9 behaviour on Unix. (On Unix, the kernel will properly close the TCP connections on any killed process.)
If that's the case, then the bad news is that RabbitMQ can only release resources when the socket is closed, so if the client operating system doesn't do that, there's nothing it can do. This is the same as almost every other TCP-based service out there.
The good news, though, is that AMQP supports a "heartbeat" option for exactly these cases, where the networking fabric is untrustworthy. You could try enabling heartbeats. When they're enabled, if the server doesn't receive any traffic within a configurable interval, it decides that the connection must be dead.
The bad news, however, is that I don't think py-amqplib supports heartbeats at the moment. Worth a try, though!
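To make the heartbeat idea concrete, here is a minimal sketch of the server-side bookkeeping it implies: remember when each connection last sent traffic, and treat a connection as dead once it has been silent for longer than a configurable interval. This is not RabbitMQ's implementation and does not use py-amqplib; `HeartbeatMonitor`, `record_traffic`, and `reap_dead` are hypothetical names, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatMonitor:
    """Toy model of heartbeat-based dead-connection detection."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}  # connection id -> time of last traffic

    def record_traffic(self, conn_id, now):
        """Any frame (data or heartbeat) counts as a sign of life."""
        self.last_seen[conn_id] = now

    def reap_dead(self, now):
        """Drop and return connections silent for longer than the timeout."""
        dead = [c for c, seen in self.last_seen.items()
                if now - seen > self.timeout]
        for conn_id in dead:
            del self.last_seen[conn_id]
        return dead


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=10)
    monitor.record_traffic("consumer-a", now=0)
    monitor.record_traffic("consumer-b", now=0)
    monitor.record_traffic("consumer-a", now=8)  # a keeps sending, b was killed
    print(monitor.reap_dead(now=12))             # b has been silent for 12s
```

In a real broker, reaping a connection would also release its unacknowledged messages for redelivery, as described earlier in this answer.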
Answered by Vinay Sajip
RabbitMQ doesn't have a timeout on acknowledgements from the client that a message has been processed: see this post (the whole thread might be of interest). Some salient points from the post:
The AMQP ack model for subscriptions and "pull" are identical. In both cases the message is kept on the server but is unavailable to other consumers until it either has been ack'ed (and gets removed), nack'ed (with basic.reject; though RabbitMQ does not implement that) or the channel/connection is closed (at which point the message becomes available to other consumers).
and (my emphases)
There is no timeout on waiting for acks. Usually that is not a problem since the common cases of a missing ack - network or client failure - will result in the connection getting dropped (and thus trigger the behaviour described above). Still, a timeout could be useful to, say, deal with alive but unresponsive consumers. That has come up in discussion before. Is there a specific use case you have in mind that requires such functionality?
The problem might well be occurring because in a client pull model, it's harder for the server to detect a broken connection (as opposed to an alive but unresponsive consumer), particularly as the server seems happy to wait forever for an ack.
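The ack model quoted above can be sketched as a small in-memory queue: a delivered message is held as unacknowledged, is invisible to other consumers, and only returns to the queue when the consumer's connection is seen to close. This is an illustrative toy, not RabbitMQ code; `ToyQueue` and its method names are hypothetical.

```python
from collections import deque


class ToyQueue:
    """Toy model of ack-based delivery with no ack timeout."""

    def __init__(self):
        self.ready = deque()   # messages available for delivery
        self.unacked = {}      # delivery tag -> (consumer, message)
        self.next_tag = 1

    def publish(self, message):
        self.ready.append(message)

    def deliver(self, consumer):
        """Hand the next ready message to `consumer`, or return None."""
        if not self.ready:
            return None
        message = self.ready.popleft()
        tag = self.next_tag
        self.next_tag += 1
        self.unacked[tag] = (consumer, message)
        return tag, message

    def ack(self, tag):
        """The consumer confirmed processing; the message is gone for good."""
        del self.unacked[tag]

    def connection_closed(self, consumer):
        """Requeue everything the consumer held, keeping the original order."""
        tags = sorted(t for t, (c, _) in self.unacked.items() if c == consumer)
        for tag in reversed(tags):
            _, message = self.unacked.pop(tag)
            self.ready.appendleft(message)


if __name__ == "__main__":
    q = ToyQueue()
    q.publish("m1")
    print(q.deliver("crashed-consumer"))  # m1 is now held, unacked
    print(q.deliver("healthy-consumer"))  # nothing available
    q.connection_closed("crashed-consumer")
    print(q.deliver("healthy-consumer"))  # m1 is redelivered
```

Note that nothing in this model ever requeues a message on its own: if `connection_closed` is never called for a dead consumer, its messages stay stuck, which is the situation the question describes.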
Update: On Linux, you can attach signal handlers for SIGTERM and/or SIGINT (SIGKILL cannot be caught) and hopefully close down the connection in an orderly way from the client. On Windows, I believe closing from Task Manager invokes the Win32 TerminateProcess API, about which MSDN says:
If a process is terminated by TerminateProcess, all threads of the process are terminated immediately with no chance to run additional code. This means that the thread does not execute code in termination handler blocks. In addition, no attached DLLs are notified that the process is detaching.
This means it might be difficult to catch termination and close down in an orderly way.
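For the Unix case, a minimal sketch of an orderly shutdown on SIGTERM or SIGINT might look like the following. It uses only Python's standard `signal` module; `FakeConnection` is a hypothetical stand-in for whatever connection object the AMQP client provides, and `install_shutdown_handlers` and `demo` are illustrative names. The demo sends a signal to its own process and is intended for POSIX systems only.

```python
import os
import signal
import time


class FakeConnection:
    """Hypothetical stand-in for an AMQP connection object."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def install_shutdown_handlers(connection,
                              signals=(signal.SIGTERM, signal.SIGINT)):
    """Close `connection` when a signal arrives; return the old handlers."""
    def handler(signum, frame):
        connection.close()
        raise SystemExit(0)

    return {sig: signal.signal(sig, handler) for sig in signals}


def demo():
    """Send SIGTERM to this process and report whether close() ran."""
    conn = FakeConnection()
    previous = install_shutdown_handlers(conn)
    try:
        os.kill(os.getpid(), signal.SIGTERM)
        time.sleep(1)  # give the interpreter a chance to run the handler
    except SystemExit:
        pass  # a real client would simply let the process exit here
    finally:
        for sig, old in previous.items():
            if old is not None:
                signal.signal(sig, old)  # restore the original handlers
    return conn.closed


if __name__ == "__main__":
    print("connection closed cleanly:", demo())
```

This only helps with signals that can be handled. A kill -9 or a TerminateProcess call gives the client no chance to run any of this code, so the server still has to rely on the socket closing or on heartbeats.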
It might be worth pursuing on the RabbitMQ list with your own use case for an ack timeout.
Answered by yawn
Please provide a few more specifics regarding the components you've declared. Usually (and independent of the client implementation) a queue with the properties
- exclusive and
- auto-delete
should get removed as soon as the connection between the declaring client and the broker breaks up. This won't help you with shared queues, though. Please detail a bit what exactly you are trying to model.