Original question: http://stackoverflow.com/questions/13005410/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Why do we need message brokers like RabbitMQ over a database like PostgreSQL?
Asked by Yugal Jindle
I am new to message brokers like RabbitMQ, which we can use to create task / message queues for a scheduling system like Celery.
Now, here is the question:
I can create a table in PostgreSQL to which new tasks are appended and from which a consumer program like Celery consumes them.
Why on earth would I want to set up a whole new technology like RabbitMQ for this?
Now, I believe scaling cannot be the answer, since a database like PostgreSQL can also work in a distributed environment.
I googled for the problems that a database poses for this particular use case, and I found:
- polling keeps the database busy and low performing
- locking of the table -> again low performing
- millions of rows of tasks -> again, polling is low performing
Now, how does RabbitMQ or any other message broker like it solve these problems?
Also, I found out that it follows the AMQP protocol. What's great about that?
Can Redis also be used as a message broker? I find it more analogous to Memcached than to RabbitMQ.
Please shed some light on this!
Accepted answer by Jaigus
Rabbit's queues reside in memory and will therefore be much faster than implementing this in a database. A (good) dedicated message queue should also provide essential queueing-related features such as throttling/flow control and the ability to choose different routing algorithms, to name a couple (Rabbit provides these and more). Depending on the size of your project, you may also want the message-passing component separate from your database, so that if one component experiences heavy load, it need not hinder the other's operation.
As for the problems you mentioned:
- polling keeps the database busy and low performing: Using RabbitMQ, producers can push updates to consumers, which is far more performant than polling. Data is simply sent to the consumer when it needs to be, eliminating the need for wasteful checks.
- locking of the table -> again low performing: There is no table to lock :P
- millions of rows of tasks -> again, polling is low performing: As mentioned above, RabbitMQ will operate faster because it resides in RAM, and it provides flow control. If needed, it can also use the disk to temporarily store messages when it runs out of RAM. Since version 2.0, Rabbit has significantly improved its RAM usage. Clustering options are also available.
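To make the polling-versus-push point concrete, here is a minimal in-process sketch using only Python's standard library. It is not RabbitMQ code: `queue.Queue` stands in for the broker, and the names `broker`, `consumer` and `results` are made up for illustration. The consumer blocks in `get()` and is woken only when a message arrives, instead of repeatedly querying a table for new rows.

```python
import queue
import threading

# Stand-in for a broker queue (assumption: in-process only, no network).
broker = queue.Queue()
results = []


def consumer():
    """Block until a message is pushed; no periodic polling is needed."""
    while True:
        message = broker.get()      # sleeps until a producer puts something
        if message is None:         # sentinel value: stop consuming
            break
        results.append(message.upper())


worker = threading.Thread(target=consumer)
worker.start()

# Producer side: each put() wakes the waiting consumer.
for task in ["resize image", "send email"]:
    broker.put(task)
broker.put(None)
worker.join()

print(results)
```

A real broker adds what this toy lacks: delivery across processes and machines, acknowledgements, and persistence.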
With regard to AMQP, I would say a really cool feature is the "exchange", and its ability to route to other exchanges. This gives you more flexibility and enables you to create a wide array of elaborate routing topologies, which can come in very handy when scaling. For a good example, see:
(source: springsource.com)
And: http://blog.springsource.org/2011/04/01/routing-topologies-for-performance-and-scalability-with-rabbitmq/
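As a rough illustration of what an exchange does, the sketch below imitates one exchange type defined by AMQP 0-9-1, the topic exchange, in plain Python. Everything here is hypothetical and in-memory: `ToyTopicExchange`, `topic_matches` and the queue names are invented for this example and are not part of any RabbitMQ client library. It only shows pattern-based routing to queues, not exchange-to-exchange binding.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP-style topic pattern matches a routing key.

    '*' matches exactly one dot-separated word, '#' matches zero or more.
    """
    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero words, or one word and stay active.
            return match(rest, k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), routing_key.split("."))


class ToyTopicExchange:
    """Hypothetical in-memory exchange; queues are plain Python lists."""

    def __init__(self):
        self.bindings = []  # (pattern, queue) pairs

    def bind(self, pattern, target_queue):
        self.bindings.append((pattern, target_queue))

    def publish(self, routing_key, body):
        # Copy the message to every queue whose binding pattern matches.
        for pattern, target_queue in self.bindings:
            if topic_matches(pattern, routing_key):
                target_queue.append(body)


exchange = ToyTopicExchange()
errors, eu_orders, everything = [], [], []
exchange.bind("*.error", errors)
exchange.bind("orders.eu.#", eu_orders)
exchange.bind("#", everything)

exchange.publish("billing.error", "card declined")
exchange.publish("orders.eu.created", "order 1")
```

The producer only names a routing key; which consumers receive the message is decided by the bindings, so new consumers can be added without touching the producer.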
Finally, with regard to Redis: yes, it can be used as a message broker, and it can do well. However, RabbitMQ has more message queuing features than Redis, as RabbitMQ was built from the ground up to be a full-featured, enterprise-level dedicated message queue. Redis, on the other hand, was primarily created to be an in-memory key-value store (though it does much more than that now; it's even referred to as a Swiss army knife). Still, I've read about and heard of many people achieving good results with Redis for smaller projects, but haven't heard much about it in larger applications.
Here is an example of redis being used in a long-polling chat implementation: http://eflorenzano.com/blog/2011/02/16/technology-behind-convore/
Answer by Craig Ringer
PostgreSQL 9.5
PostgreSQL 9.5 incorporates SELECT ... FOR UPDATE ... SKIP LOCKED. This makes implementing working queuing systems a lot simpler and easier. You may no longer require an external queueing system since it's now simple to fetch 'n' rows that no other session has locked, and keep them locked until you commit confirmation that the work is done. It even works with two-phase transactions for when external co-ordination is required.
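A minimal sketch of that pattern follows. The table `job_queue`, its columns `id` and `payload`, and the function `process_batch` are assumptions made up for illustration; `conn` is expected to be a DB-API connection to PostgreSQL 9.5 or later whose driver uses the `%s` placeholder style (psycopg2 does). Each worker claims up to `batch_size` rows that nobody else has locked, handles them, deletes them, and commits.

```python
# Hypothetical table: job_queue(id serial primary key, payload text)
CLAIM_SQL = """
SELECT id, payload
FROM job_queue
ORDER BY id
LIMIT %s
FOR UPDATE SKIP LOCKED
"""

DELETE_SQL = "DELETE FROM job_queue WHERE id = %s"


def process_batch(conn, handler, batch_size=10):
    """Claim up to batch_size unlocked rows, handle them, then commit.

    Rows stay locked until commit(), so other workers running the same
    query skip them instead of blocking on them.
    """
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL, (batch_size,))
        jobs = cur.fetchall()
        for job_id, payload in jobs:
            handler(payload)
            cur.execute(DELETE_SQL, (job_id,))
        conn.commit()       # releases the row locks; work is confirmed
        return len(jobs)
    except Exception:
        conn.rollback()     # rows become claimable by other workers again
        raise
    finally:
        cur.close()
```

If the handler raises, the rollback leaves the rows in the table, so another worker can pick them up later; whether that retry behaviour is wanted depends on the job.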
External queueing systems remain useful, providing canned functionality, proven performance, integration with other systems, options for horizontal scaling and federation, etc. Nonetheless, for simple cases you don't really need them anymore.
Older versions
You don't need such tools, but using one may make life easier. Doing queueing in the database looks easy, but you'll discover in practice that high-performance, reliable concurrent queuing is really hard to do right in a relational database.
That's why tools like PGQ exist.
You can get rid of polling in PostgreSQL by using LISTEN and NOTIFY, but that won't solve the problem of reliably handing out entries off the top of the queue to exactly one consumer while preserving highly concurrent operation and not blocking inserts. All the simple and obvious solutions you think will solve that problem actually don't in the real world, and tend to degenerate into less efficient versions of single-worker queue fetching.
If you don't need highly concurrent multi-worker queue fetches then using a single queue table in PostgreSQL is entirely reasonable.
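For that single-worker case, a queue table can be as small as the sketch below. It uses Python's built-in `sqlite3` with an in-memory database purely so the example runs anywhere; SQLite is standing in for PostgreSQL here, and the table name `task_queue` and the helpers `enqueue` and `dequeue` are invented for illustration.

```python
import sqlite3

# Assumption: SQLite in memory stands in for a PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_queue ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT NOT NULL)"
)


def enqueue(payload):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO task_queue (payload) VALUES (?)", (payload,)
        )


def dequeue():
    """Remove and return the oldest task, or None if the queue is empty.

    Safe only with ONE worker: two workers could both read the same row
    before either of them deletes it.
    """
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM task_queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM task_queue WHERE id = ?", (row[0],))
        return row[1]


enqueue("first")
enqueue("second")
print(dequeue(), dequeue(), dequeue())
```

The read-then-delete in `dequeue` is exactly the step that stops being safe once a second worker is added, which is where SKIP LOCKED or a dedicated broker comes in.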