database mongodb 在 CAP 定理中处于什么位置?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/11292215/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-08 08:36:47  来源:igfitidea点击:

Where does mongodb stand in the CAP theorem?

mongodbcap-theoremdatabasenosql

提问by Gluz

Everywhere I look, I see that MongoDB is CP. But when I dig in I see it is eventually consistent. Is it CP when you use safe=true? If so, does that mean that when I write with safe=true, all replicas will be updated before getting the result?

无论我看哪里,我都看到 MongoDB 是 CP。但是当我深入研究时,我发现它最终是一致的。使用safe=true时是CP吗?如果是这样,那是否意味着当我使用 safe=true 写入时,所有副本都会在获得结果之前更新?

采纳答案by stbrody

MongoDB is strongly consistent by default - if you do a write and then do a read, assuming the write was successful you will always be able to read the result of the write you just read. This is because MongoDB is a single-master system and all reads go to the primary by default. If you optionally enable reading from the secondaries then MongoDB becomes eventually consistent where it's possible to read out-of-date results.

默认情况下,MongoDB 是强一致性的——如果你先写然后读,假设写成功,你将始终能够读取你刚读的写的结果。这是因为 MongoDB 是一个单主系统,默认情况下所有读取都转到主系统。如果您有选择地启用从辅助节点读取,那么 MongoDB 最终会变得一致,可以读取过时的结果。

MongoDB also gets high-availability through automatic failover in replica sets: http://www.mongodb.org/display/DOCS/Replica+Sets

MongoDB 还通过副本集中的自动故障转移获得高可用性:http: //www.mongodb.org/display/DOCS/Replica+Sets

回答by JoCa

I agree with Luccas post. You can't just say that MongoDB is CP/AP/CA, because it actually is a trade-off between C, A and P, depending on both database/driver configuration and type of disaster: here's a visual recap, and below a more detailed explanation.

我同意卢卡斯的帖子。你不能只说 MongoDB 是 CP/AP/CA,因为它实际上是C、A 和 P 之间权衡,这取决于数据库/驱动程序配置和灾难类型:这是一个视觉回顾,下面是更详细的解释。

    Scenario                   | Main Focus | Description
    ---------------------------|------------|------------------------------------
    No partition               |     CA     | The system is available 
                               |            | and provides strong consistency
    ---------------------------|------------|------------------------------------
    partition,                 |     AP     | Not synchronized writes 
    majority connected         |            | from the old primary are ignored                
    ---------------------------|------------|------------------------------------
    partition,                 |     CP     | only read access is provided
    majority not connected     |            | to avoid separated and inconsistent systems

Consistency:

一致性:

MongoDB is strongly consistent when you use a single connection or the correct Write/Read Concern Level(Which will cost you execution speed). As soon as you don't meet those conditions (especially when you are reading from a secondary-replica) MongoDB becomes Eventually Consistent.

当您使用单个连接或正确的写入/读取关注级别这会降低执行速度)时,MongoDB 是高度一致的。一旦您不满足这些条件(特别是当您从辅助副本读取时),MongoDB 就会最终变得一致。

Availability:

可用性:

MongoDB gets high availability through Replica-Sets. As soon as the primary goes down or gets unavailable else, then the secondaries will determine a new primary to become available again. There is an disadvantage to this: Every write that was performed by the old primary, but not synchronized to the secondaries will be rolled backand saved to a rollback-file, as soon as it reconnects to the set(the old primary is a secondary now). So in this case some consistency is sacrificed for the sake of availability.

MongoDB 通过Replica-Sets获得高可用性。一旦主节点出现故障或变得不可用,则辅助节点将确定一个新的主节点再次可用。这样做有一个缺点:每次由旧主执行但未同步到辅助的写入将回滚并保存到回滚文件,只要它重新连接到集合(旧主是辅助现在)。所以在这种情况下,为了可用性而牺牲了一些一致性。

Partition Tolerance:

分区容差:

Through the use of said Replica-Sets MongoDB also achieves the partition tolerance: As long as more than half of the servers of a Replica-Set is connected to each other, a new primary can be chosen. Why? To ensure two separated networks can not both choose a new primary. When not enough secondaries are connected to each other you can still read from them (but consistency is not ensured), but not write. The set is practically unavailable for the sake of consistency.

通过使用上述 Replica-Sets MongoDB 还实现了分区容错性:只要一个 Replica-Set 的一半以上的服务器相互连接,就可以选择一个新的主服务器。为什么?保证两个分离的网络不能同时选择一个新的primary。当没有足够的辅助节点相互连接时,您仍然可以从它们读取(但不能确保一致性),但不能写入。为了一致性,该集合实际上不可用。

回答by Luccas

As a brilliant new articleshowed up and also some awesome experiments by Kylein this field, you should be careful when labeling MongoDB, and other databases, as C or A.

随着一篇出色的新文章的出现以及Kyle在该领域的一些很棒的实验,您在将 MongoDB 和其他数据库标记为 C 或 A 时应该小心。

Of course CAP helps to track down without much words what the database prevails about it, but people often forget that C in CAP means atomic consistency (linearizability), for example. And this caused me lots of pain to understand when trying to classify. So, besides MongoDB give strong consistency, that doesn't mean that is C. In this way, if one make this classifications, I recommended to also give more depth in how it actually works to not leave doubts.

当然,CAP 有助于在不多言的情况下追踪数据库的优势,但人们经常忘记,例如,CAP 中的 C 意味着原子一致性(线性化)。这让我在尝试分类时很难理解。所以,除了MongoDB给出了强一致性之外,这并不意味着它就是C。这样,如果进行这种分类,我建议也更深入地了解它的实际运作方式,以免留下疑问。

回答by Jan Prieser

Yes, it is CP when using safe=true. This simply means, the data made it to the masters disk. If you want to make sure it also arrived on some replica, look into the 'w=N' parameter where N is the number of replicas the data has to be saved on.

是的,使用时是CP safe=true。这只是意味着,数据进入了主磁盘。如果您想确保它也到达某个副本,请查看 'w=N' 参数,其中 N 是必须保存数据的副本数。

see thisand thisfor more information.

有关更多信息,请参阅

回答by kubal5003

I'm not sure about P for Mongo. Imagine situation:

我不确定 Mongo 的 P。设想情况:

  • Your replica gets split into two partitions.
  • Writes continue to both sides as new masters were elected
  • Partition is resolved - all servers are now connected again
  • What happens is that new master is elected - the one that has highest oplog, but the data from the other master gets reverted to the common state before partition and it is dumped to a file for manual recovery
  • all secondaries catch up with the new master
  • 您的副本被分成两个分区。
  • Writes continue to both sides as new masters were elected
  • 分区已解决 - 所有服务器现在再次连接
  • 发生的事情是选择了新的 master - 具有最高 oplog 的那个,但是来自另一个 master 的数据恢复到分区前的公共状态,并将其转储到文件中以进行手动恢复
  • 所有次要赶上新的主人

The problem here is that the dump file size is limited and if you had a partition for a long time you can loose your data forever.

这里的问题是转储文件的大小是有限的,如果你有一个分区很长一段时间,你可能会永远丢失你的数据。

You can say that it's unlikely to happen - yes, unless in the cloud where it is more common than one may think.

您可以说这不太可能发生 - 是的,除非在云中它比人们想象的更常见。

This example is why I would be very careful before assigning any letter to any database. There's so many scenarios and implementations are not perfect.

这个例子就是为什么在将任何字母分配给任何数据库之前我会非常小心的原因。场景太多,实现并不完美。

If anyone knows if this scenario has been addressed in later releases of Mongo please comment! (I haven't been following everything that was happening for some time..)

如果有人知道这种情况是否已在 Mongo 的后续版本中得到解决,请发表评论!(我已经有一段时间没有关注正在发生的一切了..)

回答by sn.anurag

Mongodb never allows write to secondary. It allows optional reads from secondary but not writes. So if your primary goes down, you can't write till a secondary becomes primary again. That is how, you sacrifice High Availability in CAP theorem. By keeping your reads only from primary you can have strong consistency.

Mongodb 从不允许写入次要。它允许从辅助读取但不允许写入。因此,如果您的主要设备出现故障,则在次要设备再次成为主要设备之前,您将无法写入。这就是你在 CAP 定理中牺牲高可用性的方式。通过保持只从主要读取,您可以获得很强的一致性。

回答by Rajneesh Prakash

MongoDB selects Consistency over Availability whenever there is a Partition. What it means is that when there's a partition(P) it chooses Consistency(C) over Availability(A).

只要有分区,MongoDB 就会选择一致性而不是可用性。这意味着当有一个分区(P)时,它选择一致性(C)而不是可用性(A)。

To understand this, Let's understand how MongoDB does replica set works. A Replica Set has a single Primary node. The only "safe" way to commit data is to write to that node and then wait for that data to commit to a majority of nodes in the set. (you will see that flag for w=majority when sending writes)

为了理解这一点,让我们了解 MongoDB 是如何工作的。副本集有一个主节点。提交数据的唯一“安全”方法是写入该节点,然后等待该数据提交到集合中的大多数节点。(发送写入时,您将看到 w=majority 的标志)

Partition can occur in two scenarios as follows :

分区可以在以下两种情况下发生:

  • When Primary node goes down: system becomes unavailable until a new primary is selected.
  • When Primary node looses connection from too many Secondary nodes: system becomes unavailable. Other secondaries will try to elect a new Primary and current primary will step down.
  • 当主节点宕机时:系统变得不可用,直到选择新的主节点。
  • 当 Primary 与太多 Secondary 节点失去连接时:系统变得不可用。其他次要节点将尝试选举一个新的主要节点,当前的主要节点将下台。

Basically, whenever a partition happens and MongoDB needs to decide what to do, it will choose Consistency over Availability. It will stop accepting writes to the system until it believes that it can safely complete those writes.

基本上,每当发生分区并且 MongoDB 需要决定做什么时,它都会选择一致性而不是可用性。它将停止接受对系统的写入,直到它相信它可以安全地完成这些写入。