Original question: http://stackoverflow.com/questions/7149890/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
What did MongoDB not being ACID compliant before v4 really mean?
Asked by Lance Pollard
I am not a database expert and have no formal computer science background, so bear with me. I want to know the kinds of real-world negative things that can happen if you use an old MongoDB version prior to v4, which was not ACID compliant. This applies to any ACID-noncompliant database.
I understand that MongoDB can perform Atomic Operations, but that they don't "support traditional locking and complex transactions", mostly for performance reasons. I also understand the importance of database transactions, and the example of when your database is for a bank, and you're updating several records that all need to be in sync, you want the transaction to revert back to the initial state if there's a power outage so credit equals purchase, etc.
But when I get into conversations about MongoDB, those of us that don't know the technical details of how databases are actually implemented start throwing around statements like:
MongoDB is way faster than MySQL and Postgres, but there's a tiny chance, like 1 in a million, that it "won't save correctly".
That "won't save correctly" part is referring to this understanding: If there's a power outage right at the instant you're writing to MongoDB, there's a chance for a particular record (say you're tracking pageviews in documents with 10 attributes each), that one of the documents only saved 5 of the attributes… which means over time your pageview counters are going to be "slightly" off. You'll never know by how much, you know they'll be 99.999% correct, but not 100%. This is because, unless you specifically made this a mongodb atomic operation, the operation is not guaranteed to have been atomic.
So my question is, what is the correct interpretation of when and why MongoDB may not "save correctly"? What parts of ACID does it not satisfy, and under what circumstances, and how do you know when that 0.001% of your data is off? Can't this be fixed somehow? If not, this seems to mean that you shouldn't store things like your users table in MongoDB, because a record might not save. But then again, that 1/1,000,000 user might just need to "try signing up again", no?
I am just looking for maybe a list of when/why negative things happen with an ACID noncompliant database like MongoDB, and ideally if there's a standard workaround (like run a background job to cleanup data, or only use SQL for this, etc.).
Accepted answer by Bryan Migliorisi
One thing you lose with MongoDB is multi-collection (table) transactions. Atomic modifiers in MongoDB can only work against a single document.
If you need to remove an item from inventory and add it to someone's order at the same time - you can't. Unless those two things - inventory and orders - exist in the same document (which they probably do not).
I encountered this very same issue in an application I am working on and had two possible solutions to choose from:
1) Structure your documents as best you can and use atomic modifiers as best you can and for the remaining bit, use a background process to cleanup records that may be out of sync. For example, I remove items from inventory and add them to a reservedInventory array of the same document using atomic modifiers.
This lets me always know that items are NOT available in the inventory (because they are reserved by a customer). When the customer checks out, I then remove the items from the reservedInventory. It's not a standard transaction, and since the customer could abandon the cart, I need some background process to go through and find abandoned carts and move the reserved inventory back into the available inventory pool.
This is obviously less than ideal, but it's the only part of a large application where mongodb does not fit the need perfectly. Plus, it has worked flawlessly thus far. This may not be possible for many scenarios, but because of the document structure I am using, it fits well.
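As a rough illustration of the reservation idea above, here is a minimal Python sketch that only builds the filter and update documents for a single-document reservation. The `reservedInventory` array comes from the answer; the `available` counter, the `cartId` field, and both helper names are assumptions made up for this example, not the author's actual schema.

```python
from typing import Any, Dict, Tuple


def build_reserve_update(sku: str, cart_id: str, qty: int) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (filter, update) documents that reserve qty items in one document."""
    if qty <= 0:
        raise ValueError("qty must be positive")
    # Match the product document only if enough stock is still available.
    query = {"_id": sku, "available": {"$gte": qty}}
    update = {
        # Decrement the (hypothetical) available counter ...
        "$inc": {"available": -qty},
        # ... and record the reservation in the same document, in the same update.
        "$push": {"reservedInventory": {"cartId": cart_id, "qty": qty}},
    }
    return query, update


def build_release_update(sku: str, cart_id: str, qty: int) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (filter, update) documents that undo a reservation for an abandoned cart."""
    # Match only if this exact reservation is still present, so a repeated
    # cleanup run does not add the stock back twice.
    query = {"_id": sku, "reservedInventory": {"$elemMatch": {"cartId": cart_id, "qty": qty}}}
    update = {
        "$inc": {"available": qty},
        "$pull": {"reservedInventory": {"cartId": cart_id}},
    }
    return query, update
```

With a driver such as PyMongo, the pair would be passed to something like `products.update_one(query, update)` (where `products` is a hypothetical collection), and a `modified_count` of 0 would mean the stock was not available. Because both changes target one document, they are applied together or not at all.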
2) Use a transactional database in conjunction with MongoDB. It is common to use MySQL to provide transactions for the things that absolutely need them while letting MongoDB (or any other NoSQL) do what it does best.
If my solution from #1 does not work in the long run, I will investigate further into combining MongoDB with MySQL but for now #1 suits my needs well.
Answer by William Z
It's actually not correct that MongoDB is not ACID-compliant. On the contrary, MongoDB is ACID-compliant at the document level.
Any update to a single document is:
- Atomic: it either fully completes or it does not
- Consistent: no reader will see a "partially applied" update
- Isolated: again, no reader will see a "dirty" read
- Durable: (with the appropriate write concern)
What MongoDB doesn't have is transactions-- that is, multiple-document updates that can be rolled back and are ACID-compliant.
Note that you can build transactions on top of the ACID-compliant updates to a single document, by using two-phase commit.
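To make the two-phase commit idea a bit more concrete, here is a small in-memory Python sketch. It is not MongoDB code: plain dicts stand in for collections, and each helper stands in for one atomic single-document update. The account layout, the `pending` list, and the state names are assumptions chosen for illustration.

```python
from typing import Any, Dict

# Stand-ins for two collections; every helper below touches exactly one document.
accounts: Dict[str, Dict[str, Any]] = {
    "A": {"balance": 100, "pending": []},
    "B": {"balance": 50, "pending": []},
}
transactions: Dict[str, Dict[str, Any]] = {}


def apply_once(account_id: str, txn_id: str, delta: int) -> None:
    """Single-document update: apply delta unless this txn was already applied."""
    doc = accounts[account_id]
    if txn_id not in doc["pending"]:
        doc["balance"] += delta
        doc["pending"].append(txn_id)


def clear_pending(account_id: str, txn_id: str) -> None:
    """Single-document update: drop the marker left by apply_once."""
    doc = accounts[account_id]
    if txn_id in doc["pending"]:
        doc["pending"].remove(txn_id)


def resume(txn_id: str) -> None:
    """Drive a transfer to 'done' from whatever state it was left in."""
    txn = transactions[txn_id]
    if txn["state"] == "initial":
        txn["state"] = "pending"
    if txn["state"] == "pending":
        # Phase 1: apply to both accounts; the markers make each step idempotent.
        apply_once(txn["source"], txn_id, -txn["amount"])
        apply_once(txn["dest"], txn_id, txn["amount"])
        txn["state"] = "applied"
    if txn["state"] == "applied":
        # Phase 2: remove the markers and close the transaction.
        clear_pending(txn["source"], txn_id)
        clear_pending(txn["dest"], txn_id)
        txn["state"] = "done"


def transfer(txn_id: str, source: str, dest: str, amount: int) -> None:
    """Record the intent first, then run the state machine."""
    transactions[txn_id] = {"source": source, "dest": dest, "amount": amount, "state": "initial"}
    resume(txn_id)
```

Calling `transfer("t1", "A", "B", 30)` moves the money, and a recovery job can call `resume` on any transaction that is not yet `done` after a crash. Note that this gives all-or-nothing behaviour eventually, but not isolation: a reader can still observe the state between the two account updates.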
Answer by duffymo
A good explanation is contained in "Starbucks Does Not Use Two Phase Commit".
It's not about NoSQL databases, but it does illustrate the point that sometimes you can afford to lose a transaction or have your database in an inconsistent state temporarily.
I wouldn't consider it to be something that needs to be "fixed". The fix is to use an ACID-compliant relational database. You choose a NoSQL alternative when its behavior meets your application requirements.
Answer by SubGate
I think other people gave good answers already. However, I would like to add that there are ACID NoSQL DBs (like http://ravendb.net/). So the decision is not only "NoSQL without ACID" vs. "relational with ACID".
Answer by Sergey
"won't save correctly" could mean:
- By default MongoDB does not save your changes to the drive immediately. So there is a possibility that you tell a user "update is successful", a power outage happens, and the update is lost. MongoDB provides options to control the level of update "durability". It can wait for the other replica(s) to receive this update (in memory), wait for the write to reach the local journal file, etc.
- There are no easy "atomic" updates to multiple collections, or even to multiple documents in the same collection. It's not a problem in most cases because it can be circumvented with Two Phase Commit, or by restructuring your schema so updates are made to a single document. See this question: Document Databases: Redundant data, references, etc. (MongoDB specifically)
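One way to ask for the stronger durability mentioned in the first point is through the write concern options of the connection string. The sketch below only assembles such a URI; the `w`, `journal`, and `wtimeoutMS` option names are standard connection string options, while the helper name, host names, and database name are placeholders invented for this example.

```python
from urllib.parse import urlencode


def uri_with_write_concern(hosts: str, database: str, w: str = "majority",
                           journal: bool = True, wtimeout_ms: int = 5000) -> str:
    """Build a MongoDB connection URI that requests a given write concern."""
    options = {
        # How many members must acknowledge the write ("majority" or a number).
        "w": w,
        # Wait until the write has reached the on-disk journal.
        "journal": "true" if journal else "false",
        # Give up waiting for acknowledgement after this many milliseconds.
        "wtimeoutMS": wtimeout_ms,
    }
    return f"mongodb://{hosts}/{database}?{urlencode(options)}"


# Hypothetical hosts and database, for illustration only.
uri = uri_with_write_concern("db1.example.net:27017,db2.example.net:27017", "shop")
```

Stricter settings trade latency for safety: the client is told "success" only after the requested acknowledgements arrive, which narrows the window in which an acknowledged update can be lost.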
Answer by Grigori Melnik
As of MongoDB v4.0, multi-document ACID transactions are supported. Through snapshot isolation, transactions provide a globally consistent view of data and enforce all-or-nothing execution to maintain data integrity.
They feel like transactions from the relational world, e.g.:
with client.start_session() as s:
    s.start_transaction()
    try:
        collection.insert_one(doc1, session=s)
        collection.insert_one(doc2, session=s)
        s.commit_transaction()
    except Exception:
        s.abort_transaction()
See https://www.mongodb.com/blog/post/multi-document-transactions-in-mongodb
Answer by Ely
Please read about the ACID properties to gain a better understanding.
Also in the MongoDB documentation you can find a question and answer.
MongoDB is not ACID compliant. Read below for a discussion of the ACID compliance.
- MongoDB is Atomic on document level only. It does not comply with the definition of atomic that we know from relational database systems, in particular the link above. In this sense MongoDB does not comply with the A from ACID.
- MongoDB is Consistent by default. However, you can read from secondary servers in a replica set. You can only have eventual consistency in this case. This is useful if you don't mind reading slightly outdated data.
- MongoDB does not guarantee Isolation (again according to the above definition):
  - For systems with multiple concurrent readers and writers, MongoDB will allow clients to read the results of a write operation before the write operation returns.
  - If the mongod terminates before the journal commits, even if a write returns successfully, queries may have read data that will not exist after the mongod restarts.
  However, MongoDB modifies each document in isolation (for inserts and updates); on document level only, not on multi-document transactions.
- In regards to Durability: you can configure this behaviour with the write concern option, though I am not sure. Maybe someone knows better.
I believe some research is ongoing to move NoSQL towards ACID constraints or similar. This is a challenge because NoSQL databases are usually fast(er) and ACID constraints can slow down performance significantly.
Answer by joeshmoe
The only reason atomic modifiers work against a single collection is that the mongodb developers recently exchanged the database lock for a collection-wide write lock, deciding that the increased concurrency here was worth the trade-off. At its core, mongodb is a memory-mapped file: they've delegated the buffer-pool management to the machine's vm subsystem. Because it's always in memory, they're able to get away with very coarse-grained locks: you'll be performing in-memory-only operations while holding the lock, which will be extremely fast. This differs significantly from a traditional database system, which is sometimes forced to perform I/O while holding a page lock or a row lock.
Answer by Mysterious25K
"In MongoDB, an operation on a single document is atomic" - that's a thing of the past.
In the new version, MongoDB 4.0, you CAN:
However, for situations that require atomicity for updates to multiple documents or consistency between reads to multiple documents, MongoDB provides the ability to perform multi-document transactions against replica sets. Multi-document transactions can be used across multiple operations, collections, databases, and documents. Multi-document transactions provide an “all-or-nothing” proposition. When a transaction commits, all data changes made in the transaction are saved. If any operation in the transaction fails, the transaction aborts and all data changes made in the transaction are discarded without ever becoming visible. Until a transaction commits, no write operations in the transaction are visible outside the transaction.
There are, though, a few limitations on how and what operations can be performed.
Check the Mongo Doc. https://docs.mongodb.com/master/core/transactions/
Answer by rystsov
You can implement atomic multi-key updates (serializable transactions) on the client side if your storage supports per-key linearizability and compare-and-set (which is true for MongoDB). This approach is used in Google's Percolator and in CockroachDB, but nothing prevents you from using it with MongoDB.
I've created a step-by-step visualization of such transactions. I hope it will help you to understand them.
If you're fine with the read committed isolation level, then it makes sense to take a look at RAMP transactions by Peter Bailis. They can also be implemented for MongoDB on the client side.
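The compare-and-set primitive that this approach builds on can be sketched in a few lines of Python. This is an in-memory stand-in, not a Percolator or MongoDB implementation: the `version` field, the `store` dict, and both function names are assumptions for illustration. With MongoDB, the same check would presumably be expressed as an update whose filter includes the expected version.

```python
from typing import Any, Callable, Dict

# One "document" per key, each carrying a version number for optimistic checks.
store: Dict[str, Dict[str, Any]] = {"counter": {"version": 0, "value": 0}}


def compare_and_set(key: str, expected_version: int, new_value: Any) -> bool:
    """Write new_value only if nobody changed the key since it was read."""
    doc = store[key]
    if doc["version"] != expected_version:
        return False  # Lost the race; the caller should re-read and retry.
    store[key] = {"version": expected_version + 1, "value": new_value}
    return True


def update_with_retry(key: str, fn: Callable[[Any], Any], max_attempts: int = 10) -> Any:
    """Read, compute, and compare-and-set in a loop until the write succeeds."""
    for _ in range(max_attempts):
        snapshot = store[key]
        new_value = fn(snapshot["value"])
        if compare_and_set(key, snapshot["version"], new_value):
            return new_value
    raise RuntimeError(f"gave up updating {key!r} after {max_attempts} attempts")
```

For example, `update_with_retry("counter", lambda v: v + 1)` increments the counter without a lock; a writer holding a stale version gets `False` back instead of silently overwriting newer data. Multi-key transactions layer extra bookkeeping on top of this single-key guarantee.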