Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/6635718/

Date: 2020-09-09 12:09:24  Source: igfitidea

How to work around the lack of transactions in MongoDB?

mongodb transactions

Asked by NagyI

I know there are similar questions here, but they either tell me to switch back to regular RDBMS systems if I need transactions, or to use atomic operations or two-phase commit. The second solution seems the best choice. The third I don't wish to follow because it seems that many things could go wrong and I can't test it in every aspect. I'm having a hard time refactoring my project to perform atomic operations. I don't know whether this comes from my limited viewpoint (I have only worked with SQL databases so far), or whether it actually can't be done.

We would like to pilot test MongoDB at our company. We have chosen a relatively simple project - an SMS gateway. It allows our software to send SMS messages to the cellular network and the gateway does the dirty work: actually communicating with the providers via different communication protocols. The gateway also manages the billing of the messages. Every customer who applies for the service has to buy some credits. The system automatically decreases the user's balance when a message is sent and denies the access if the balance is insufficient. Also because we are customers of third party SMS providers, we may also have our own balances with them. We have to keep track of those as well.

I started thinking about how I can store the required data with MongoDB if I cut down some complexity (external billing, queued SMS sending). Coming from the SQL world, I would create a separate table for users, another one for SMS messages, and one for storing the transactions regarding the users' balance. Let's say I create separate collections for all of those in MongoDB.

Imagine an SMS sending task with the following steps in this simplified system:

  1. check if the user has sufficient balance; deny access if there's not enough credit

  2. send and store the message in the SMS collection with the details and cost (in the live system the message would have a status attribute and a task would pick it up for delivery and set the price of the SMS according to its current state)

  3. decrease the user's balance by the cost of the sent message

  4. log the transaction in the transaction collection

Now what's the problem with that? MongoDB can do atomic updates only on one document. In the previous flow it could happen that some kind of error creeps in and the message gets stored in the database but the user's balance is not updated and/or the transaction is not logged.

I came up with two ideas:

  • Create a single collection for the users, and store the balance as a field, user related transactions and messages as sub documents in the user's document. Because we can update documents atomically, this actually solves the transaction problem. Disadvantages: if the user sends many SMS messages, the size of the document could become large and the 4MB document limit could be reached. Maybe I can create history documents in such scenarios, but I don't think this would be a good idea. Also I don't know how fast the system would be if I push more and more data to the same big document.

  • Create one collection for users, and one for transactions. There can be two kinds of transactions: credit purchase with a positive balance change and messages sent with a negative balance change. A transaction may have a subdocument; for example, in messages sent the details of the SMS can be embedded in the transaction. Disadvantages: I don't store the current user balance, so I have to calculate it every time a user tries to send a message to tell if the message could go through or not. I'm afraid this calculation can become slow as the number of stored transactions grows.
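
For the first idea, here is a minimal sketch of what the single-document update might look like. The helper name and the field names (`balance`, `messages`, `transactions`) are assumptions made for illustration and are not from the question. The returned objects are meant to be passed to an update call such as `updateOne(filter, update)`. Putting the balance check into the filter means the check and the debit happen in one atomic step on one document.

```javascript
// Hypothetical helper: builds the filter and update for "send one SMS"
// against a single user document. All field names are assumptions.
function buildSendSmsUpdate(userId, sms, cost) {
  return {
    // Match the user only if the balance covers the cost.
    filter: { _id: userId, balance: { $gte: cost } },
    update: {
      // Debit the balance and record the message and the ledger entry
      // inside the same document, so all three change together.
      $inc: { balance: -cost },
      $push: {
        messages: sms,
        transactions: { type: "message_sent", amount: -cost },
      },
    },
  };
}

const { filter, update } = buildSendSmsUpdate("user1", { text: "hi" }, 2);
console.log(JSON.stringify(filter));
console.log(JSON.stringify(update));
```

If the update reports that no document matched, the balance was insufficient and nothing was written. The document size concern raised above still applies, since `messages` and `transactions` grow without bound in this layout.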

I'm a little bit confused about which method to pick. Are there other solutions? I couldn't find any best practices online about how to work around these kinds of problems. I guess many programmers who are trying to become familiar with the NoSQL world are facing similar problems in the beginning.

Accepted answer by Grigori Melnik

As of 4.0, MongoDB will have multi-document ACID transactions. The plan is to enable those in replica set deployments first, followed by sharded clusters. Transactions in MongoDB will feel just like the transactions developers are familiar with from relational databases: they'll be multi-statement, with similar semantics and syntax (like start_transaction and commit_transaction). Importantly, the changes to MongoDB that enable transactions do not impact performance for workloads that do not require them.

For more details see here.

Having distributed transactions doesn't mean that you should model your data as in tabular relational databases. Embrace the power of the document model and follow the good and recommended practices of data modeling.

Answer by xameeramir

Living Without Transactions

Transactions support ACID properties, but although there are no transactions in MongoDB, we do have atomic operations. Atomic operations mean that when you work on a single document, that work will be completed before anyone else sees the document. They'll see all the changes we made or none of them. Using atomic operations, you can often accomplish the same thing we would have accomplished using transactions in a relational database. The reason is that, in a relational database, we need to make changes across multiple tables, usually tables that need to be joined, and so we want to do that all at once. To do it, since there are multiple tables, we have to begin a transaction, do all those updates, and then end the transaction. But with MongoDB we're going to embed the data, since we're going to pre-join it in documents, and they're these rich documents that have hierarchy. We can often accomplish the same thing. For instance, in the blog example, if we wanted to make sure that we updated a blog post atomically, we can do that because we can update the entire blog post at once. Whereas if it were a bunch of relational tables, we'd probably have to open a transaction so that we can update the posts table and the comments table.

So what approaches can we take in MongoDB to overcome the lack of transactions?

  • restructure - restructure the code, so that we're working within a single document and taking advantage of the atomic operations that we offer within that document. If we do that, then usually we're all set.
  • implement in software - we can implement locking in software by creating a critical section. We can build a test-and-set using findAndModify. We can build semaphores, if needed. And in a way, that is how the larger world works anyway. If we think about it, if one bank needs to transfer money to another bank, they're not living in the same relational system. They each often have their own relational databases. And they have to be able to coordinate that operation even though we cannot begin and end a transaction across those database systems, only within one system within one bank. So there are certainly ways in software to get around the problem.
  • tolerate - the final approach, which often works in modern web apps and other applications that take in a tremendous amount of data, is to just tolerate a bit of inconsistency. For example, if we're talking about a friend feed in Facebook, it doesn't matter if everybody sees your wall update simultaneously. It's okay if one person is a few beats behind for a few seconds and then catches up. It often isn't critical in a lot of system designs that everything be kept perfectly consistent and that everyone have exactly the same view of the database. So we could simply tolerate a little bit of inconsistency that's somewhat temporary.
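
The "implement in software" approach above can be sketched as a test-and-set on a lock document. This is only an illustration under assumptions: the lock document shape (`{ _id, owner }`) and both helper names are hypothetical, and the returned objects are intended for a call such as findAndModify or `findOneAndUpdate(filter, update)`.

```javascript
// Hypothetical lock document: { _id: <resource name>, owner: null | <string> }.

// Acquire: the filter matches only while nobody owns the lock, so the
// test (owner is null) and the set (owner = me) happen in one atomic step.
function buildAcquireLock(resource, owner) {
  return {
    filter: { _id: resource, owner: null },
    update: { $set: { owner: owner } },
  };
}

// Release: only the current owner may clear the lock.
function buildReleaseLock(resource, owner) {
  return {
    filter: { _id: resource, owner: owner },
    update: { $set: { owner: null } },
  };
}

console.log(JSON.stringify(buildAcquireLock("billing:user1", "worker-7")));
console.log(JSON.stringify(buildReleaseLock("billing:user1", "worker-7")));
```

If the acquire call returns no document, someone else holds the lock and the caller should wait or retry. A holder that crashes never releases, so a real implementation would also need some form of expiry, which this sketch leaves out.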

Update, findAndModify, $addToSet (within an update) and $push (within an update) operations operate atomically within a single document.

Answer by Giovanni Bitliner

Check this out, by Tokutek. They develop a plugin for Mongo that promises not only transactions but also a boost in performance.

Answer by Andreas Jung

To the point: if transactional integrity is a must, then don't use MongoDB; use only components in the system that support transactions. It is extremely hard to build something on top of a component in order to provide ACID-like functionality for non-ACID-compliant components. Depending on the individual use cases, it may make sense to separate actions into transactional and non-transactional actions in some way...

Answer by pingw33n

Now what's the problem with that? MongoDB can do atomic updates only on one document. In the previous flow it could happen that some kind of error creeps in and the message gets stored in the database but the user's balance is not reduced and/or the transaction is not logged.

This is not really a problem. The error you mentioned is either a logical error (a bug) or an IO error (network, disk failure). This kind of error can leave both transactionless and transactional stores in an inconsistent state. For example, if the SMS has already been sent but an error occurs while storing the message, the sending can't be rolled back, which means it won't be logged, the user's balance won't be reduced, etc.

The real problem here is that the user can take advantage of a race condition and send more messages than his balance allows. This also applies to an RDBMS, unless you do the SMS sending inside a transaction with the balance field locked (which would be a great bottleneck). A possible solution for MongoDB would be to use findAndModify first to reduce the balance and check it: if it's negative, disallow sending and refund the amount (atomic increment). Otherwise, continue sending, and in case sending fails, refund the amount. A balance history collection can also be maintained to help fix/verify the balance field.
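
The reserve-then-refund flow described above can be sketched as follows. This is a minimal illustration under assumptions: `FakeUserStore` is a hypothetical in-memory stand-in, and its `incBalance()` plays the role of a findAndModify with `$inc` that returns the updated balance. A real store would perform that step atomically on the server.

```javascript
// In-memory stand-in for a users collection (an assumption for illustration).
class FakeUserStore {
  constructor(balances) {
    this.balances = new Map(Object.entries(balances));
  }
  // Adds delta to the balance and returns the new value, like an atomic
  // increment that hands back the updated document.
  incBalance(userId, delta) {
    const updated = this.balances.get(userId) + delta;
    this.balances.set(userId, updated);
    return updated;
  }
}

// Reserve the credit first, send second, refund on any failure.
function sendWithReservation(store, userId, cost, sendSms) {
  const after = store.incBalance(userId, -cost);
  if (after < 0) {
    store.incBalance(userId, cost); // refund: balance was insufficient
    return "denied";
  }
  try {
    sendSms();
    return "sent";
  } catch (err) {
    store.incBalance(userId, cost); // refund: delivery failed
    return "failed";
  }
}

const store = new FakeUserStore({ alice: 5 });
console.log(sendWithReservation(store, "alice", 3, () => {})); // sent
console.log(sendWithReservation(store, "alice", 3, () => {})); // denied
console.log(store.balances.get("alice")); // 2
```

Because the debit happens before the check, two concurrent requests cannot both pass on the same credit; at worst the balance dips below zero briefly until the refund lands.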

Answer by Vaibhav

This is probably the best blog post I found on implementing transaction-like features for MongoDB!

Syncing Flag: best for just copying data over from a master document

Job Queue: very general purpose, solves 95% of cases. Most systems need to have at least one job queue around anyway!

Two Phase Commit: this technique ensures that each entity always has all the information needed to get to a consistent state

Log Reconciliation: the most robust technique, ideal for financial systems

Versioning: provides isolation and supports complex structures
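
As a rough illustration of the Log Reconciliation idea from the list above, the sketch below recomputes each balance from a transaction log and reports the users whose stored balance disagrees. The data shapes and the function name are assumptions made for this example and are not taken from the linked article. Amounts are whole credits, which avoids floating-point rounding.

```javascript
// Hypothetical shapes: users = [{ _id, balance }], log = [{ userId, amount }].
function findBalanceMismatches(users, log) {
  // Sum the logged amounts per user.
  const expected = new Map();
  for (const entry of log) {
    expected.set(entry.userId, (expected.get(entry.userId) || 0) + entry.amount);
  }
  // Report every user whose stored balance differs from the log total.
  return users
    .filter((u) => (expected.get(u._id) || 0) !== u.balance)
    .map((u) => ({
      userId: u._id,
      stored: u.balance,
      fromLog: expected.get(u._id) || 0,
    }));
}

const users = [
  { _id: "alice", balance: 7 },
  { _id: "bob", balance: 5 }, // stale: the log says 7
];
const log = [
  { userId: "alice", amount: 10 },
  { userId: "alice", amount: -3 },
  { userId: "bob", amount: 10 },
  { userId: "bob", amount: -3 },
];
console.log(findBalanceMismatches(users, log));
// [ { userId: 'bob', stored: 5, fromLog: 7 } ]
```

A periodic job could run a check like this and repair or flag the mismatched balances, treating the log as the source of truth.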

Read this for more info: https://dzone.com/articles/how-implement-robust-and

Answer by kheya

Transactions are absent in MongoDB for valid reasons. This is one of the things that make MongoDB faster.

In your case, if transactions are a must, Mongo does not seem to be a good fit.

Maybe RDBMS + MongoDB, but that will add complexity and make the application harder to manage and support.

Answer by Karoly Horvath

The project is simple, but you have to support transactions for payment, which makes the whole thing difficult. So, for example, a complex portal system with hundreds of collections (forum, chat, ads, etc...) is in some respects simpler, because if you lose a forum or chat entry, nobody really cares. If you, on the other hand, lose a payment transaction, that's a serious issue.

So, if you really want a pilot project for MongoDB, choose one which is simple in that respect.

Answer by Đinh Anh Huy

This is late, but I think it will help in the future. I use Redis to make a queue to solve this problem.

  • Requirement:
    The image (not reproduced here) shows two actions that need to execute concurrently, but phase 2 and phase 3 of action 1 need to finish before phase 2 of action 2 starts, or vice versa (a phase can be a REST API request, a database request, executing JavaScript code...).

  • How a queue helps you
    The queue makes sure that the blocks of code between lock() and release() in different functions do not run at the same time; it isolates them.

    function action1() {
      phase1();
      queue.lock("action_domain");
      phase2();
      phase3();
      queue.release("action_domain");
    }
    
    function action2() {
      phase1();
      queue.lock("action_domain");
      phase2();
      queue.release("action_domain");
    }
    
  • How to build a queue
    I will only focus on how to avoid the race condition part when building a queue on the backend side. If you don't know the basic idea of a queue, come here.
    The code below only shows the concept; you need to implement it correctly.

    function lock() {
      if(isRunning()) {
        addIsolateCodeToQueue(); //use callback, delegate, function pointer... depend on your language
      } else {
        setStateToRunning();
        pickOneAndExecute();
      }
    }
    
    function release() {
      setStateToRelease();
      pickOneAndExecute();
    }
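
The lock()/release() concept above can be made concrete with a small in-process sketch. This is not the answer's Redis implementation: the class name and its methods are hypothetical, and it relies on JavaScript running one callback at a time within a single process. With several processes or servers, the running/released state would have to live in a shared store with isolation guarantees.

```javascript
// Hypothetical per-key queue: tasks sharing a key run one at a time,
// tasks with different keys do not block each other.
class KeyedQueue {
  constructor() {
    // key -> pending tasks; a key being present means a task is running.
    this.pending = new Map();
  }

  // Runs the task now if the key is free, otherwise queues it.
  lock(key, task) {
    if (this.pending.has(key)) {
      this.pending.get(key).push(task);
      return;
    }
    this.pending.set(key, []);
    task();
  }

  // Hands the key to the next queued task, or frees the key.
  release(key) {
    const waiting = this.pending.get(key);
    if (!waiting) return;
    if (waiting.length === 0) {
      this.pending.delete(key);
      return;
    }
    waiting.shift()();
  }
}

const queue = new KeyedQueue();
const order = [];
queue.lock("userA", () => order.push("A: action 1"));
queue.lock("userA", () => order.push("A: action 2")); // waits for release
queue.lock("userB", () => order.push("B: action 1")); // different key, runs now
queue.release("userA"); // now "A: action 2" runs
console.log(order); // [ 'A: action 1', 'B: action 1', 'A: action 2' ]
```

Each task is expected to call `release()` with its key when its critical section is done; here the calls are made by hand to keep the ordering visible.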
    

But isRunning(), setStateToRelease() and setStateToRunning() must themselves be isolated, or else you face the race condition again. For this I chose Redis, for its ACID properties and scalability.
The Redis documentation says this about its transactions:

All the commands in a transaction are serialized and executed sequentially. It can never happen that a request issued by another client is served in the middle of the execution of a Redis transaction. This guarantees that the commands are executed as a single isolated operation.

P/s:
I use Redis because my service already uses it; you can use any other mechanism that supports isolation to do that.
The action_domain in my code above is for when you need action 1 called by user A to block only action 2 of user A, not other users. The idea is to use a unique lock key for each user.

Answer by Manish Jain

Transactions are now available in MongoDB 4.0. Sample here:

// Runs the txnFunc and retries if TransientTransactionError encountered

function runTransactionWithRetry(txnFunc, session) {
    while (true) {
        try {
            txnFunc(session);  // performs transaction
            break;
        } catch (error) {
            // If transient error, retry the whole transaction
            if ( error.hasOwnProperty("errorLabels") && error.errorLabels.includes("TransientTransactionError")  ) {
                print("TransientTransactionError, retrying transaction ...");
                continue;
            } else {
                throw error;
            }
        }
    }
}

// Retries commit if UnknownTransactionCommitResult encountered

function commitWithRetry(session) {
    while (true) {
        try {
            session.commitTransaction(); // Uses write concern set at transaction start.
            print("Transaction committed.");
            break;
        } catch (error) {
            // Can retry commit
            if (error.hasOwnProperty("errorLabels") && error.errorLabels.includes("UnknownTransactionCommitResult") ) {
                print("UnknownTransactionCommitResult, retrying commit operation ...");
                continue;
            } else {
                print("Error during commit ...");
                throw error;
            }
        }
    }
}

// Updates two collections in a transaction

function updateEmployeeInfo(session) {
    employeesCollection = session.getDatabase("hr").employees;
    eventsCollection = session.getDatabase("reporting").events;

    session.startTransaction( { readConcern: { level: "snapshot" }, writeConcern: { w: "majority" } } );

    try{
        employeesCollection.updateOne( { employee: 3 }, { $set: { status: "Inactive" } } );
        eventsCollection.insertOne( { employee: 3, status: { new: "Inactive", old: "Active" } } );
    } catch (error) {
        print("Caught exception during transaction, aborting.");
        session.abortTransaction();
        throw error;
    }

    commitWithRetry(session);
}

// Start a session.
session = db.getMongo().startSession( { readPreference: { mode: "primary" } } );

try{
   runTransactionWithRetry(updateEmployeeInfo, session);
} catch (error) {
   // Do something with error
} finally {
   session.endSession();
}