Original source: http://stackoverflow.com/questions/4185105/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Ways to implement data versioning in MongoDB
Asked by Piotr Czapla
Can you share your thoughts on how you would implement data versioning in MongoDB? (I've asked a similar question regarding Cassandra. If you have any thoughts on which DB is better for that, please share.)
Suppose that I need to version records in a simple address book. (Address book records are stored as flat JSON objects.) I expect that the history:
- will be used infrequently
- will be used all at once to present it in a "time machine" fashion
- will hold no more than a few hundred versions per record, and will not expire
I'm considering the following approaches:
Create a new object collection to store the history of records or changes to the records. It would store one object per version with a reference to the address book entry. Such records would look as follows:
{
  '_id': 'new id',
  'user': user_id,
  'timestamp': timestamp,
  'address_book_id': 'id of the address book record',
  'old_record': {'first_name': 'Jon', 'last_name':'Doe' ...}
}
This approach can be modified to store an array of versions per document, but that seems to be a slower approach without any advantages.
Store versions as serialized (JSON) objects attached to address book entries. I'm not sure how to attach such objects to MongoDB documents. Perhaps as an array of strings. (Modelled after Simple Document Versioning with CouchDB.)
Accepted answer by Gates VP
The first big question when diving in to this is "how do you want to store changesets"?
- Diffs?
- Whole record copies?
My personal approach would be to store diffs. Because the display of these diffs is really a special action, I would put the diffs in a different "history" collection.
I would use a separate collection to save memory. You generally don't want the full history for a simple query. By keeping the history out of the object, you also keep it out of the commonly accessed memory when that data is queried.
To make my life easy, I would make a history document contain a dictionary of time-stamped diffs. Something like this:
{
  _id : "id of address book record",
  changes : {
    1234567 : { "city" : "Omaha", "state" : "Nebraska" },
    1234568 : { "city" : "Kansas City", "state" : "Missouri" }
  }
}
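To show the history in the "time machine" fashion the question asks for, a history document of this shape can be replayed in timestamp order. The following is only a minimal sketch, assuming each diff holds the new values of the fields that changed and that the earliest entry describes the initial record; `recordAt` is a hypothetical helper, not part of MongoDB:

```javascript
// History document shaped like the example above (values are illustrative).
const history = {
  _id: "id of address book record",
  changes: {
    1234567: { city: "Omaha", state: "Nebraska" },
    1234568: { city: "Kansas City", state: "Missouri" },
  },
};

// Hypothetical helper: rebuild the record as it looked at `timestamp`
// by applying every diff with a timestamp <= `timestamp`, oldest first.
// Assumes each diff stores the NEW values of the changed fields;
// removed fields are not handled in this sketch.
function recordAt(historyDoc, timestamp) {
  return Object.keys(historyDoc.changes)
    .map(Number)
    .filter((t) => t <= timestamp)
    .sort((a, b) => a - b)
    .reduce((record, t) => ({ ...record, ...historyDoc.changes[t] }), {});
}

console.log(recordAt(history, 1234567)); // state after the first change
console.log(recordAt(history, 1234568)); // state after both changes
```

If the diffs instead stored the old values, the replay would have to run backwards from the current record, so the direction is worth fixing before any data is written.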
To make my life really easy, I would make this part of my DataObjects (EntityWrapper, whatever) that I use to access my data. Generally these objects have some form of history, so that you can easily override the save() method to make this change at the same time.
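A rough sketch of such a wrapper is shown here. The class name `AddressBookEntry` and its methods are hypothetical, and two in-memory `Map` objects stand in for the record and history collections so the example runs without a database:

```javascript
// Hypothetical wrapper; the two Maps stand in for the "records" and
// "history" collections so the sketch runs without a database.
class AddressBookEntry {
  constructor(id, records, histories) {
    this.id = id;
    this.records = records;
    this.histories = histories;
    this.fields = { ...(records.get(id) || {}) };
  }

  // Returns only the fields whose values differ from the stored record.
  // Records are flat (as in the question), so a shallow comparison is enough.
  // Fields deleted from `this.fields` are not tracked in this sketch.
  diff() {
    const stored = this.records.get(this.id) || {};
    const changed = {};
    for (const [key, value] of Object.entries(this.fields)) {
      if (stored[key] !== value) changed[key] = value;
    }
    return changed;
  }

  // save() writes the record and, in the same call, stores the diff
  // in the history document under the given timestamp.
  save(timestamp = Date.now()) {
    const changed = this.diff();
    if (Object.keys(changed).length === 0) return changed; // nothing to record
    const history = this.histories.get(this.id) || { _id: this.id, changes: {} };
    history.changes[timestamp] = changed;
    this.histories.set(this.id, history);
    this.records.set(this.id, { ...this.fields });
    return changed;
  }
}

const records = new Map();
const histories = new Map();
const entry = new AddressBookEntry("abc", records, histories);
entry.fields.city = "Omaha";
entry.fields.state = "Nebraska";
entry.save(1234567);
entry.fields.city = "Kansas City";
entry.fields.state = "Missouri";
entry.save(1234568);
console.log(histories.get("abc").changes);
```

Against a real database the record write and the history write would go to two collections, so they are not a single atomic operation unless a transaction is used.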
UPDATE: 2015-10
It looks like there is now a spec for handling JSON diffs. This seems like a more robust way to store the diffs / changes.
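Assuming the spec in question is JSON Patch (RFC 6902), a change to a flat address book record could be stored as a list of operations rather than a plain dictionary of changed fields. The sketch below covers only `add`, `replace` and `remove` on top-level keys; `applyPatch` is an illustrative helper written for this example, not a library function:

```javascript
// Minimal, partial JSON Patch (RFC 6902) applier for FLAT objects only.
// Supports "add", "replace" and "remove" with single-segment paths like "/city".
function applyPatch(record, patch) {
  const result = { ...record }; // leave the input untouched
  for (const { op, path, value } of patch) {
    // JSON Pointer unescaping: "~1" means "/", "~0" means "~".
    const key = path.slice(1).replace(/~1/g, "/").replace(/~0/g, "~");
    const exists = Object.prototype.hasOwnProperty.call(result, key);
    if (op === "add") {
      result[key] = value;
    } else if (op === "replace" || op === "remove") {
      // Both operations require the target to exist.
      if (!exists) throw new Error(`path ${path} does not exist`);
      if (op === "replace") result[key] = value;
      else delete result[key];
    } else {
      throw new Error(`unsupported op: ${op}`);
    }
  }
  return result;
}

const before = { city: "Omaha", state: "Nebraska", fax: "555-0100" };
const patch = [
  { op: "replace", path: "/city", value: "Kansas City" },
  { op: "replace", path: "/state", value: "Missouri" },
  { op: "remove", path: "/fax" },
];
console.log(applyPatch(before, patch));
```

Unlike a dictionary of changed fields, this form can express a removed field explicitly.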
Answer by Marian
There is a versioning scheme called "Vermongo" which addresses some aspects which haven't been dealt with in the other replies.
One of these issues is concurrent updates; another is deleting documents.
Vermongo stores complete document copies in a shadow collection. For some use cases this might cause too much overhead, but I think it also simplifies many things.
Answer by Benjamin M
Here's another solution using a single document for the current version and all old versions:
{
  _id: ObjectId("..."),
  data: [
    { vid: 1, content: "foo" },
    { vid: 2, content: "bar" }
  ]
}
data contains all versions. The data array is ordered; new versions only get $push-ed to the end of the array. data.vid is the version id, which is an incrementing number.
Get the most recent version:
find(
  { "_id": ObjectId("...") },
  { "data": { $slice: -1 } }
)
Get a specific version by vid:
find(
  { "_id": ObjectId("...") },
  { "data": { $elemMatch: { "vid": 1 } } }
)
Return only specified fields:
find(
  { "_id": ObjectId("...") },
  { "data": { $elemMatch: { "vid": 1 } }, "data.content": 1 }
)
Insert a new version (and prevent concurrent insert/update):
update(
  {
    "_id": ObjectId("..."),
    $and: [
      { "data.vid": { $not: { $gt: 2 } } },
      { "data.vid": 2 }
    ]
  },
  { $push: { "data": { "vid": 3, "content": "baz" } } }
)
2 is the vid of the current most recent version and 3 is the new version being inserted. Because you need the most recent version's vid anyway, it's easy to get the next version's vid: nextVID = oldVID + 1.
The $and condition ensures that 2 is the latest vid: one clause requires that a version with vid 2 exists, the other that no version has a greater vid.
This way there's no need for a unique index, but the application logic has to take care of incrementing the vid on insert.
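The effect of the $and guard can be illustrated without a database. The sketch below mimics in plain JavaScript what the filter checks (some element has the expected vid and none has a greater one) and shows that only the first of two writers that both read version 2 gets its push applied. `pushVersion` is a hypothetical stand-in for the update call, not a driver API:

```javascript
// In-memory stand-in for the stored document.
const doc = {
  data: [
    { vid: 1, content: "foo" },
    { vid: 2, content: "bar" },
  ],
};

// Mimics the guarded update: the push only happens when `expectedVid`
// is present AND no element has a larger vid, i.e. it is still the latest.
// Returns true when the document was modified, false otherwise.
function pushVersion(document, expectedVid, content) {
  const hasExpected = document.data.some((v) => v.vid === expectedVid);
  const noneNewer = !document.data.some((v) => v.vid > expectedVid);
  if (!(hasExpected && noneNewer)) return false;
  document.data.push({ vid: expectedVid + 1, content });
  return true;
}

// Two writers both read vid 2 as the latest version.
const firstWriter = pushVersion(doc, 2, "baz"); // filter matches, vid 3 is pushed
const secondWriter = pushVersion(doc, 2, "qux"); // filter fails: vid 3 now exists
console.log(firstWriter, secondWriter, doc.data.map((v) => v.vid));
```

With a real driver the losing writer would presumably see an update result reporting that nothing was modified, at which point the application can re-read the latest vid and retry.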
Remove a specific version:
update(
  { "_id": ObjectId("...") },
  { $pull: { "data": { "vid": 2 } } }
)
That's it!
(remember the 16MB per document limit)
Answer by s01ipsist
If you're looking for a ready-to-roll solution -
Mongoid has simple versioning built in
http://mongoid.org/en/mongoid/docs/extras.html#versioning
mongoid-history is a Ruby plugin that provides a significantly more complicated solution with auditing, undo and redo
Answer by Daniel Watrous
I worked through this solution, which accommodates published, draft and historical versions of the data:
{
  published: {},
  draft: {},
  history: {
    "1" : {
      metadata: <value>,
      document: {}
    },
    ...
  }
}
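One possible reading of this layout, offered only as a sketch: publishing archives the current published version under the next numeric key in history and then promotes a copy of the draft. The `publish` function and the `metadata` contents are assumptions made for illustration, and the author's own model may differ:

```javascript
// Hypothetical publish step for a document shaped like the one above.
// Assumption: publishing archives the current published version under the
// next numeric key in `history`, then promotes a copy of the draft.
function publish(doc, metadata) {
  const history = { ...doc.history };
  // Only archive when something has actually been published before.
  if (Object.keys(doc.published).length > 0) {
    const nextKey = String(Object.keys(history).length + 1);
    history[nextKey] = { metadata, document: doc.published };
  }
  return { ...doc, published: { ...doc.draft }, history };
}

// First publish: nothing to archive yet.
const first = publish(
  { published: {}, draft: { title: "v1" }, history: {} },
  { by: "editor" }
);
// Second publish: the previous published version moves into history.
const second = publish({ ...first, draft: { title: "v2" } }, { by: "editor" });
console.log(second.published, second.history);
```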
I explain the model further here: http://software.danielwatrous.com/representing-revision-data-in-mongodb/
For those that may implement something like this in Java, here's an example:
http://software.danielwatrous.com/using-java-to-work-with-versioned-data/
Including all the code that you can fork, if you like
Answer by bmw15
If you are using mongoose, I have found the following plugin to be a useful implementation of the JSON Patch format
Answer by Muhammad Reda
Another option is to use the mongoose-history plugin.
let mongoose = require('mongoose');
let mongooseHistory = require('mongoose-history');
let Schema = mongoose.Schema;
// Define the schema whose changes should be tracked.
let MySchema = new Schema({
  title: String,
  status: Boolean
});
// Register the history plugin on the schema.
MySchema.plugin(mongooseHistory);
// The plugin will automatically create a new collection with the schema name + "_history".
// In this case, collection with name "my_schema_history" will be created.
Answer by helcode
I have used the package below for a Meteor/MongoDB project, and it works well. Its main advantage is that it stores history/revisions in an array within the same document, so there is no need for additional publications or middleware to access the change history. It can keep a limited number of previous versions (e.g. the last ten), and it also supports change concatenation (all changes that happen within a specific period are covered by one revision).
nicklozon/meteor-collection-revisions
Another sound option is to use Meteor Vermongo.