MongoDB: move documents from one collection to another collection
Disclaimer: this page is a Chinese-English rendering of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, link to the original, and attribute it to the original authors (not me): StackOverFlow
Original source: http://stackoverflow.com/questions/27039083/
Asked by manojpt
How can documents be moved from one collection to another collection in MongoDB? For example: I have a lot of documents in collection A and I want to move all documents older than 1 month to collection B (these 1-month-old documents should then no longer be in collection A).
Using aggregation we can do a copy. But what I am trying to do is move the documents. What method can be used to move documents?
Answered by jasongarber
The bulk operations @markus-w-mahlberg showed (and @mark-mullin refined) are efficient but unsafe as written. If the bulkInsert fails, the bulkRemove will still continue. To make sure you don't lose any records when moving, use this instead:
function insertBatch(collection, documents) {
    var bulkInsert = collection.initializeUnorderedBulkOp();
    var insertedIds = [];
    var id;
    documents.forEach(function(doc) {
        id = doc._id;
        // Insert without raising an error for duplicates
        bulkInsert.find({_id: id}).upsert().replaceOne(doc);
        insertedIds.push(id);
    });
    bulkInsert.execute();
    return insertedIds;
}

function deleteBatch(collection, documents) {
    var bulkRemove = collection.initializeUnorderedBulkOp();
    documents.forEach(function(doc) {
        bulkRemove.find({_id: doc._id}).removeOne();
    });
    bulkRemove.execute();
}

function moveDocuments(sourceCollection, targetCollection, filter, batchSize) {
    print("Moving " + sourceCollection.find(filter).count() + " documents from " + sourceCollection + " to " + targetCollection);
    var count;
    while ((count = sourceCollection.find(filter).count()) > 0) {
        print(count + " documents remaining");
        var sourceDocs = sourceCollection.find(filter).limit(batchSize);
        var idsOfCopiedDocs = insertBatch(targetCollection, sourceDocs);
        // Delete only the documents that were confirmed to have been copied
        var targetDocs = targetCollection.find({_id: {$in: idsOfCopiedDocs}});
        deleteBatch(sourceCollection, targetDocs);
    }
    print("Done!");
}
Answered by Markus W Mahlberg
Update 2
Please do NOT upvote this answer any more. As written, @jasongarber's answer is better in every respect.
Update
This answer by @jasongarber is a safer approach and should be used instead of mine.
Provided I got you right and you want to move all documents older than 1 month, and you use MongoDB 2.6, there is no reason not to use bulk operations, which are the most efficient way of doing multiple operations I am aware of:
> var bulkInsert = db.target.initializeUnorderedBulkOp()
> var bulkRemove = db.source.initializeUnorderedBulkOp()
> var date = new Date()
> date.setMonth(date.getMonth() -1)
> db.source.find({"yourDateField": {$lt: date}}).forEach(
      function(doc) {
          bulkInsert.insert(doc);
          bulkRemove.find({_id: doc._id}).removeOne();
      }
  )
> bulkInsert.execute()
> bulkRemove.execute()
This should be pretty fast, and it has the advantage that, in case something goes wrong during the bulk insert, the original data still exists.
Edit
In order to prevent too much memory from being utilized, you can execute the bulk operations on every x docs processed:
> var bulkInsert = db.target.initializeUnorderedBulkOp()
> var bulkRemove = db.source.initializeUnorderedBulkOp()
> var x = 10000
> var counter = 0
> var date = new Date()
> date.setMonth(date.getMonth() - 1)
> db.source.find({"yourDateField": {$lt: date}}).forEach(
      function(doc) {
          bulkInsert.insert(doc);
          bulkRemove.find({_id: doc._id}).removeOne();
          counter++;
          if (counter % x == 0) {
              bulkInsert.execute();
              bulkRemove.execute();
              bulkInsert = db.target.initializeUnorderedBulkOp();
              bulkRemove = db.source.initializeUnorderedBulkOp();
          }
      }
  )
> bulkInsert.execute()
> bulkRemove.execute()
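The flush-every-x pattern used here can be sketched in plain JavaScript, independent of MongoDB, to make the batching logic explicit (processInBatches and flush are illustrative names, not driver APIs):

```javascript
// Accumulate items and flush every `x`, mirroring how the bulk operations
// above are executed and then re-initialized every 10000 documents.
function processInBatches(docs, x, flush) {
    var batch = [];
    docs.forEach(function (doc) {
        batch.push(doc);
        if (batch.length === x) {
            flush(batch);   // corresponds to bulkInsert.execute() / bulkRemove.execute()
            batch = [];     // corresponds to re-initializing the bulk ops
        }
    });
    if (batch.length > 0) {
        flush(batch);       // the trailing execute() for the final partial batch
    }
}
```

With seven documents and x = 3, this flushes [1, 2, 3], then [4, 5, 6], then the remainder [7].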
Answered by manojpt
Insert and remove:
var documentsToMove = db.collectionA.find({});
documentsToMove.forEach(function(doc) {
    db.collectionB.insert(doc);
    db.collectionA.remove(doc);
});
Note: this method might be quite slow for large collections or for collections holding large documents.
Answered by karthi
$out is used to create a new collection with the data, so use $out:
db.oldCollection.aggregate([{$out : "newCollection"}])
then use drop:
db.oldCollection.drop()
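Note that $out replaces the contents of newCollection entirely, and this pair of commands moves every document. To move only a subset (e.g. documents older than one month, as in the question), a $match stage can be added first — a sketch, assuming a date field named yourDateField:

```javascript
// Sketch only; "yourDateField" is an assumed field name.
var cutoff = new Date();
cutoff.setMonth(cutoff.getMonth() - 1);

// Copy only the matching documents (caution: $out still overwrites newCollection)
db.oldCollection.aggregate([
    { $match: { "yourDateField": { $lt: cutoff } } },
    { $out: "newCollection" }
]);
// Remove just the copied subset instead of dropping the whole collection
db.oldCollection.remove({ "yourDateField": { $lt: cutoff } });
```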
Answered by ialekseev
Maybe, from the performance point of view, it's better to remove a lot of documents using one command (especially if you have indexes for the query part) rather than deleting them one by one.
For example:
// "yourDateField" is a placeholder for the field the range applies to
db.source.find({"yourDateField": {$gte: start, $lt: end}}).forEach(function(doc) {
    db.target.insert(doc);
});
db.source.remove({"yourDateField": {$gte: start, $lt: end}});
Answered by Mark Mullin
This is a restatement of @Markus W Mahlberg's answer.
Returning the favor - as a function
function moveDocuments(sourceCollection, targetCollection, filter) {
    var bulkInsert = targetCollection.initializeUnorderedBulkOp();
    var bulkRemove = sourceCollection.initializeUnorderedBulkOp();
    sourceCollection.find(filter).forEach(function(doc) {
        bulkInsert.insert(doc);
        bulkRemove.find({_id: doc._id}).removeOne();
    });
    bulkInsert.execute();
    bulkRemove.execute();
}
An example use
var x = {dsid: {$exists: true}};
moveDocuments(db.pictures, db.artifacts, x);
to move all documents that have the top-level element dsid from the pictures collection to the artifacts collection.
Answered by Diogo Rosa
Answered by Ninad
You can use a range query to get the data from sourceCollection, keep the cursor in a variable, loop over it, and insert into the target collection:
var docs = db.sourceCollection.find({
    "Timestamp": {
        $gte: ISODate("2014-09-01T00:00:00Z"),
        $lt: ISODate("2014-10-01T00:00:00Z")
    }
});
docs.forEach(function(doc) {
    db.targetCollection.insert(doc);
});
Hope it helps!
Answered by Matt Wills
Here's an update to @jasongarber's answer which uses the more recent mongo 'bulkWrite' operation (read the docs here), and also keeps the whole process asynchronous so you can run it as part of a wider script which depends on its completion.
async function moveDocuments (sourceCollection, targetCollection, filter) {
    const sourceDocs = await sourceCollection.find(filter)
    console.log(`Moving ${await sourceDocs.count()} documents from ${sourceCollection.collectionName} to ${targetCollection.collectionName}`)
    const idsOfCopiedDocs = await insertDocuments(targetCollection, sourceDocs)
    const targetDocs = await targetCollection.find({_id: {$in: idsOfCopiedDocs}})
    await deleteDocuments(sourceCollection, targetDocs)
    console.log('Done!')
}

async function insertDocuments (collection, documents) {
    const insertedIds = []
    const bulkWrites = []
    await documents.forEach(doc => {
        const {_id} = doc
        insertedIds.push(_id)
        bulkWrites.push({
            replaceOne: {
                filter: {_id},
                replacement: doc,
                upsert: true,
            },
        })
    })
    if (bulkWrites.length) await collection.bulkWrite(bulkWrites, {ordered: false})
    return insertedIds
}

async function deleteDocuments (collection, documents) {
    const bulkWrites = []
    await documents.forEach(({_id}) => {
        bulkWrites.push({
            deleteOne: {
                filter: {_id},
            },
        })
    })
    if (bulkWrites.length) await collection.bulkWrite(bulkWrites, {ordered: false})
}
Answered by Dzmitry Rudkouski
It can be done on the server side using the $merge operator (starting from MongoDB 4.2).
db.getCollection("sourceColl").aggregate([
{ $merge: {
into: "targetColl",
on: "_id",
whenMatched: "fail",
whenNotMatched: "insert"
}}
]);
db.getCollection("sourceColl").deleteMany({})
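Note that deleteMany({}) removes everything in sourceColl, including any documents inserted after the aggregation ran. A sketch restricting the move to a filter, so that only the copied documents are deleted (the field name yourDateField is an assumption, matching the question's "older than 1 month" case):

```javascript
// Sketch only; assumes a date field named "yourDateField".
var cutoff = new Date();
cutoff.setMonth(cutoff.getMonth() - 1);

db.getCollection("sourceColl").aggregate([
    { $match: { yourDateField: { $lt: cutoff } } },
    { $merge: {
        into: "targetColl",
        on: "_id",
        whenMatched: "fail",
        whenNotMatched: "insert"
    }}
]);
// Delete only what was copied, not the whole collection
db.getCollection("sourceColl").deleteMany({ yourDateField: { $lt: cutoff } });
```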