MongoDB: move documents from one collection to another collection

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/27039083/

Time: 2020-09-08 20:13:05  Source: igfitidea

mongodb

Asked by manojpt

How can documents be moved from one collection to another collection in MongoDB? For example: I have a lot of documents in collection A and I want to move all documents older than 1 month to collection B (these older documents should no longer be in collection A).

Using aggregation we can copy documents. But what I am trying to do is move them. What method can be used to move documents?

Answered by jasongarber

The bulk operations @markus-w-mahlberg showed (and @mark-mullin refined) are efficient but unsafe as written. If the bulkInsert fails, the bulkRemove will still continue. To make sure you don't lose any records when moving, use this instead:


function insertBatch(collection, documents) {
  var bulkInsert = collection.initializeUnorderedBulkOp();
  var insertedIds = [];
  var id;
  documents.forEach(function(doc) {
    id = doc._id;
    // Insert without raising an error for duplicates
    bulkInsert.find({_id: id}).upsert().replaceOne(doc);
    insertedIds.push(id);
  });
  bulkInsert.execute();
  return insertedIds;
}

function deleteBatch(collection, documents) {
  var bulkRemove = collection.initializeUnorderedBulkOp();
  documents.forEach(function(doc) {
    bulkRemove.find({_id: doc._id}).removeOne();
  });
  bulkRemove.execute();
}

function moveDocuments(sourceCollection, targetCollection, filter, batchSize) {
  print("Moving " + sourceCollection.find(filter).count() + " documents from " + sourceCollection + " to " + targetCollection);
  var count;
  while ((count = sourceCollection.find(filter).count()) > 0) {
    print(count + " documents remaining");
    // Copy one batch into the target collection
    var sourceDocs = sourceCollection.find(filter).limit(batchSize);
    var idsOfCopiedDocs = insertBatch(targetCollection, sourceDocs);

    // Delete from the source only the documents that are found in the target
    var targetDocs = targetCollection.find({_id: {$in: idsOfCopiedDocs}});
    deleteBatch(sourceCollection, targetDocs);
  }
  print("Done!");
}
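
The copy-then-verify step above can also be made explicit by comparing the ids that were meant to be copied against what is actually found in the target, and refusing to delete when anything is missing. A minimal sketch under assumptions: `findMissingIds` is a hypothetical helper that is not part of this answer, and it assumes ids can be compared by their string form.

```javascript
// Hypothetical helper (not part of the answer above): returns the ids from
// `expectedIds` that do not appear as `_id` among `foundDocs`.
// Assumption: ids are compared by their string form, so 1 and "1" would collide.
function findMissingIds(expectedIds, foundDocs) {
  const found = new Set(foundDocs.map(function (doc) { return String(doc._id); }));
  return expectedIds.filter(function (id) { return !found.has(String(id)); });
}

// Possible use inside the loop of moveDocuments (mongo shell):
// var copied = targetCollection.find({_id: {$in: idsOfCopiedDocs}}).toArray();
// var missing = findMissingIds(idsOfCopiedDocs, copied);
// if (missing.length > 0) throw new Error(missing.length + " documents were not copied");
```

Stopping on the first mismatch leaves the source collection untouched for that batch, which is the property the answer is after.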

Answered by Markus W Mahlberg

Update 2


Please do NOT upvote this answer any more. As written, @jasongarber's answer is better in every respect.

Update


The answer by @jasongarber is a safer approach and should be used instead of mine.



Provided I understood you correctly and you want to move all documents older than 1 month, and you use MongoDB 2.6, there is no reason not to use bulk operations, which are the most efficient way of doing multiple operations that I am aware of:

> var bulkInsert = db.target.initializeUnorderedBulkOp()
> var bulkRemove = db.source.initializeUnorderedBulkOp()
> var date = new Date()
> date.setMonth(date.getMonth() -1)
> db.source.find({"yourDateField":{$lt: date}}).forEach(
    function(doc){
      bulkInsert.insert(doc);
      bulkRemove.find({_id:doc._id}).removeOne();
    }
  )
> bulkInsert.execute()
> bulkRemove.execute()

This should be pretty fast and it has the advantage that in case something goes wrong during the bulk insert, the original data still exists.

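
One caveat with `date.setMonth(date.getMonth() - 1)`: when the current day does not exist in the previous month (for example March 31), JavaScript rolls the date forward into the following month, so the cutoff ends up a few days later than intended. A minimal sketch of a clamping alternative; `monthsBefore` is a hypothetical helper name, not something from this answer or from MongoDB.

```javascript
// Hypothetical helper: returns a new Date `months` calendar months before `from`.
// If the target month is shorter than the original day, the result is clamped
// to the last day of that month instead of rolling over into the next one.
function monthsBefore(from, months) {
  const result = new Date(from.getTime());   // copy, so `from` is not modified
  const day = result.getDate();
  result.setDate(1);                          // avoid rollover while changing the month
  result.setMonth(result.getMonth() - months);
  const lastDay = new Date(result.getFullYear(), result.getMonth() + 1, 0).getDate();
  result.setDate(Math.min(day, lastDay));
  return result;
}

// Possible use in the mongo shell ("yourDateField" is the placeholder from the answer):
// var date = monthsBefore(new Date(), 1);
// db.source.find({"yourDateField": {$lt: date}})
```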



Edit


To keep memory usage down, you can execute the bulk operations after every x documents processed:

> var bulkInsert = db.target.initializeUnorderedBulkOp()
> var bulkRemove = db.source.initializeUnorderedBulkOp()
> var x = 10000
> var counter = 0
> var date = new Date()
> date.setMonth(date.getMonth() -1)
> db.source.find({"yourDateField":{$lt: date}}).forEach(
    function(doc){
      bulkInsert.insert(doc);
      bulkRemove.find({_id:doc._id}).removeOne();
      counter ++
      if( counter % x == 0){
        bulkInsert.execute()
        bulkRemove.execute()
        bulkInsert = db.target.initializeUnorderedBulkOp()
        bulkRemove = db.source.initializeUnorderedBulkOp()
      }
    }
  )
> if (counter % x != 0) {
    // Flush the last partial batch; executing an empty bulk operation throws
    bulkInsert.execute()
    bulkRemove.execute()
  }

Answered by manojpt

Insert and remove:


var documentsToMove = db.collectionA.find({});
documentsToMove.forEach(function(doc) {
    db.collectionB.insert(doc);
    db.collectionA.remove({_id: doc._id}); // delete by _id rather than matching the whole document
});

Note: this method might be quite slow for large collections or collections holding large documents.

Answered by karthi

$out writes the result of an aggregation to a new collection (replacing that collection if it already exists), so use $out to copy the data:

db.oldCollection.aggregate([{$out : "newCollection"}])

Then drop the old collection:

db.oldCollection.drop()

Answered by ialekseev

From a performance point of view, it may be better to remove many documents with one command (especially if the fields in the query are indexed) rather than deleting them one by one.

For example:


// "yourDateField" is a placeholder for the field you filter on
var filter = {yourDateField: {$gte: start, $lt: end}};
db.source.find(filter).forEach(function(doc){
   db.target.insert(doc);
});
db.source.remove(filter);

Answered by Mark Mullin

This is a restatement of @Markus W Mahlberg's answer.

Returning the favor, as a function:

function moveDocuments(sourceCollection, targetCollection, filter) {
    var bulkInsert = targetCollection.initializeUnorderedBulkOp();
    var bulkRemove = sourceCollection.initializeUnorderedBulkOp();
    sourceCollection.find(filter).forEach(function(doc) {
        bulkInsert.insert(doc);
        bulkRemove.find({_id: doc._id}).removeOne();
    });
    bulkInsert.execute();
    bulkRemove.execute();
}

Example use:

var x = {dsid:{$exists: true}};
moveDocuments(db.pictures,db.artifacts,x)

This moves all documents that have a top-level dsid field from the pictures collection to the artifacts collection.

Answered by Diogo Rosa

In MongoDB 3.0 you can use the copyTo command with the following syntax (note that copyTo is deprecated since 3.0 and cannot be run against MongoDB 4.2 or later):

db.source_collection.copyTo("target_collection")

Then you can use the drop command to remove the old collection:

db.source_collection.drop()

Answered by Ninad

You can use a range query to select the data from sourceCollection, keep the cursor in a variable, loop over it, and insert each document into the target collection:

 var doc = db.sourceCollection.find({
        "Timestamp":{
              $gte:ISODate("2014-09-01T00:00:00Z"),
              $lt:ISODate("2014-10-01T00:00:00Z")
        }
 });

 doc.forEach(function(doc){
    db.targetCollection.insert(doc);
 })

Hope it helps!
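
The snippet in this answer copies one calendar month of documents but leaves them in the source collection. A small sketch of how the same range could be built once and reused for both the copy and the delete; `monthRange` is a hypothetical helper, and the final `remove` call is an addition that is not part of the answer.

```javascript
// Hypothetical helper: half-open UTC range covering one calendar month,
// from the first day of the month up to (not including) the first day of the next.
// `month` is 1-12; Date.UTC rolls month index 12 over into January of the next year.
function monthRange(year, month) {
  return {
    $gte: new Date(Date.UTC(year, month - 1, 1)),
    $lt: new Date(Date.UTC(year, month, 1))
  };
}

// Possible use in the mongo shell (collection and field names taken from the answer):
// var filter = { "Timestamp": monthRange(2014, 9) };
// db.sourceCollection.find(filter).forEach(function (doc) {
//   db.targetCollection.insert(doc);
// });
// db.sourceCollection.remove(filter); // run only after the copy has succeeded
```

Keeping the filter in one variable avoids the copy and the delete drifting apart when the range is edited later.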

Answered by Matt Wills

Here's an update to @jasongarber's answer which uses the more recent MongoDB bulkWrite operation, and also keeps the whole process asynchronous so you can run it as part of a wider script which depends on its completion.

async function moveDocuments (sourceCollection, targetCollection, filter) {
  const sourceDocs = await sourceCollection.find(filter)

  console.log(`Moving ${await sourceDocs.count()} documents from ${sourceCollection.collectionName} to ${targetCollection.collectionName}`)

  const idsOfCopiedDocs = await insertDocuments(targetCollection, sourceDocs)

  const targetDocs = await targetCollection.find({_id: {$in: idsOfCopiedDocs}})
  await deleteDocuments(sourceCollection, targetDocs)

  console.log('Done!')
}

async function insertDocuments (collection, documents) {
  const insertedIds = []
  const bulkWrites = []

  await documents.forEach(doc => {
    const {_id} = doc

    insertedIds.push(_id)
    bulkWrites.push({
      replaceOne: {
        filter: {_id},
        replacement: doc,
        upsert: true,
      },
    })
  })

  if (bulkWrites.length) await collection.bulkWrite(bulkWrites, {ordered: false})

  return insertedIds
}

async function deleteDocuments (collection, documents) {
  const bulkWrites = []

  await documents.forEach(({_id}) => {
    bulkWrites.push({
      deleteOne: {
        filter: {_id},
      },
    })
  })

  if (bulkWrites.length) await collection.bulkWrite(bulkWrites, {ordered: false})
}

Answered by Dzmitry Rudkouski

It can be done on the server side using the $merge aggregation stage (available starting from MongoDB 4.2).

db.getCollection("sourceColl").aggregate([
  { $merge: {
     into: "targetColl",
     on: "_id",
     whenMatched: "fail",
     whenNotMatched: "insert"
  }}
]);
db.getCollection("sourceColl").deleteMany({})
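
The pipeline above copies the whole collection. For the case in the question (only documents older than one month), a `$match` stage can be placed before `$merge`, and the same filter reused for the delete. A sketch under assumptions: `buildMovePipeline` is a hypothetical helper, and `yourDateField` and `cutoff` are placeholders. The copy and the delete are still two separate commands, so this assumes no documents matching the filter are written in between.

```javascript
// Hypothetical helper: builds a pipeline that copies only the documents
// matching `filter` into `targetName`, using the same $merge options as above.
function buildMovePipeline(targetName, filter) {
  return [
    { $match: filter },
    { $merge: {
        into: targetName,
        on: "_id",
        whenMatched: "fail",
        whenNotMatched: "insert"
    }}
  ];
}

// Possible use in the mongo shell (MongoDB 4.2 or later):
// var cutoff = new Date();
// cutoff.setMonth(cutoff.getMonth() - 1);
// var filter = { yourDateField: { $lt: cutoff } };   // placeholder field name
// db.getCollection("sourceColl").aggregate(buildMovePipeline("targetColl", filter));
// db.getCollection("sourceColl").deleteMany(filter);
```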