MongoDB - Unwind an array using aggregation and remove duplicates

Question

Asked by l a s

I am unwinding an array with the MongoDB aggregation framework. The array contains duplicates, and I need to ignore those duplicates in a later grouping stage.

How can I achieve that?

Answer 1

Answered by Roman Pekar

You can use $addToSet to do this:

db.users.aggregate([
  { $unwind: '$data' },
  { $group: { _id: '$_id', data: { $addToSet: '$data' } } }
]);
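
To make the effect of the two stages concrete, here is a minimal plain-JavaScript sketch that imitates them on an in-memory array. It is not MongoDB code: the `users` sample data is invented for illustration, and MongoDB itself does not guarantee the order of elements produced by $addToSet.

```javascript
// Plain JavaScript imitation of the pipeline above; no MongoDB required.
// The `users` array and its `data` values are made-up sample input.
const users = [
  { _id: 1, data: ["a", "b", "a"] },
  { _id: 2, data: ["c", "c"] },
];

// $unwind: emit one document per array element.
const unwound = users.flatMap((doc) =>
  doc.data.map((item) => ({ _id: doc._id, data: item }))
);

// $group with $addToSet: collect the distinct values per _id.
const groups = new Map();
for (const doc of unwound) {
  if (!groups.has(doc._id)) groups.set(doc._id, new Set());
  groups.get(doc._id).add(doc.data);
}

// Turn the Map of Sets back into plain documents.
const result = [...groups].map(([_id, set]) => ({ _id, data: [...set] }));
console.log(result);
```

Each resulting document keeps its original _id, and its data array holds every distinct value once.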

It's hard to give a more specific answer without seeing your actual query.

Answer 2

Answered by Enrique Coslado

You have to use $addToSet, but first you have to group by _id; if you don't, you'll get one element per item in the list, duplicates included.

Imagine a collection named posts with documents like this:

{
     body: "Lorem Ipsum...", 
     tags: ["stuff", "lorem", "lorem"],
     author: "Enrique Coslado"
}

Suppose you want to find the most common tag per author. You could write an aggregation query like this:

db.posts.aggregate([
    {$project: {
        author: "$author", 
        tags: "$tags", 
        post_id: "$_id"
    }}, 

    {$unwind: "$tags"}, 

    {$group: {
        _id: "$post_id", 
        author: {$first: "$author"}, 
        tags: {$addToSet: "$tags"}
    }}, 

    {$unwind: "$tags"},

    {$group: {
        _id: {
            author: "$author",
            tags: "$tags"
        },
        count: {$sum: 1}
    }}
])

That way you'll get documents like this:

{
     _id: {
         author: "Enrique Coslado", 
         tags: "lorem"
     },
     count: 1
}
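
The reason for deduplicating within each post before counting can be illustrated with a small plain-JavaScript sketch. It is only a simulation of the counting logic, not MongoDB code; the second post and the `countTags` helper are hypothetical additions for illustration.

```javascript
// Sample input: the first post mirrors the example document above,
// the second post is a made-up addition.
const posts = [
  { _id: 1, author: "Enrique Coslado", tags: ["stuff", "lorem", "lorem"] },
  { _id: 2, author: "Enrique Coslado", tags: ["lorem"] },
];

// Hypothetical helper: counts (author, tag) pairs, optionally
// removing duplicate tags inside each post first.
function countTags(docs, dedupePerPost) {
  const counts = {};
  for (const post of docs) {
    const tags = dedupePerPost ? [...new Set(post.tags)] : post.tags;
    for (const tag of tags) {
      const key = `${post.author}|${tag}`;
      counts[key] = (counts[key] || 0) + 1;
    }
  }
  return counts;
}

const naive = countTags(posts, false);  // duplicates inflate the count
const deduped = countTags(posts, true); // each post counts a tag once

console.log(naive["Enrique Coslado|lorem"]);
console.log(deduped["Enrique Coslado|lorem"]);
```

With this sample data the naive count for "lorem" is 3, while the deduplicated count is 2, one per post that uses the tag.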

Answer 3

Answered by cephuo

The previous answers are correct, but the $unwind -> $group -> $unwind procedure can be simplified. You can use $addFields + $reduce to pass the pipeline a filtered array that already contains only unique entries, and then $unwind only once.

Example document:

{
     body: "Lorem Ipsum...", 
     tags: [{title: 'test1'}, {title: 'test2'}, {title: 'test1'}],
     author: "First Last name"
}

Query:

db.posts.aggregate([
    {$addFields: {
        "uniqueTag": {
            $reduce: {
                input: "$tags",
                initialValue: [],
                in: {$setUnion: ["$$value", ["$$this.title"]]}
            }
        }
    }}, 

    {$unwind: "$uniqueTag"}, 

    {$group: {
        _id: {
            author: "$author",
            tags: "$uniqueTag"
        },
        count: {$sum: 1}
    }}
])
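
As a rough illustration of what the $reduce stage computes, the following plain-JavaScript sketch folds the example document's tags into an array of unique titles. It only imitates the accumulation logic and is not MongoDB code; note also that $setUnion does not guarantee element order, whereas this sketch keeps first-seen order.

```javascript
// The example document from above, as a plain object.
const post = {
  body: "Lorem Ipsum...",
  tags: [{ title: "test1" }, { title: "test2" }, { title: "test1" }],
  author: "First Last name",
};

// Imitates $reduce with initialValue [] and $setUnion on each title:
// `value` plays the role of $$value, `current` the role of $$this.
const uniqueTag = post.tags.reduce(
  (value, current) =>
    value.includes(current.title) ? value : [...value, current.title],
  []
);

// A later $unwind on uniqueTag would yield one document per unique title.
const unwoundTags = uniqueTag.map((tag) => ({ author: post.author, uniqueTag: tag }));
console.log(uniqueTag, unwoundTags.length);
```

Because the array is already unique before it is unwound, a single $unwind followed by one $group is enough.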
