MongoDB - Unwind array using aggregation and remove duplicates
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/18804404/
Asked by l a s
I am unwinding an array with the MongoDB aggregation framework. The array contains duplicates, and I need to ignore them in a subsequent grouping step.
How can I achieve that?
Answered by Roman Pekar
Answered by Enrique Coslado
You have to use $addToSet, but first you have to group by _id; otherwise you'll get one element per item in the list.
Imagine a collection named posts with documents like this:
{
    body: "Lorem Ipsum...",
    tags: ["stuff", "lorem", "lorem"],
    author: "Enrique Coslado"
}
Imagine you want to find the most common tag per author, counting each tag at most once per post. You'd write an aggregate query like this:
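db.posts.aggregate([
    // Keep only the fields needed and remember the post's _id
    {$project: {
        author: "$author",
        tags: "$tags",
        post_id: "$_id"
    }},
    // One document per tag occurrence (duplicates included)
    {$unwind: "$tags"},
    // Regroup per post; $addToSet drops duplicate tags within a post
    {$group: {
        _id: "$post_id",
        author: {$first: "$author"},
        tags: {$addToSet: "$tags"}
    }},
    // One document per distinct tag of each post
    {$unwind: "$tags"},
    // Count posts per (author, tag) pair
    {$group: {
        _id: {
            author: "$author",
            tags: "$tags"
        },
        count: {$sum: 1}
    }}
])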
That way you'll get documents like this:
{
    _id: {
        author: "Enrique Coslado",
        tags: "lorem"
    },
    count: 1
}
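To see why the per-post $addToSet step matters, the pipeline's logic can be mimicked in plain JavaScript outside MongoDB. This is only an illustrative sketch: countTagsPerAuthor and the sample posts array are hypothetical names, not part of the original answer or of any MongoDB API.

```javascript
// Plain-JavaScript emulation of the pipeline's logic (not MongoDB code).
// `countTagsPerAuthor` and `posts` are hypothetical, for illustration only.
function countTagsPerAuthor(posts, dedupePerPost) {
  const counts = new Map(); // key: JSON of [author, tag] -> occurrences
  for (const post of posts) {
    // With dedupePerPost, a Set plays the role of $group by post + $addToSet.
    const tags = dedupePerPost ? [...new Set(post.tags)] : post.tags;
    for (const tag of tags) {
      // Equivalent of the final $group with count: {$sum: 1}.
      const key = JSON.stringify([post.author, tag]);
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return counts;
}

const posts = [
  { body: "Lorem Ipsum...", tags: ["stuff", "lorem", "lorem"], author: "Enrique Coslado" },
];

const key = JSON.stringify(["Enrique Coslado", "lorem"]);
console.log(countTagsPerAuthor(posts, false).get(key)); // 2: the duplicate is counted
console.log(countTagsPerAuthor(posts, true).get(key));  // 1: duplicate ignored
```

Without the deduplication step, the repeated "lorem" tag inflates the count for that single post; with it, each tag contributes at most once per post.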
Answered by cephuo
Previous answers are correct, but the $unwind -> $group -> $unwind procedure could be simplified. You could use $addFields + $reduce to pass a filtered array that already contains only unique entries down the pipeline, and then $unwind only once.
Example document:
{
    body: "Lorem Ipsum...",
    tags: [{title: 'test1'}, {title: 'test2'}, {title: 'test1'}],
    author: "First Last name"
}
Query:
db.posts.aggregate([
    {$addFields: {
        "uniqueTag": {
            $reduce: {
                input: "$tags",
                initialValue: [],
                in: {$setUnion: ["$$value", ["$$this.title"]]}
            }
        }
    }},
    {$unwind: "$uniqueTag"},
    {$group: {
        _id: {
            author: "$author",
            tags: "$uniqueTag"
        },
        count: {$sum: 1}
    }}
])
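As a rough illustration of what the $reduce + $setUnion stage computes for each document, the same fold can be written in plain JavaScript. uniqueTitles is a hypothetical helper, not a MongoDB function. Note also that MongoDB's $setUnion leaves the order of the resulting elements unspecified, whereas this sketch happens to preserve first-seen order.

```javascript
// Plain-JavaScript analogue of the $reduce/$setUnion stage (not MongoDB code).
// `uniqueTitles` is a hypothetical helper used only for illustration.
function uniqueTitles(tags) {
  return tags.reduce(
    // acc corresponds to "$$value", tag to "$$this"
    (acc, tag) => (acc.includes(tag.title) ? acc : [...acc, tag.title]),
    [] // corresponds to initialValue: []
  );
}

const doc = {
  body: "Lorem Ipsum...",
  tags: [{ title: "test1" }, { title: "test2" }, { title: "test1" }],
  author: "First Last name",
};

console.log(uniqueTitles(doc.tags)); // [ 'test1', 'test2' ]
```

Because the array is already unique before $unwind, each title yields exactly one document per post, so the final $group counts posts rather than tag occurrences.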