Original question: http://stackoverflow.com/questions/26069601/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to aggregate on huge array in mongoDB?
Asked by AlexKogan
I have a MongoDB database of about 400 GB. The documents contain a variety of fields, but the important one here is an array of IDs.
So a JSON document might look like this:
{
    "name": "bob",
    "dob": "1/1/2011",
    "key": [
        "1020123123",
        "1234123222",
        "5021297723"
    ]
}
The focal variable here is "key". There are about 10 billion keys in total across 50 million documents (so each document has about 200 keys). Keys can repeat, and there are about 15 million unique keys.
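As a quick sanity check on those figures, here is a small plain-JavaScript sketch. It uses only the approximate numbers quoted above; the variable names are illustrative and not part of the original post.

```javascript
// Approximate figures quoted in the question.
const totalKeys = 10e9;   // about 10 billion key entries
const documents = 50e6;   // about 50 million documents
const uniqueKeys = 15e6;  // about 15 million distinct keys

// Average number of entries in each document's "key" array.
const keysPerDocument = totalKeys / documents;

// Average number of times each distinct key appears in the collection.
const occurrencesPerKey = totalKeys / uniqueKeys;

console.log(keysPerDocument);
console.log(Math.round(occurrencesPerKey));
```

So on average each distinct key shows up several hundred times, which means the grouping step has to collapse about 10 billion array entries into about 15 million counters.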
What I would like to do is return the 10,000 most common keys. I thought aggregate might do this, but I'm having a lot of trouble getting it to run. Here is my code:
db.users.aggregate(
    [
        { $unwind : "$key" },
        { $group : { _id : "$key", number : { $sum : 1 } } },
        { $sort : { number : -1 } },
        { $limit : 10000 }
    ]
);
Any ideas what I'm doing wrong?
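To make the intent of the pipeline concrete, here is a minimal in-memory sketch in plain JavaScript of what the four stages compute. It does not talk to MongoDB: `sampleUsers` and `topKeys` are hypothetical names invented for illustration, and the sample documents are made up in the shape shown earlier.

```javascript
// Hypothetical sample documents shaped like the one in the question.
const sampleUsers = [
  { name: "bob",   key: ["1020123123", "1234123222", "5021297723"] },
  { name: "alice", key: ["1020123123", "5021297723"] },
  { name: "carol", key: ["1020123123"] },
];

// Hypothetical helper that mirrors the pipeline stages in plain JavaScript.
function topKeys(docs, limit) {
  const counts = new Map();
  for (const doc of docs) {
    // $unwind: handle every element of the "key" array on its own.
    for (const k of doc.key) {
      // $group with { $sum : 1 }: keep one counter per distinct key.
      counts.set(k, (counts.get(k) || 0) + 1);
    }
  }
  // $sort by "number" descending, then $limit.
  return [...counts.entries()]
    .map(([_id, number]) => ({ _id, number }))
    .sort((a, b) => b.number - a.number)
    .slice(0, limit);
}

const top2 = topKeys(sampleUsers, 2);
console.log(top2);
```

The sketch keeps every counter in memory at once, which is harmless for three documents but is the part that becomes expensive with millions of distinct keys.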
Answered by Wizard
Try this:
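db.users.aggregate(
    [
        { $unwind : "$key" },
        { $group : { _id : "$key", number : { $sum : 1 } } },
        { $sort : { number : -1 } },
        { $limit : 10000 },
        // Write the top keys to the "result" collection instead of returning them inline.
        { $out : "result" }
    ],
    {
        // Let stages spill to temporary files on disk when they exceed the memory limit.
        allowDiskUse : true,
        cursor : {}
    }
);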
Then read the results with db.result.find().
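The answer does not say why allowDiskUse helps. One plausible reading: MongoDB's documentation describes a memory limit of about 100 MB per pipeline stage unless allowDiskUse is set, and $group here has to keep one counter per distinct key. The sketch below is a rough back-of-the-envelope estimate in plain JavaScript. The 64 bytes per group is purely an assumption for illustration, since the thread gives no real measurement.

```javascript
// Per-stage memory limit described in the MongoDB documentation (about 100 MB).
const STAGE_LIMIT_BYTES = 100 * 1024 * 1024;

// From the question: about 15 million distinct keys, so $group holds that many counters.
const distinctKeys = 15e6;

// ASSUMPTION: roughly 64 bytes of state per group (key string, counter, overhead).
const assumedBytesPerGroup = 64;

const estimatedGroupBytes = distinctKeys * assumedBytesPerGroup;
const exceedsLimit = estimatedGroupBytes > STAGE_LIMIT_BYTES;

console.log(estimatedGroupBytes, exceedsLimit);
```

Under that assumption the grouping state is several times larger than the limit, which would be consistent with the pipeline failing without allowDiskUse. The true per-group overhead may differ.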