mongodb 使用 Mongo 聚合框架进行多组操作

Question

提问by Allyl Isocyanate

Given a set of questions that have linked survey and category id:

给定一组关联调查和类别 ID 的问题：

> db.questions.find().toArray();
[
    {
        "_id" : ObjectId("4fda05bc322b1c95b531ac25"),
        "id" : 1,
        "name" : "Question 1",
        "category_id" : 1,
        "survey_id" : 1,
        "score" : 5
    },
    {
        "_id" : ObjectId("4fda05cb322b1c95b531ac26"),
        "id" : 2,
        "name" : "Question 2",
        "category_id" : 1,
        "survey_id" : 1,
        "score" : 3
    },
    {
        "_id" : ObjectId("4fda05d9322b1c95b531ac27"),
        "id" : 3,
        "name" : "Question 3",
        "category_id" : 2,
        "survey_id" : 1,
        "score" : 4
    },
    {
        "_id" : ObjectId("4fda4287322b1c95b531ac28"),
        "id" : 4,
        "name" : "Question 4",
        "category_id" : 2,
        "survey_id" : 1,
        "score" : 7
    }
]

I can find the category average with:

我可以找到类别平均值：

db.questions.aggregate(
    { $group : {
        _id : "$category_id",
        avg_score : { $avg : "$score" }
    }
}
);

{
    "result" : [
        {
            "_id" : 1,
            "avg_score" : 4
        },
        {
            "_id" : 2,
            "avg_score" : 5.5
        }
    ],
    "ok" : 1
}

How can I get the average of category averages (note this is different than simply averaging all questions)?I would assume I would do multiple group operations but this fails:

如何获得类别平均值的平均值（请注意，这与简单地对所有问题求平均值不同）？我假设我会做多个组操作，但这失败了：

> db.questions.aggregate(
...   { $group : {
...     _id : "$category_id",
...     avg_score : { $avg : "$score" },
...   }},
...   { $group : {
...     _id : "$survey_id",
...     avg_score : { $avg : "$score" },
...   }}
... );
{
    "errmsg" : "exception: the _id field for a group must not be undefined",
    "code" : 15956,
    "ok" : 0
}
>

Answer 1

回答by William Z

It's important to understand that the operations in the argument to aggregate() form a pipeline. This meant that the input to any element of the pipeline is the stream of documents produced by the previous element in the pipeline.

理解aggregate() 参数中的操作形成一个管道很重要。这意味着管道任何元素的输入都是管道中前一个元素生成的文档流。

In your example, your first query creates a pipeline of documents that look like this:

在您的示例中，您的第一个查询创建了一个如下所示的文档管道：

{
    "_id" : 2,
    "avg_score" : 5.5
},
{
    "_id" : 1,
    "avg_score" : 4
}

This means that the second element of the pipline is seeing a series of documents where the only keys are "_id" and "avg_score". The keys "category_id" and "score" no longer existin this document stream.

这意味着管道的第二个元素是看到一系列文档，其中唯一的键是“_id”和“avg_score”。此文档流中不再存在键“category_id”和“score” 。

If you want to further aggregate on this stream, you'll have to aggregate using the keys that are seen at this stage in the pipeline. Since you want to average the averages, you need to put in a single constant value for the _id field, so that all of the input documents get grouped into a single result.

如果您想在此流上进一步聚合，则必须使用在此阶段在管道中看到的键进行聚合。由于您想要平均平均值，您需要为 _id 字段输入一个常量值，以便所有输入文档都被分组为一个结果。

The following code produces the correct result:

以下代码产生正确的结果：

db.questions.aggregate(
    { $group : {
        _id : "$category_id",
        avg_score : { $avg : "$score" },
        }
    },
    { $group : {
        _id : "all",
        avg_score : { $avg : "$avg_score" },
        }
    }
);

When run, it produces the following output:

运行时，它会产生以下输出：

 {
    "result" : [
        {
        "_id" : "all",
        "avg_score" : 4.75
        }
    ],
    "ok" : 1
 }

mongodb 使用 Mongo 聚合框架进行多组操作

提问by Allyl Isocyanate

回答by William Z

相关推荐

最近更新

标签

mongodb 使用 Mongo 聚合框架进行多组操作

提问by Allyl Isocyanate

回答by William Z

相关推荐

mongodb mongo db 查询大于和小于的数组值

无法在 Ubuntu 11.04 上重新启动 mongodb

mongodb 在 Meteor 中运行示例的问题

mongodb 是否可以为 mongodump 指定目录/文件夹？

相关推荐

最近更新

标签