MongoDB Aggregation Framework | Group over multiple values?

Question

Asked by Oliver Lloyd

I would like to use MongoDB's Aggregation Framework to run what in SQL would look a bit like:

SELECT SUM(A), B, C from myTable GROUP BY B, C;

The docs state:

You can specify a single field from the documents in the pipeline, a previously computed value, or an aggregate key made up from several incoming fields.

But it's unclear what 'an aggregate key made up from several incoming fields' actually is.

My dataset is a bit like this:

[{ "timeStamp" : 1341834988666, "label" : "sharon", "responseCode" : "200", "value" : 10, "success" : "true"},
{ "timeStamp" : 1341834988676, "label" : "paul", "responseCode" : "200", "value" : 60, "success" : "true"},
{ "timeStamp" : 1341834988686, "label" : "paul", "responseCode" : "404", "value" : 15, "success" : "true"},
{ "timeStamp" : 1341834988696, "label" : "sharon", "responseCode" : "200", "value" : 35, "success" : "false"},
{ "timeStamp" : 1341834988166, "label" : "paul", "responseCode" : "200", "value" : 40, "success" : "true"},
{ "timeStamp" : 1341834988266, "label" : "paul", "responseCode" : "404", "value" : 99, "success" : "false"}]

My query looks like this:

resultsCollection.aggregate(
    { $match : { testid : testid} },
    { $skip : alreadyRead },
    { $project : {
            timeStamp : 1 ,
            label : 1,
            responseCode : 1 ,
            value : 1,
            success : 1
        }},
    { $group : {
            _id : "$label",
            max_timeStamp : { $max : "$timeStamp" },
            count_responseCode : { $sum : 1 },
            avg_value : { $sum : "$value" },
            count_success : { $sum : 1 }
        }},
    { $group : {
            ?
        }}
);

My instinct was to pipe the results through to a second $group. I know this is possible, but it won't work here because the first $group already reduces the dataset too much and the required level of detail is lost.

What I want to do is group by label, responseCode and success, and get the sum of value for each group. It should look a bit like:

label   | code | success | sum_of_values | count
sharon  | 200  |  true   |      10       |   1
sharon  | 200  |  false  |      35       |   1
paul    | 200  |  true   |      100      |   2
paul    | 404  |  true   |      15       |   1
paul    | 404  |  false  |      99       |   1

Where there are five groups:

1. { "timeStamp" : 1341834988666, "label" : "sharon", "responseCode" : "200", "value" : 10, "success" : "true"}

2. { "timeStamp" : 1341834988696, "label" : "sharon", "responseCode" : "200", "value" : 35, "success" : "false"}

3. { "timeStamp" : 1341834988676, "label" : "paul", "responseCode" : "200", "value" : 60, "success" : "true"}
   { "timeStamp" : 1341834988166, "label" : "paul", "responseCode" : "200", "value" : 40, "success" : "true"}

4. { "timeStamp" : 1341834988686, "label" : "paul", "responseCode" : "404", "value" : 15, "success" : "true"}

5. { "timeStamp" : 1341834988266, "label" : "paul", "responseCode" : "404", "value" : 99, "success" : "false"}

Answer 1

Answered by Oliver Lloyd

OK, so the solution is to specify an aggregate key for the _id value. This is documented here as:

You can specify a single field from the documents in the pipeline, a previously computed value, or an aggregate key made up from several incoming fields.

But it doesn't actually define the format for an aggregate key. Reading the earlier documentation here, I saw that the previous collection.group method could take multiple fields and that the same structure is used in the new framework.

So, to group over multiple fields you could use _id : { success:'$success', responseCode:'$responseCode', label:'$label'}

As in:

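resultsCollection.aggregate([
    // Keep only the documents that belong to this test run.
    { $match : { testid : testid } },
    { $skip : alreadyRead },
    { $project : {
            timeStamp : 1,
            label : 1,
            responseCode : 1,
            value : 1,
            success : 1
        }},
    // Compound _id: one group per distinct (success, responseCode, label) combination.
    { $group : {
            _id : { success : '$success', responseCode : '$responseCode', label : '$label' },
            max_timeStamp : { $max : "$timeStamp" },  // $timeStamp is not an accumulator; $max is
            count_responseCode : { $sum : 1 },
            avg_value : { $sum : "$value" },  // a sum despite the name; use $avg for a mean
            count_success : { $sum : 1 }
        }}
]);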
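
To make the effect of a compound _id concrete, here is a minimal in-memory sketch in plain JavaScript that mimics grouping the sample documents from the question by label, responseCode and success. It does not touch MongoDB and is not how the server implements $group; the helper name groupByFields and the output field names sum_of_values and count are assumptions chosen to match the table in the question.

```javascript
// Sample documents copied from the question.
const docs = [
  { timeStamp: 1341834988666, label: "sharon", responseCode: "200", value: 10, success: "true" },
  { timeStamp: 1341834988676, label: "paul", responseCode: "200", value: 60, success: "true" },
  { timeStamp: 1341834988686, label: "paul", responseCode: "404", value: 15, success: "true" },
  { timeStamp: 1341834988696, label: "sharon", responseCode: "200", value: 35, success: "false" },
  { timeStamp: 1341834988166, label: "paul", responseCode: "200", value: 40, success: "true" },
  { timeStamp: 1341834988266, label: "paul", responseCode: "404", value: 99, success: "false" }
];

// Hypothetical helper: groups documents by several fields at once, similar in
// spirit to a $group stage whose _id is an object of field references.
function groupByFields(documents, fields) {
  const groups = new Map();
  for (const doc of documents) {
    // Build the compound key object from the chosen fields.
    const id = {};
    for (const field of fields) {
      id[field] = doc[field];
    }
    // Serialize the key so equal combinations land in the same bucket.
    const key = JSON.stringify(id);
    if (!groups.has(key)) {
      groups.set(key, { _id: id, sum_of_values: 0, count: 0, max_timeStamp: -Infinity });
    }
    const group = groups.get(key);
    group.sum_of_values += doc.value;                                   // like { $sum : "$value" }
    group.count += 1;                                                   // like { $sum : 1 }
    group.max_timeStamp = Math.max(group.max_timeStamp, doc.timeStamp); // like { $max : "$timeStamp" }
  }
  return Array.from(groups.values());
}

const result = groupByFields(docs, ["label", "responseCode", "success"]);
console.log(JSON.stringify(result, null, 2));
```

Each distinct combination of the three fields becomes its own group, so the six sample documents collapse into the five groups listed in the question, with the two paul / 200 / true documents merged into one.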
