Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and author, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/29319799/

Time: 2020-09-08 20:22:01  Source: igfitidea

MongoDB Aggregation with sum of array values

mongodb mongodb-query aggregation-framework

Asked by buddy123

I have a collection with the following data:

{
    "_id" : ObjectId("5516d416d0c2323619ddbca8"),
    "date" : "28/02/2015",
    "driver" : "user1",
    "passengers" : [
        {
            "user" : "user2",
            "times" : 2
        },
        {
            "user" : "user3",
            "times" : 3
        }
    ]
}
{
    "_id" : ObjectId("5516d517d0c2323619ddbca9"),
    "date" : "27/02/2015",
    "driver" : "user2",
    "passengers" : [
        {
            "user" : "user1",
            "times" : 2
        },
        {
            "user" : "user3",
            "times" : 2
        }
    ]
}

I would like to run an aggregation that tells me, for a given passenger, how many times they rode with each driver. In my example the result would be:

for user1: [{ driver: user2, times: 2 }]
for user2: [{ driver: user1, times: 2 }]
for user3: [{ driver: user1, times: 3 }, { driver: user2, times: 2 }]

I'm quite new to MongoDB. I know how to perform a simple aggregation with sum, but not when the values are inside arrays and the subject I am querying for is itself an element of the array. What is the appropriate way to perform this kind of aggregation and, more specifically, how do I perform it in an Express.js-based server?

Answered by chridam

To do this with the aggregation framework, start with a $match stage that selects the documents whose passengers array contains the passenger in question. Follow it with an $unwind stage, which deconstructs the passengers array and outputs one document per array element. A second $match on the unwound documents then filters the stream so that only the entries for that passenger pass on to the last stage, which uses $project to return the required fields. The aggregation pipeline for user3 would look like this:

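db.collection.aggregate([
    // Keep only documents whose passengers array contains user3
    {
        "$match": {
            "passengers.user": "user3"
        }
    },
    // Output one document per element of the passengers array
    {
        "$unwind": "$passengers"
    },
    // Keep only the unwound entries that belong to user3
    {
        "$match": {
            "passengers.user": "user3"
        }
    },
    // Return just the driver and the passenger's times
    {
        "$project": {
            "_id": 0,
            "driver": "$driver",
            "times": "$passengers.times"
        }
    }
])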

Result:

/* 0 */
{
    "result" : [ 
        {
            "driver" : "user1",
            "times" : 3
        }, 
        {
            "driver" : "user2",
            "times" : 2
        }
    ],
    "ok" : 1
}
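
To see what each stage contributes, here is a minimal sketch in plain JavaScript that mimics the same four steps in memory on the two sample documents. It does not talk to MongoDB; the helper name ridesForPassenger and the trips array are assumptions made for illustration, and the ObjectId values are written as plain strings.

```javascript
// Sample documents from the question (ObjectId values written as plain strings).
const trips = [
  {
    _id: "5516d416d0c2323619ddbca8",
    date: "28/02/2015",
    driver: "user1",
    passengers: [
      { user: "user2", times: 2 },
      { user: "user3", times: 3 },
    ],
  },
  {
    _id: "5516d517d0c2323619ddbca9",
    date: "27/02/2015",
    driver: "user2",
    passengers: [
      { user: "user1", times: 2 },
      { user: "user3", times: 2 },
    ],
  },
];

// Hypothetical helper: mimics $match, $unwind, $match, $project in memory.
function ridesForPassenger(docs, passenger) {
  return docs
    // First $match: keep documents whose passengers array mentions the user.
    .filter((doc) => doc.passengers.some((p) => p.user === passenger))
    // $unwind: one output document per element of the passengers array.
    .flatMap((doc) => doc.passengers.map((p) => ({ ...doc, passengers: p })))
    // Second $match: keep only the unwound documents for this user.
    .filter((doc) => doc.passengers.user === passenger)
    // $project: keep just the driver and the passenger's times.
    .map((doc) => ({ driver: doc.driver, times: doc.passengers.times }));
}

console.log(ridesForPassenger(trips, "user3"));
```

The second filter is the step that is easy to forget: after unwinding, each matching document still yields one row per passenger, so the rows for the other passengers have to be dropped again.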

UPDATE:

To merge duplicate drivers that appear on different dates, as you mentioned, add a $group stage just before the final $project stage and compute each driver's total passenger times with the $sum operator:

db.collection.aggregate([
    {
        "$match": {
            "passengers.user": "user3"
        }
    },
    {
        "$unwind": "$passengers"
    },
    {
        "$match": {
            "passengers.user": "user3"
        }
    },
    {
        "$group": {
            "_id": "$driver",
            "total": {
                "$sum": "$passengers.times"
            }
        }
    },
    {
        "$project": {
            "_id": 0,
            "driver": "$_id",
            "total": 1
        }
    }
])

Result:

/* 0 */
{
    "result" : [ 
        {
            "total" : 2,
            "driver" : "user2"
        }, 
        {
            "total" : 3,
            "driver" : "user1"
        }
    ],
    "ok" : 1
}
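
With the sample data each driver appears only once for user3, so the grouping has nothing to merge. The sketch below mimics the $group and $sum step in plain JavaScript to show what happens when a driver repeats. The helper name totalsByDriver is an assumption, and the third row is a made-up extra trip that is not part of the original collection.

```javascript
// Hypothetical helper: mimics { $group: { _id: "$driver", total: { $sum: ... } } }
// on rows that have already been unwound and filtered for one passenger.
function totalsByDriver(rows) {
  const totals = new Map();
  for (const { driver, times } of rows) {
    // Add this row's times to the running total for its driver.
    totals.set(driver, (totals.get(driver) || 0) + times);
  }
  // Shape the output like the final $project stage: { total, driver }.
  return Array.from(totals, ([driver, total]) => ({ total, driver }));
}

const rowsForUser3 = [
  { driver: "user1", times: 3 }, // from the sample data
  { driver: "user2", times: 2 }, // from the sample data
  { driver: "user1", times: 4 }, // hypothetical extra trip on another date
];

console.log(totalsByDriver(rowsForUser3));
```

This sketch returns drivers in the order they were first seen. MongoDB's $group stage does not guarantee any output order, so add a $sort stage if the order matters.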