MongoDb 聚合 按日期分组

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/19950202/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-09 13:30:10  来源:igfitidea点击:

MongoDb aggregation Group by Date

mongodbaggregation-framework

提问by Hymanie

I'm trying to group by timestamp for the collection named "foo" { _id, TimeStamp }

我正在尝试按时间戳对名为“foo”的集合进行分组 { _id, TimeStamp }

db.foos.aggregate(
[
   {$group : { _id : new Date (Date.UTC({ $year : '$TimeStamp' },{ $month : '$TimeStamp' },{$dayOfMonth : '$TimeStamp'}))       }}
])

Expecting many dates but the result is just one date. The data i'm using is correct (has many foo and different dates except 1970). There's some problem in the date parsing but i can not solve yet.

期待多个日期,但结果只是一个日期。我使用的数据是正确的(除了 1970 年有很多 foo 和不同的日期)。日期解析存在一些问题,但我还无法解决。

{
    "result" : [ 
        {
            "_id" : ISODate("1970-01-01T00:00:00.000Z")
        }
    ],
    "ok" : 1
}

Tried this One:

试过这个:

db.foos.aggregate(
[
   {$group : { _id : { year : { $year : '$TimeStamp' }, month : { $month : '$TimeStamp' }, day : {$dayOfMonth : '$TimeStamp'} }, count : { $sum : 1 }       }},
   {$project : { parsedDate : new Date('$_id.year', '$_id.month', '$_id.day') , count : 1, _id : 0} }
])

Result :

结果 :

uncaught exception: aggregate failed: {
    "errmsg" : "exception: disallowed field type Date in object expression (at 'parsedDate')",
    "code" : 15992,
    "ok" : 0
}

And that one:

还有那个:

db.foos.aggregate(
[
   {$group : { _id : { year : { $year : '$TimeStamp' }, month : { $month : '$TimeStamp' }, day : {$dayOfMonth : '$TimeStamp'} }, count : { $sum : 1 }       }},
   {$project : { parsedDate : Date.UTC('$_id.year', '$_id.month', '$_id.day') , count : 1, _id : 0} }
])

Can not see dates in the result

结果中看不到日期

{
    "result" : [ 
        {
            "count" : 412
        }, 
        {
            "count" : 1702
        }, 
        {
            "count" : 422
        }
    ],
    "ok" : 1
}

回答by petra

db.foos.aggregate(
    [   
        {   $project : { day : {$substr: ["$TimeStamp", 0, 10] }}},        
        {   $group   : { _id : "$day",  number : { $sum : 1 }}},
        {   $sort    : { _id : 1 }}        
    ]
)

Group by date can be done in two steps in the aggregation framework, an additional third step is needed for sorting the result, if sorting is desired:

按日期分组可以在聚合框架中分两步完成,如果需要排序,则需要额外的第三步来对结果进行排序:

  1. $projectin combination with $substrtakes the first 10 characters (YYYY:MM:DD) of the ISODate object from each document (the result is a collection of documents with the fields "_id" and "day");
  2. $groupgroups by day, adding (summing) the number 1 for each matching document;
  3. $sortascending by "_id", which is the day from the previous aggregation step - this is optional if sorted result is desired.
  1. $projectin 组合$substr从每个文档中获取 ISODate 对象的前 10 个字符 (YYYY:MM:DD)(结果是包含字段“_id”和“day”的文档集合);
  2. $group按天分组,为每个匹配的文档添加(求和)数字 1;
  3. $sort按“_id”升序,这是上一个聚合步骤的日期 - 如果需要排序结果,这是可选的。

This solution can not take advantage of indexes like db.twitter.ensureIndex( { TimeStamp: 1 } ), because it transforms the ISODate object to a string object on the fly. For large collections (millions of documents) this could be a performance bottleneck and more sophisticated approaches should be used.

此解决方案不能利用像 的索引db.twitter.ensureIndex( { TimeStamp: 1 } ),因为它会动态地将 ISODate 对象转换为字符串对象。对于大型集合(数百万个文档),这可能是性能瓶颈,应该使用更复杂的方法。

回答by Asya Kamsky

It depends on whether you want to have the date as ISODate type in the final output. If so, then you can do one of two things:

这取决于您是否希望在最终输出中将日期作为 ISODate 类型。如果是这样,那么您可以执行以下两项操作之一:

  1. Extract $year, $month, $dayOfMonthfrom your timestamp and then reconstruct a new date out of them (you are already trying to do that, but you're using syntax that doesn't work in aggregation framework).

  2. If the original Timestamp is of type ISODate() then you can do date arithmetic to subtract the hours, minutes, seconds and milliseconds from your timestamp to get a new date that's "rounded" to the day.

  1. 从您的时间戳中提取$year, $month,$dayOfMonth然后从中重建一个新日期(您已经在尝试这样做,但是您使用的语法在聚合框架中不起作用)。

  2. 如果原始时间戳是 ISODate() 类型,那么您可以进行日期算术,从时间戳中减去小时、分钟、秒和毫秒,以获得“四舍五入”到当天的新日期。

There is an example of 2 here.

这里有一个2例子

Here is how you would do 1. I'm making an assumption that all your dates are this year, but you can easily adjust the math to accommodate your oldest date.

以下是您将如何做 1. 我假设您所有的日期都是今年,但您可以轻松调整数学以适应您最早的日期。

project1={$project:{_id:0, 
                   y:{$subtract:[{$year:"$TimeStamp"}, 2013]},
                   d:{$subtract:[{$dayOfYear:"$TimeStamp"},1]}, 
                   TimeStamp:1, 
                   jan1:{$literal:new ISODate("2013-01-01T00:00:00")}
         } };
project2={$project:{tsDate:{$add:[
                       "$jan1",
                       {$multiply:["$y", 365*24*60*60*1000]},
                       {$multiply:["$d", 24*60*60*1000]}
         ] } } };

Sample data:

样本数据:

db.foos.find({},{_id:0,TimeStamp:1})
{ "TimeStamp" : ISODate("2013-11-13T19:15:05.600Z") }
{ "TimeStamp" : ISODate("2014-02-01T10:00:00Z") }

Aggregation result:

聚合结果:

> db.foos.aggregate(project1, project2)
{ "tsDate" : ISODate("2013-11-13T00:00:00Z") }
{ "tsDate" : ISODate("2014-02-01T00:00:00Z") }

回答by nha

This is what I use in one of my projects :

这是我在我的一个项目中使用的:

   collection.aggregate(
      // group results by date
      {$group : {
        _id : { date : "$date" }
        // do whatever you want here, like $push, $sum...
      }},

      // _id is the date
      {$sort : { _id : -1}},                        
      {$orderby: { _id : -1 }})
    .toArray()

Where $date is a Date object in mongo. I get results indexed by date.

$date 是 mongo 中的 Date 对象。我得到按日期索引的结果。