Group by dates in MongoDB
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/5168904/
Asked by Mark Gill
I am working on a project in which I track the number of clicks on a topic.
I am using MongoDB and I have to group the number of clicks by date (I want to group the data for 15 days).
My data is stored in MongoDB in the following format:
{
    "_id" : ObjectId("4d663451d1e7242c4b68e000"),
    "date" : "Mon Dec 27 2010 18:51:22 GMT+0000 (UTC)",
    "topic" : "abc",
    "time" : "18:51:22"
}
{
    "_id" : ObjectId("4d6634514cb5cb2c4b69e000"),
    "date" : "Mon Dec 27 2010 18:51:23 GMT+0000 (UTC)",
    "topic" : "bce",
    "time" : "18:51:23"
}
I want to group the number of clicks on topic "abc" by day (for 15 days). I know how to group, but how can I group by the dates that are stored in my database?
I am looking for a result in the following format:
[
    {
        "date" : "date in log",
        "click" : 9
    },
    {
        "date" : "date in log",
        "click" : 19
    }
]
I have written code, but it works only if the dates are strings (the code is here: http://pastebin.com/2wm1n1ix). Please guide me on how to group it.
Answer by Old Pro
New answer using Mongo aggregation framework
After this question was asked and answered, 10gen released MongoDB version 2.2 with an aggregation framework, which is now the better way to do this sort of query. This query is a little challenging because you want to group by date and the values stored are timestamps, so you have to do something to convert the timestamps to dates that match. For the purposes of this example I will just write a query that gets the right counts.
db.col.aggregate([
    { $group: {
        _id: { $dayOfYear: "$date" },   // day within the year, 1 to 366
        click: { $sum: 1 }              // one click per document
    } }
])
This will return something like:
[
{
"_id" : 144,
"click" : 165
},
{
"_id" : 275,
"click" : 12
}
]
You need to use $match to limit the query to the date range you are interested in and $project to rename _id to date. How you convert the day of year back to a date is left as an exercise for the reader. :-)
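The exercise of turning a day of year back into a date can be sketched in plain JavaScript. This is only a minimal sketch: it assumes the year is known (the query above does not return it, so it would have to come from a $year expression or from the $match range) and that dates are treated as UTC. The helper name dayOfYearToDate is made up for illustration.

```javascript
// Hypothetical helper: convert a year plus a 1-based day of year
// (the value $dayOfYear produces) into a Date at UTC midnight.
function dayOfYearToDate(year, dayOfYear) {
  // Date.UTC normalizes day values past 31 into the following months.
  return new Date(Date.UTC(year, 0, dayOfYear));
}

const sample = dayOfYearToDate(2013, 144);
console.log(sample.toISOString()); // 2013-05-24T00:00:00.000Z
```

Without the year, a value such as 144 is ambiguous as soon as the data spans more than one year.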
10gen has a handy SQL to Mongo Aggregation conversion chart worth bookmarking. There is also a specific article on date aggregation operators.
Getting a little fancier, you can use:
db.col.aggregate([
    { $group: {
        // year * 400 + dayOfYear: one sortable number per calendar day
        _id: {
            $add: [
                { $dayOfYear: "$date" },
                { $multiply: [400, { $year: "$date" }] }
            ]
        },
        click: { $sum: 1 },
        first: { $min: "$date" }
    } },
    { $sort: { _id: -1 } },
    { $limit: 15 },
    { $project: { date: "$first", click: 1, _id: 0 } }
])
which will get you the latest 15 days that have data and return the earliest datetime within each day in the date field. For example:
[
{
"click" : 431,
"date" : ISODate("2013-05-11T02:33:45.526Z")
},
{
"click" : 702,
"date" : ISODate("2013-05-08T02:11:00.503Z")
},
...
{
"click" : 814,
"date" : ISODate("2013-04-25T00:41:45.046Z")
}
]
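The grouping key in the query above packs the year and the day of year into one number (year * 400 + dayOfYear). Any multiplier larger than 366 would work; 400 is simply what the answer uses. As a rough sketch, the same packing and its inverse can be written in plain JavaScript. The names encodeDayKey and decodeDayKey are hypothetical, and UTC is assumed throughout.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper mirroring the $group key: year * 400 + dayOfYear.
function encodeDayKey(date) {
  const year = date.getUTCFullYear();
  const startOfYear = Date.UTC(year, 0, 1);
  const dayOfYear = Math.floor((date.getTime() - startOfYear) / DAY_MS) + 1;
  return year * 400 + dayOfYear;
}

// Inverse: split the key back into year and day of year.
function decodeDayKey(key) {
  const year = Math.floor(key / 400);
  const dayOfYear = key % 400; // 1..366, so it never overlaps the year part
  return new Date(Date.UTC(year, 0, dayOfYear));
}

const key = encodeDayKey(new Date("2013-05-11T02:33:45.526Z"));
console.log(key);                             // 805331
console.log(decodeDayKey(key).toISOString()); // 2013-05-11T00:00:00.000Z
```

Because the year occupies the high part of the number, sorting by the key sorts the days chronologically, which is what the $sort and $limit stages rely on.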
Answer by mindthief
Late answer, but for the record (for anyone else who comes to this page): you'll need to use the 'keyf' argument instead of 'key', since your key is actually going to be a function of the date on the event (i.e. the "day" extracted from the date) and not the date itself. This should do what you're looking for:
db.coll.group(
{
    // keyf computes the grouping key (the day) from each document
    keyf: function(doc) {
        var date = new Date(doc.date);
        var dateKey = (date.getMonth()+1)+"/"+date.getDate()+"/"+date.getFullYear()+'';
        return {'day':dateKey};
    },
    cond: {topic:"abc"},
    initial: {count:0},
    reduce: function(obj, prev) {prev.count++;}
});
For more information, take a look at MongoDB's doc page on aggregation and group: http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Group
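To make the roles of cond, keyf, initial and reduce more concrete, here is a small plain-JavaScript emulation that runs outside MongoDB. It is only an illustration of the idea, not how the server implements group(). The sample documents use the date string format from the question (the third one is made up), and the sketch uses UTC getters so that the result does not depend on the local time zone.

```javascript
// Sample documents in the shape shown in the question (dates stored as strings).
const docs = [
  { date: "Mon Dec 27 2010 18:51:22 GMT+0000 (UTC)", topic: "abc" },
  { date: "Mon Dec 27 2010 18:51:23 GMT+0000 (UTC)", topic: "bce" },
  { date: "Tue Dec 28 2010 09:00:00 GMT+0000 (UTC)", topic: "abc" }, // made-up extra row
];

// keyf: derive the grouping key (the calendar day) from a document.
function keyf(doc) {
  const date = new Date(doc.date);
  const day = (date.getUTCMonth() + 1) + "/" + date.getUTCDate() + "/" + date.getUTCFullYear();
  return { day: day };
}

// Emulates cond + keyf + initial + reduce for a single topic.
function groupByDay(allDocs, topic) {
  const groups = new Map();
  for (const doc of allDocs) {
    if (doc.topic !== topic) continue;                 // cond
    const key = keyf(doc);
    if (!groups.has(key.day)) {
      groups.set(key.day, { day: key.day, count: 0 }); // initial
    }
    groups.get(key.day).count++;                       // reduce
  }
  return Array.from(groups.values());
}

console.log(groupByDay(docs, "abc"));
```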
Answer by Jonas Tomanga
This can help:
return new Promise(function(resolve, reject) {
    db.doc.aggregate([
        { $match: {} },
        { $group: {
            _id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
            count: { $sum: 1 }
        } },
        { $sort: { _id: 1 } }
    ]).then(doc => {
        /* if you need a date object */
        doc.forEach(function(value, index) {
            doc[index]._id = new Date(value._id);
        }, this);
        resolve(doc);
    }).catch(reject);
});
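The conversion inside the forEach works because JavaScript parses a date-only ISO string (YYYY-MM-DD) as midnight UTC. A minimal sketch with made-up sample rows in the shape of the aggregation result:

```javascript
// Rows shaped like the aggregation result above (sample values, made up).
const rows = [
  { _id: "2019-05-01", count: 3 },
  { _id: "2019-05-02", count: 7 },
];

// Date-only ISO strings are interpreted as UTC midnight.
const withDates = rows.map(row => ({ ...row, _id: new Date(row._id) }));

console.log(withDates[0]._id.toISOString()); // 2019-05-01T00:00:00.000Z
```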
Answer by mhalttu
There are already many answers to this question, but I wasn't happy with any of them. MongoDB has improved over the years, and there are now easier ways to do it. The answer by Jonas Tomanga gets it right, but is a bit too complex.
If you are using MongoDB 3.0 or later, here's how you can group by date. I start with the $match aggregation because the author also asked how to limit the results.
db.yourCollection.aggregate([
    { $match: { date: { $gte: ISODate("2019-05-01") } } },
    { $group: {
        _id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
        count: { $sum: 1 }
    } },
    { $sort: { _id: 1 } }
])
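Applied to the original question (clicks on one topic over the last 15 days), the same pipeline could be built as sketched below. This is an assumption-laden sketch: it assumes the date field is stored as a real BSON Date rather than a string, and the helper name clicksPerDayPipeline and its parameters are made up for illustration. Only the pipeline is built here, so the snippet runs without a database.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: count clicks per day for one topic over the
// last `days` days, counted back from `now`.
function clicksPerDayPipeline(topic, days, now) {
  const since = new Date(now.getTime() - days * DAY_MS);
  return [
    { $match: { topic: topic, date: { $gte: since } } },
    { $group: {
        _id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
        click: { $sum: 1 }
    } },
    { $sort: { _id: 1 } },
    // Rename _id to date to match the output format asked for in the question.
    { $project: { _id: 0, date: "$_id", click: 1 } }
  ];
}

const pipeline = clicksPerDayPipeline("abc", 15, new Date("2019-05-16T00:00:00Z"));
console.log(JSON.stringify(pipeline[0]));
// {"$match":{"topic":"abc","date":{"$gte":"2019-05-01T00:00:00.000Z"}}}
```

In the shell the result would be passed along as db.yourCollection.aggregate(pipeline).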
Answer by anshuman burmman
To fetch data grouped by date in MongoDB:
db.getCollection('supportIssuesChat').aggregate([
    {
        $group : {
            _id : { $dateToString: { format: "%Y-%m-%d", date: "$createdAt" } },
            list: { $push: "$$ROOT" },   // keep the full documents of each day
            count: { $sum: 1 }
        }
    }
])
Answer by enricog
I haven't worked that much with MongoDB yet, so I am not completely sure. But aren't you able to use full JavaScript?
So you could parse your date with the JavaScript Date class, create the date for the day out of it, and use that as a key in an "out" property. Add one if the key already exists; otherwise create it with value = 1 (first click). Below is your code with an adapted reduce function (untested code!):
db.coll.group(
{
    key: {'date': true},
    initial: {retVal: {}},
    reduce: function(doc, prev) {
        var date = new Date(doc.date);
        // getMonth() is zero-based; the separators keep keys such as 2010-1-11 and 2010-11-1 distinct
        var dateKey = date.getFullYear() + '-' + (date.getMonth() + 1) + '-' + date.getDate();
        (typeof prev.retVal[dateKey] != 'undefined') ? prev.retVal[dateKey] += 1 : prev.retVal[dateKey] = 1;
    },
    cond: {topic: "abc"}
}
)
Answer by golja
Another late answer, but still. If you want to do it in only one iteration and get the number of clicks grouped by date and topic, you can use the following code:
db.coll.group(
{
    $keyf : function(doc) {
        // getMonth() is zero-based, so add 1 to get the calendar month
        return { "date" : doc.date.getDate()+"/"+(doc.date.getMonth()+1)+"/"+doc.date.getFullYear(),
                 "topic": doc.topic };
    },
    initial: {count:0},
    reduce: function(obj, prev) { prev.count++; }
})
Also, if you would like to optimize the query as suggested, you can use an integer value for the date (hint: use valueOf()) as the key instead of the String, though in my examples the speed was the same.
Furthermore, it's always wise to check the MongoDB docs regularly, because they keep adding new features all the time. For example, with the new Aggregation framework, which will be released in version 2.2, you can achieve the same results much more easily: http://docs.mongodb.org/manual/applications/aggregation/
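The valueOf() hint can be sketched as follows: instead of building a string, the key becomes the number of whole days since the Unix epoch. The function names dayNumber and dayNumberToDate are hypothetical, and the sketch assumes days are counted in UTC.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical key function: whole days since 1970-01-01 (UTC).
function dayNumber(date) {
  return Math.floor(date.valueOf() / DAY_MS);
}

// Turn the integer key back into a Date at UTC midnight.
function dayNumberToDate(n) {
  return new Date(n * DAY_MS);
}

const n = dayNumber(new Date("2010-12-27T18:51:22Z"));
console.log(n);                                // 14970
console.log(dayNumberToDate(n).toISOString()); // 2010-12-27T00:00:00.000Z
```

Inside a keyf function the same expression could be returned as the key, for example { day: Math.floor(doc.date.valueOf() / 86400000) }, provided doc.date is a Date object.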
Answer by phnessu4
Thanks to @mindthief, your answer helped solve my problem today. The function below makes grouping by day a little easier; I hope it helps others.
/**
 * group by day
 * @param query document {key1:123,key2:456}
 */
var count_by_day = function(query){
    return db.action.group(
    {
        keyf: function(doc) {
            var date = new Date(doc.time);
            var dateKey = (date.getMonth()+1)+"/"+date.getDate()+"/"+date.getFullYear();
            return {'date': dateKey};
        },
        cond: query,
        initial: {count:0},
        reduce: function(obj, prev) {
            prev.count++;
        }
    });
}
count_by_day({this:'is',the:'query'})
Answer by Blakes Seven
If you want a Date object returned directly
Then instead of applying the Date Aggregation Operators, apply "Date Math" to round the date object. This is often desirable, as all drivers represent a BSON Date in the form commonly used for date manipulation in each language where that is possible:
db.datetest.aggregate([
    { "$group": {
        "_id": {
            "$add": [
                { "$subtract": [
                    { "$subtract": [ "$date", new Date(0) ] },
                    { "$mod": [
                        { "$subtract": [ "$date", new Date(0) ] },
                        1000 * 60 * 60 * 24
                    ]}
                ]},
                new Date(0)
            ]
        },
        "click": { "$sum": 1 }
    }}
])
Or if, as is implied in the question, the grouping interval required is "buckets" of 15 days, then simply apply that to the numeric value in $mod:
db.datetest.aggregate([
    { "$group": {
        "_id": {
            "$add": [
                { "$subtract": [
                    { "$subtract": [ "$date", new Date(0) ] },
                    { "$mod": [
                        { "$subtract": [ "$date", new Date(0) ] },
                        1000 * 60 * 60 * 24 * 15
                    ]}
                ]},
                new Date(0)
            ]
        },
        "click": { "$sum": 1 }
    }}
])
The basic math applied is that when you $subtract two Date objects, the result returned is the difference between them in milliseconds, as a number. So the epoch is represented by Date(0) as the base for conversion in whatever language constructor you have.
With a numeric value, the "modulo" ( $mod ) is applied to round the date (subtract the remainder of the division) to the required interval. Being either:
1000 milliseconds * 60 seconds * 60 minutes * 24 hours = 1 day
Or
1000 milliseconds * 60 seconds * 60 minutes * 24 hours * 15 days = 15 days
So it's flexible to whatever interval you require.
By the same token, an $add operation between a "numeric" value and a Date object returns a Date object equivalent to the milliseconds value of both objects combined (the epoch is 0, therefore 0 plus the difference is the converted date).
This is easily represented and reproduced with the following listing:
var now = new Date();
var bulk = db.datetest.initializeOrderedBulkOp();
// insert one document per day for the next 60 days
for ( var x = 0; x < 60; x++ ) {
    bulk.insert({ "date": new Date( now.valueOf() + ( 1000 * 60 * 60 * 24 * x ))});
}
bulk.execute();
And running the second example with 15 day intervals:
{ "_id" : ISODate("2016-04-14T00:00:00Z"), "click" : 12 }
{ "_id" : ISODate("2016-03-30T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-03-15T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-02-29T00:00:00Z"), "click" : 15 }
{ "_id" : ISODate("2016-02-14T00:00:00Z"), "click" : 3 }
Or a similar distribution, depending on the current date when the listing is run. The 15 day intervals are always aligned to the epoch date, so they stay consistent between runs.
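The rounding that the pipeline performs can be checked outside MongoDB with the same arithmetic. The sketch below is plain JavaScript, not the aggregation itself, and the function name floorToInterval is made up; each step is commented with the operator it stands in for.

```javascript
const DAY_MS = 1000 * 60 * 60 * 24;

// Same arithmetic as the pipeline:
// (date - epoch) - ((date - epoch) % interval) + epoch
function floorToInterval(date, intervalMs) {
  const epoch = new Date(0);
  const sinceEpoch = date - epoch;                        // $subtract of two dates: milliseconds
  const rounded = sinceEpoch - (sinceEpoch % intervalMs); // $mod, then $subtract
  return new Date(epoch.getTime() + rounded);             // $add with a date: back to a Date
}

const sample = new Date("2016-04-20T10:15:00Z");
console.log(floorToInterval(sample, DAY_MS).toISOString());      // 2016-04-20T00:00:00.000Z
console.log(floorToInterval(sample, 15 * DAY_MS).toISOString()); // 2016-04-14T00:00:00.000Z
```

The second result falls on 2016-04-14, one of the bucket boundaries in the output above, because the 15 day buckets are counted from 1970-01-01.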
Using the "Math" method is a bit easier to tune, especially if you want to adjust time periods for different timezones in aggregation output, where you can similarly adjust numerically by adding or subtracting the offset from UTC in milliseconds.
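One way to picture the timezone adjustment is to shift the timestamp by the offset before rounding and shift it back afterwards. This is only a sketch in plain JavaScript under simplifying assumptions: a fixed offset with no daylight saving changes, and dates after the epoch. The helper name startOfLocalDay is made up.

```javascript
const DAY_MS = 1000 * 60 * 60 * 24;

// Hypothetical helper: start of the local day for a fixed UTC offset.
function startOfLocalDay(date, offsetMinutes) {
  const offsetMs = offsetMinutes * 60 * 1000;
  const shifted = date.getTime() + offsetMs;     // move into local time
  const floored = shifted - (shifted % DAY_MS);  // round down to local midnight
  return new Date(floored - offsetMs);           // move back to UTC
}

// 23:30 UTC on 14 April is already 15 April at UTC+2.
console.log(startOfLocalDay(new Date("2016-04-14T23:30:00Z"), 120).toISOString());
// 2016-04-14T22:00:00.000Z
```

In the pipeline the same shift would amount to adding the offset to the inner $subtract result before $mod and subtracting it again at the end.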
Answer by egvo
Of course, that is a good solution. Aside from that, you can group dates by day as strings (as that answer proposes), or you can get the beginning of each day by projecting the date field (in aggregation) like this:
{'$project': {
    // subtract the time-of-day part (in milliseconds) from the date
    'start_of_day': {'$subtract': [
        '$date',
        {'$add': [
            {'$multiply': [{'$hour': '$date'}, 3600000]},
            {'$multiply': [{'$minute': '$date'}, 60000]},
            {'$multiply': [{'$second': '$date'}, 1000]},
            {'$millisecond': '$date'}
        ]}
    ]},
}}
It gives you this:
{
    "start_of_day" : ISODate("2015-12-03T00:00:00.000Z")
},
{
    "start_of_day" : ISODate("2015-12-04T00:00:00.000Z")
}
It has some pluses: you can work with your days as a date type (not a number or string), you can use all of the date aggregation operators in the following aggregation stages, and you get a date type in the output.
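The arithmetic of that $project can be mirrored in plain JavaScript to see why it lands on midnight: the hour, minute, second and millisecond parts are converted to milliseconds and subtracted from the timestamp. This is a sketch using UTC getters, with a made-up function name startOfDay.

```javascript
// Mirrors the $project above: subtract the time-of-day parts from the date.
function startOfDay(date) {
  const timeOfDayMs =
    date.getUTCHours() * 3600000 +
    date.getUTCMinutes() * 60000 +
    date.getUTCSeconds() * 1000 +
    date.getUTCMilliseconds();
  return new Date(date.getTime() - timeOfDayMs);
}

console.log(startOfDay(new Date("2015-12-03T14:25:36.789Z")).toISOString());
// 2015-12-03T00:00:00.000Z
```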