Original source: http://stackoverflow.com/questions/28364319/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
MongoDB {aggregation $match} vs {find} speed
Asked by Owumaro
I have a MongoDB collection with millions of documents and I'm trying to optimize my queries. I'm currently using the aggregation framework to retrieve data and group it as I want. My typical aggregation query is something like: $match > $group > $group > $project
However, I noticed that the last stages only take a few ms; the beginning is the slowest part.
I tried to perform a query with only the $match filter, and then to perform the same query with collection.find. The aggregation query takes ~80ms while the find query takes 0 or 1ms.
I have indexes on pretty much every field, so I guess this isn't the problem. Any idea what could be going wrong? Or is it just a "normal" drawback of the aggregation framework?
I could use find queries instead of aggregation queries, but then I would have to do a lot of processing after the request, and that processing can be done quickly with $group etc., so I would rather keep the aggregation framework.
Thanks,
EDIT:
Here are my criteria:
{
    "action" : "click",
    "timestamp" : {
        "$gt" : ISODate("2015-01-01T00:00:00Z"),
        "$lt" : ISODate("2015-02-01T00:00:00Z")
    },
    "itemId" : "5"
}
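For reference, the comparison described in the question could be sketched roughly as below. This is only an illustration, not the asker's actual code: it assumes the official Node.js driver, a `collection` handle for a hypothetical collection, plain `Date` objects in place of `ISODate(...)`, and an upper bound of 2015-02-01. `timeMs` and `compareFindAndMatch` are made-up helper names.

```javascript
// Filter from the question; plain Date objects stand in for ISODate(...).
// The upper bound is assumed to be 2015-02-01.
const criteria = {
  action: "click",
  timestamp: {
    $gt: new Date("2015-01-01T00:00:00Z"),
    $lt: new Date("2015-02-01T00:00:00Z"),
  },
  itemId: "5",
};

// The same filter wrapped in a single-stage aggregation pipeline.
const matchOnlyPipeline = [{ $match: criteria }];

// Hypothetical helper: wall-clock time of an async operation, in milliseconds.
async function timeMs(operation) {
  const start = Date.now();
  await operation();
  return Date.now() - start;
}

// `collection` is assumed to be a Node.js driver Collection object.
async function compareFindAndMatch(collection) {
  const findMs = await timeMs(() => collection.find(criteria).toArray());
  const aggregateMs = await timeMs(() =>
    collection.aggregate(matchOnlyPipeline).toArray()
  );
  return { findMs, aggregateMs };
}
```

Calling toArray() on both cursors makes each call fetch every matching document, so the two timings cover the same amount of work.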
Accepted answer by vladzam
The main purpose of the aggregation framework is to ease querying a large number of entries and to generate a small number of results that hold value to you.
As you have said, you can also use multiple find queries, but remember that you cannot create new fields with find queries. On the other hand, the $group stage allows you to define new fields.
If you would like to reproduce the functionality of the aggregation framework, you would most likely have to run an initial find (or chain several), pull that data, and further manipulate it with a programming language.
The aggregation pipeline might seem to take longer, but at least you know you only have to take into account the performance of one system: the MongoDB engine.
Whereas, when it comes to manipulating the data returned from a find query, you would most likely have to process it further with a programming language, which adds complexity that depends on the intricacies of the language you choose.
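To make the trade-off concrete, here is a minimal sketch of what doing a $group in application code might look like. It assumes you only want a count per itemId; the `clicks` field name, the `countByItemId` helper and the sample documents are made up for illustration.

```javascript
// The server-side stage this sketch imitates (field name `clicks` is hypothetical).
const groupStage = { $group: { _id: "$itemId", clicks: { $sum: 1 } } };

// Client-side equivalent: count the documents returned by find, per itemId.
function countByItemId(documents) {
  const counts = new Map();
  for (const doc of documents) {
    counts.set(doc.itemId, (counts.get(doc.itemId) || 0) + 1);
  }
  // Shape the result like the output of the $group stage above.
  return Array.from(counts, ([itemId, clicks]) => ({ _id: itemId, clicks }));
}

// Made-up sample of what a find query might return.
const sampleDocs = [
  { itemId: "5", action: "click" },
  { itemId: "7", action: "click" },
  { itemId: "5", action: "click" },
];

console.log(countByItemId(sampleDocs));
```

With find, every matching document has to travel to the client before this loop can run, whereas with $group only the grouped results are returned.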
Answer by harshad
Have you tried using explain() on your find queries? It will give you a good idea of exactly how long the find() query takes. You can do the same for the $match stage by running the aggregation with explain, and see whether there is any difference in index usage or other parameters.
Also, the $group stage of the aggregation framework doesn't use indexes, so it has to process all the records returned by the $match stage. So to better understand how your query works, look at the result set it returns and whether it fits into memory to be processed by MongoDB.
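As a rough sketch of the explain suggestion, assuming a recent official Node.js driver, where both the find cursor and the aggregation cursor expose explain(): `explainBoth` is a made-up helper name and `collection` is assumed to be a driver Collection object.

```javascript
// Hypothetical helper: fetch the query plans for the same filter run
// through find and through a single-stage $match aggregation.
async function explainBoth(collection, criteria) {
  const findPlan = await collection
    .find(criteria)
    .explain("executionStats");
  const aggregatePlan = await collection
    .aggregate([{ $match: criteria }])
    .explain("executionStats");
  return { findPlan, aggregatePlan };
}
```

In the two outputs you would typically check whether the winning plan is an index scan (IXSCAN) or a collection scan (COLLSCAN), and how many documents were examined. The exact shape of the aggregation output differs between server versions, so it is returned here unprocessed.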