$skip and $limit in the MongoDB aggregation framework
Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/24160037/
$skip and $limit in aggregation framework
Asked by yaoxing
While reading the documentation, I found the following note:
When a $sort immediately precedes a $limit in the pipeline, the $sort operation only maintains the top n results as it progresses, where n is the specified limit, and MongoDB only needs to store n items in memory. This optimization still applies when allowDiskUse is true and the n items exceed the aggregation memory limit.
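To make the quoted note concrete, here is a minimal sketch of a "top n" sort in plain JavaScript. It only illustrates the general idea of keeping n items while scanning; it is not MongoDB's actual implementation, and `topN` and `compare` are hypothetical names introduced for this example.

```javascript
// Illustrative only: keep just the best n items while scanning the input.
// `topN` and `compare` are hypothetical names, not MongoDB APIs.
function topN(items, n, compare) {
  const best = []; // always sorted, never longer than n
  if (n <= 0) return best;
  for (const item of items) {
    // Skip items that cannot displace anything in a full buffer.
    if (best.length === n && compare(item, best[n - 1]) >= 0) continue;
    // Find the insertion point that keeps `best` sorted.
    let i = best.length;
    while (i > 0 && compare(item, best[i - 1]) < 0) i--;
    best.splice(i, 0, item);
    // Drop the item that just fell out of the top n.
    if (best.length > n) best.pop();
  }
  return best;
}

const scores = [5, 1, 9, 3, 7, 2, 8];
const descending = (a, b) => b - a;
console.log(topN(scores, 3, descending)); // [ 9, 8, 7 ]
```

However long the input is, the buffer never holds more than n items, which is the memory saving the note describes.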
If I understand this correctly, it applies only when $sort and $limit are used together, like this:
db.coll.aggregate([
    ...,
    {$sort: ...},
    {$limit: limit},
    ...
]);
However, I think most of the time we would have something like this:
db.coll.aggregate([
    ...,
    {$sort: ...},
    {$skip: skip},
    {$limit: limit},
    ...
]);
Question 1: Does it mean the rule above doesn't apply if I use $skip here?
I ask because, in theory, MongoDB could still compute the top n records (with n = skip + limit) and improve performance by sorting only those. I couldn't find any documentation about this, though. And if the rule doesn't apply,
Question 2: Do I need to change my query to the following to enhance performance?
db.coll.aggregate([
    ...,
    {$sort: ...},
    {$limit: skip + limit},
    {$skip: skip},
    {$limit: limit},
    ...
]);
EDIT: I think explaining my use case will make the question above make more sense. I'm using the text search feature provided by MongoDB 2.6 to look for products. I'm worried that if the user enters a very common keyword like "red", too many results will be returned. So I'm looking for a better way to generate this result.
EDIT2: It turns out that the last snippet above is equivalent to
db.coll.aggregate([
    ...,
    {$sort: ...},
    {$limit: skip + limit},
    {$skip: skip},
    ...
]);
Thus we can always use this form to make the top n rule apply.
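The equivalence claimed in EDIT2 can be checked on a plain array. The sketch below uses hypothetical stand-ins (`sortStage`, `skipStage`, `limitStage`) that mimic the stages on an in-memory array; they are not MongoDB APIs and say nothing about how the server executes the pipeline.

```javascript
// Plain-array stand-ins for the pipeline stages (hypothetical helpers, not MongoDB APIs).
const sortStage = (docs, compare) => [...docs].sort(compare);
const skipStage = (docs, n) => docs.slice(n);
const limitStage = (docs, n) => docs.slice(0, n);

// Form from the question: $sort, then $skip, then $limit.
function skipThenLimit(docs, compare, skip, limit) {
  return limitStage(skipStage(sortStage(docs, compare), skip), limit);
}

// Rewritten form: $sort, then $limit: skip + limit, then $skip.
function limitThenSkip(docs, compare, skip, limit) {
  return skipStage(limitStage(sortStage(docs, compare), skip + limit), skip);
}

const docs = [4, 8, 1, 9, 3, 7, 2, 6, 5, 0];
const ascending = (a, b) => a - b;
console.log(skipThenLimit(docs, ascending, 3, 4)); // [ 3, 4, 5, 6 ]
console.log(limitThenSkip(docs, ascending, 3, 4)); // [ 3, 4, 5, 6 ]
```

Both forms return the same page; only the second places a $limit directly after the $sort.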
Answered by Neil Lunn
Since we are talking about a text search query, the optimal form is this:
db.collection.aggregate([
    {
        "$match": {
            "$text": { "$search": "cake tea" }
        }
    },
    { "$sort": { "score": { "$meta": "textScore" } } },
    { "$limit": skip + limit },
    { "$skip": skip }
])
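As a rough sketch of how `skip` and `limit` might be derived in practice, the helper below builds the same pipeline from a 1-based page number and a page size. `textSearchPage`, `page` and `pageSize` are hypothetical names introduced for this example; the stages themselves are the ones shown above.

```javascript
// Hypothetical helper: builds the pipeline above for a 1-based page number.
// `textSearchPage`, `page` and `pageSize` are illustrative names only.
function textSearchPage(searchTerms, page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  const skip = (page - 1) * pageSize;
  const pipeline = [
    { $match: { $text: { $search: searchTerms } } },
    { $sort: { score: { $meta: "textScore" } } },
    // $limit directly follows $sort, so the sort only has to keep skip + pageSize documents.
    { $limit: skip + pageSize },
  ];
  // The first page has nothing to skip, so the $skip stage is left out.
  if (skip > 0) pipeline.push({ $skip: skip });
  return pipeline;
}

console.log(JSON.stringify(textSearchPage("cake tea", 3, 10)));
```

The result would be passed to `db.collection.aggregate(...)`. For page 3 with 10 results per page the sort has to keep 30 documents, and that number grows with every further page.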
The memory saving from keeping only the top "sort" results works only within its own "limit", so to speak: the sort still has to keep skip + limit documents, so this will not be optimal for anything beyond a few reasonable "pages" of data.
Beyond what is reasonable for memory consumption, the additional stage will likely have a negative effect rather than a positive one.
These really are the practical limitations of the text search capabilities available in MongoDB in its current form. For anything more detailed or requiring more performance, just as with many SQL "full text" solutions, you are better off using an external, purpose-built text search solution.