Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4955160/
Time: 2020-09-09 11:58:06  Source: igfitidea

Mongo $in operator performance

mongodb

Asked by Derek Dahmer

Is it slow/poor form to use the $in operator in MongoDB with a large array of possibilities?

posts.find({
    author : {
        $in : ['friend1', 'friend2', 'friend3', /* ... */ 'friend40']
    }
})

App Engine, for example, won't let you use more than 30 because they translate directly to one query per item in the IN array, and so instead force you into using their method for handling fan out. While that's probably the most efficient method in Mongo too, the code for it is significantly more complex so I'd prefer to just use this generic method.

Will Mongo execute these $in queries efficiently for reasonable-sized datasets?

Accepted answer by Scott Hernandez

$in can be fairly efficient with small lists (it is hard to say exactly what counts as small, but at least into the tens or hundreds of values). It does not work like App Engine, since MongoDB has actual B-tree indexes and isn't a column store like Bigtable.

With $in it will skip around in the index to find the matching documents, or walk through the whole collection if there isn't an index to use.

Answer by Ming

Assuming you have created an index on the author field, from an algorithmic point of view the time complexity of the $in operation is O(N*log(M)), where N is the length of the input array and M is the size of the collection.

The time complexity of the $in operation will not change unless you change databases (though I don't think any database can beat O(N*log(M))).
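
The O(N*log(M)) estimate above can be illustrated with a small model. This is only a sketch under stated assumptions: a sorted array stands in for the B-tree index on author, numeric keys stand in for author names, and the function names (lowerBound, simulateIn) are made up for this example. It does not reflect MongoDB's actual internals.

```javascript
// Illustrative model only: a sorted array stands in for a B-tree index.
// This is NOT how MongoDB is implemented internally.

// Binary search for the first position whose key is >= target.
// Returns that position and the number of key comparisons made.
function lowerBound(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length;
  let comparisons = 0;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    comparisons += 1;
    if (sortedKeys[mid] < target) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return { position: lo, comparisons };
}

// Simulate an indexed $in: one seek per value in the input list,
// then read forward over the run of equal keys.
function simulateIn(sortedKeys, values) {
  let totalComparisons = 0;
  const matches = [];
  for (const value of values) {
    const { position, comparisons } = lowerBound(sortedKeys, value);
    totalComparisons += comparisons;
    for (let i = position; i < sortedKeys.length && sortedKeys[i] === value; i++) {
      matches.push(i);
    }
  }
  return { matches, totalComparisons };
}

// M = 1,000,000 index entries (unique keys 0..M-1), N = 40 values to look up.
const M = 1000000;
const sortedKeys = Array.from({ length: M }, (_, i) => i);
const values = Array.from({ length: 40 }, (_, i) => i * 25000);

const result = simulateIn(sortedKeys, values);
console.log(result.matches.length, result.totalComparisons);
```

In this model each of the 40 lookups needs at most 20 comparisons (about log2 of 1,000,000), so the whole $in costs at most 800 comparisons, whereas a scan without an index would touch all 1,000,000 entries.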

However, from an engineering point of view, if N grows large, it is better to let your business logic server simulate the $in operation, either in batches or one value at a time.

This is simply because: memory in database servers is way more valuable than the memory in business logic servers.
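
A minimal sketch of the batching idea described above, assuming the application splits the list itself and merges the results. The helper names (chunk, buildInFilters, findInBatches) and the findFn callback are hypothetical, introduced only for illustration; findFn stands for whatever wrapper you have around your driver's find call.

```javascript
// Split a list into consecutive batches of at most `batchSize` items.
function chunk(values, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < values.length; i += batchSize) {
    batches.push(values.slice(i, i + batchSize));
  }
  return batches;
}

// Build one filter document per batch, each with a smaller $in list.
function buildInFilters(field, values, batchSize) {
  return chunk(values, batchSize).map((batch) => ({ [field]: { $in: batch } }));
}

// Run the batches one after another and merge the results in the application.
// `findFn` is a hypothetical callback: it takes a filter document and returns
// (a promise of) an array of documents.
async function findInBatches(findFn, field, values, batchSize) {
  const results = [];
  for (const filter of buildInFilters(field, values, batchSize)) {
    const docs = await findFn(filter);
    for (const doc of docs) {
      results.push(doc);
    }
  }
  return results;
}

// Example with 40 hypothetical author names, 15 per batch.
const authors = Array.from({ length: 40 }, (_, i) => `friend${i + 1}`);
const filters = buildInFilters("author", authors, 15);
console.log(filters.length); // 3 batches: 15 + 15 + 10
```

Note that sorting and limits no longer apply across the whole result once the query is split, so the application has to re-sort or trim the merged list itself if it needs that.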

Answer by Alex

If you build an index (ensureIndex, called createIndex in current MongoDB versions) on the field you are matching against, it should be pretty quick.

Have you tried using explain()? It's a good, built-in way to profile your queries: http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ#IndexingAdviceandFAQ-Use%7B%7Bexplain%7D%7D
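
As a rough sketch of how explain() output might be checked programmatically: in the mongo shell the object would come from something like db.posts.find({ author: { $in: [...] } }).explain(). The helpers below (collectStages, usesIndex) are hypothetical names, and they assume the queryPlanner/winningPlan layout with IXSCAN and COLLSCAN stage names. Other server versions may report a different structure, and the sample object is hand-written for illustration, not captured from a real server.

```javascript
// Collect the stage names from an explain() plan tree, outermost first.
// Assumes each stage may have an `inputStage` (single child) or
// `inputStages` (several children).
function collectStages(plan, stages = []) {
  if (!plan || typeof plan !== "object") return stages;
  if (typeof plan.stage === "string") stages.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, stages);
  if (Array.isArray(plan.inputStages)) {
    for (const child of plan.inputStages) collectStages(child, stages);
  }
  return stages;
}

// True if the winning plan reads an index and never scans the whole collection.
// Returns false when the expected queryPlanner layout is missing.
function usesIndex(explainOutput) {
  const stages = collectStages(explainOutput?.queryPlanner?.winningPlan);
  return stages.includes("IXSCAN") && !stages.includes("COLLSCAN");
}

// Hand-written sample shaped like explain() output; NOT from a real server.
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", keyPattern: { author: 1 } },
    },
  },
};

console.log(collectStages(sampleExplain.queryPlanner.winningPlan).join(" -> ")); // FETCH -> IXSCAN
console.log(usesIndex(sampleExplain)); // true
```

If the plan shows a collection scan instead of an index scan for the $in query, that suggests the index on the queried field is missing or not being chosen.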