Original question: http://stackoverflow.com/questions/7658228/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
MongoDB 'count()' is very slow. How do we improve or work around it?
Asked by Winston Chen
I am currently using MongoDB with millions of data records. I discovered one thing that's pretty annoying.
When I use the 'count()' function on a query that matches a small number of records, it's very fast. However, when the query matches thousands or even millions of records, the entire system becomes very slow.
I made sure that I have indexed the required fields.
Has anybody encountered the same thing? How do you improve it?
Accepted answer by Andrew Orsich
There is no other optimization than creating a proper index:
db.users.ensureIndex({name:1});
db.users.find({name:"Andrei"}).count();
If you need counters, I suggest precalculating them whenever possible, using the atomic $inc operation, and not using count({}) at all.
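As a rough illustration of the precalculated-counter idea, here is a minimal sketch in plain JavaScript. It models the counter in memory so it runs without a database. The `counters` map, the `users:<name>` key format, and the helper names are all hypothetical and not from the original answer. Against a real server, the `inc` helper would presumably become an upserting `$inc` update issued alongside each insert or delete.

```javascript
// In-memory stand-in for a hypothetical "counters" collection.
// Assumption: with a real server, inc() would be something like
//   db.counters.updateOne({_id: key}, {$inc: {n: by}}, {upsert: true})
const counters = new Map();

// Models an upserting $inc: create the counter if missing, then add `by`.
function inc(key, by) {
  counters.set(key, (counters.get(key) || 0) + by);
}

// Stand-in for the users collection.
const users = [];

// Every write that changes the data also adjusts the matching counter.
function insertUser(doc) {
  users.push(doc);
  inc("users:" + doc.name, 1);
}

function removeUser(name) {
  const i = users.findIndex(u => u.name === name);
  if (i === -1) return false; // nothing removed, counter untouched
  users.splice(i, 1);
  inc("users:" + name, -1);
  return true;
}

// Reading the count is a single lookup instead of a scan over matches.
function countByName(name) {
  return counters.get("users:" + name) || 0;
}

insertUser({ name: "Andrei" });
insertUser({ name: "Andrei" });
insertUser({ name: "Winston" });
removeUser("Andrei");
console.log(countByName("Andrei"));
```

The trade-off is that every code path that inserts or deletes documents has to remember to adjust the counter; otherwise the stored value drifts away from the real count.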
That said, the MongoDB team is working on this: according to the JIRA ticket, count({}) improvements are planned for MongoDB 2.1.
Answer by kamaradclimber
You can ensure that the index is really used without any disk access.
Let's say you want to count records with name : "Andrei"
Create an index on name (as you've done) and run:
db.users.find({name:"Andrei"}, {_id:0, name:1}).count()
You can check that this is the fastest way to count (other than precomputing) by checking whether
db.users.find({name:"Andrei"}, {_id:0, name:1}).explain()
displays an indexOnly field set to true.
This trick ensures that your query retrieves records only from RAM (the index) and not from disk.
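On newer servers the explain output is laid out differently from the flag mentioned above, so the same check has to be phrased differently. The sketch below is an assumption-laden illustration: it takes an object shaped like the result of `explain("executionStats")` and reports whether the query appears to have been answered from the index alone (no documents examined and no FETCH stage). The helper names and the sample objects are hand-written for illustration and are not real server output.

```javascript
// Returns true if any stage in the plan tree has the given stage name.
function hasStage(plan, name) {
  if (!plan || typeof plan !== "object") return false;
  if (plan.stage === name) return true;
  const children = [];
  if (plan.inputStage) children.push(plan.inputStage);
  if (Array.isArray(plan.inputStages)) children.push(...plan.inputStages);
  return children.some(child => hasStage(child, name));
}

// Hypothetical helper: treats a query as "covered" when no documents
// were examined and the executed plan contains no FETCH stage.
function looksCovered(explainOutput) {
  const stats = explainOutput.executionStats;
  if (!stats) return false; // explain was not run in executionStats mode
  return stats.totalDocsExamined === 0 &&
         !hasStage(stats.executionStages, "FETCH");
}

// Hand-written sample shapes (illustrative only, not real output).
const coveredSample = {
  executionStats: {
    totalKeysExamined: 3,
    totalDocsExamined: 0,
    executionStages: {
      stage: "PROJECTION_COVERED",
      inputStage: { stage: "IXSCAN" }
    }
  }
};

const fetchedSample = {
  executionStats: {
    totalKeysExamined: 3,
    totalDocsExamined: 3,
    executionStages: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN" }
    }
  }
};

console.log(looksCovered(coveredSample));
console.log(looksCovered(fetchedSample));
```

In a shell session the input would presumably come from something like `db.users.find({name:"Andrei"}, {_id:0, name:1}).explain("executionStats")`.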
Answer by Vaclav Kohout
For me the solution was to change the index to sparse. It depends on your specific situation, so just give it a try if you can.
db.Account.createIndex( { "dateChecked": 1 }, { sparse: true } )
db.Account.find({
"dateChecked" : { $exists : true }
}).count()
318 thousand records in the collection:
- 0.31 sec - with sparse index
- 0.79 sec - with non-sparse index
Answer by Travis Reeder
You are pretty much out of luck for now: count in MongoDB is awful and won't be getting better in the near future. See: https://jira.mongodb.org/browse/SERVER-1752
From experience, you should pretty much never use it unless it's a one time thing, something that occurs very rarely, or your database is pretty small.
As @Andrew Orsich stated, use counters whenever possible (the downside of counters is the global write lock, but they are still better than count() regardless).