MongoDB: Overflow sort stage buffered data usage exceeds internal limit
Disclaimer: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/27023622/
Overflow sort stage buffered data usage exceeds internal limit
Asked by sheetal_158
Using the code:
all_reviews = db_handle.find().sort('reviewDate', pymongo.ASCENDING)
print all_reviews.count()
print all_reviews[0]
print all_reviews[2000000]
The count prints 2043484, and it prints all_reviews[0].
However, when printing all_reviews[2000000], I get the error:
pymongo.errors.OperationFailure: database error: Runner error: Overflow sort stage buffered data usage of 33554495 bytes exceeds internal limit of 33554432 bytes
How do I handle this?
Answered by A. Jesse Jiryu Davis
You're running into the 32MB limit on an in-memory sort:
https://docs.mongodb.com/manual/reference/limits/#Sort-Operations
Add an index to the sort field. That allows MongoDB to stream documents to you in sorted order, rather than attempting to load them all into memory on the server and sort them in memory before sending them to the client.
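In pymongo, for example, the index could be created as in the sketch below (assuming db_handle is the collection from the question):

import pymongo

# Build an ascending index on the sort field once; MongoDB can then walk
# the index and stream documents in order instead of sorting in memory.
db_handle.create_index([("reviewDate", pymongo.ASCENDING)])

# The original query now sorts via the index.
all_reviews = db_handle.find().sort("reviewDate", pymongo.ASCENDING)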
Answered by JERRY
As kumar_harsh said in the comments section, I would like to add another point.
You can view the current buffer limit using the command below on the admin database:
> use admin
switched to db admin
> db.runCommand( { getParameter : 1, "internalQueryExecMaxBlockingSortBytes" : 1 } )
{ "internalQueryExecMaxBlockingSortBytes" : 33554432, "ok" : 1 }
It has a default value of 32 MB (33554432 bytes). In this case you're running out of buffer space, so you can raise the limit to a value of your choosing, for example roughly 50 MB as below:
> db.adminCommand({setParameter: 1, internalQueryExecMaxBlockingSortBytes:50151432})
{ "was" : 33554432, "ok" : 1 }
We can also set this limit permanently with the parameter below in the MongoDB config file:
setParameter=internalQueryExecMaxBlockingSortBytes=309715200
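With the newer YAML config file format the equivalent would look roughly like this (a sketch; merge it into your existing mongod.conf):

setParameter:
  internalQueryExecMaxBlockingSortBytes: 309715200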
Hope this helps!
Note: This command is only supported on version 3.0 and later.
Answered by sheetal_158
Solved with indexing:
db_handle.ensure_index([("reviewDate", pymongo.ASCENDING)])
Answered by poroszd
If you want to avoid creating an index (e.g. you just want a quick-and-dirty check to explore the data), you can use aggregation with disk usage:
all_reviews = db_handle.aggregate([{$sort: {'reviewDate': 1}}], {allowDiskUse: true})
(Not sure how to do this in pymongo, though).
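For what it's worth, recent pymongo versions do accept the option as a keyword argument to aggregate(); a minimal sketch, assuming db_handle is the collection:

# allowDiskUse lets the sort stage spill to temporary files on disk
# instead of failing once it exceeds the in-memory limit.
all_reviews = db_handle.aggregate(
    [{"$sort": {"reviewDate": 1}}],
    allowDiskUse=True,
)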
Answered by wytten
JavaScript API syntax for the index:
db_handle.ensureIndex({reviewDate: 1})
Answered by shilovk
In my case, it was necessary to define the required indexes in code and recreate them:
rake db:mongoid:create_indexes RAILS_ENV=production
The memory overflow does not occur when the field has the required index.
PS: Before this I had to disable the errors when creating long indexes:
# mongo
MongoDB shell version: 2.6.12
connecting to: test
> db.getSiblingDB('admin').runCommand( { setParameter: 1, failIndexKeyTooLong: false } )
You may also need to run reIndex:
# mongo
MongoDB shell version: 2.6.12
connecting to: test
> use your_db
switched to db your_db
> db.getCollectionNames().forEach( function(collection){ db[collection].reIndex() } )