Original source: http://stackoverflow.com/questions/24199729/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
pymongo.errors.CursorNotFound: cursor id '...' not valid at server
Asked by snake plissken
I am trying to fetch some ids that exist in a mongo database with the following code:
from pymongo import MongoClient

client = MongoClient('xx.xx.xx.xx', xxx)
db = client.test_database
db = client['...']
collection = db.test_collection
collection = db["..."]
for cursor in collection.find({ "$and" : [{ "followers" : { "$gt" : 2000 } }, { "followers" : { "$lt" : 3000 } }, { "list_followers" : { "$exists" : False } }] }):
    print(cursor['screenname'])
    print(cursor['_id']['uid'])
    id = cursor['_id']['uid']
However, after a short while, I receive this error:
pymongo.errors.CursorNotFound: cursor id '...' not valid at server.
I found this article, which refers to that problem. Nevertheless, it is not clear to me which solution to take. Is it possible to use find().batch_size(30)? What exactly does the above command do? Can I take all the database ids using batch_size?
Accepted answer by Christian P
You're getting this error because the cursor is timing out on the server (after 10 minutes of inactivity).
From the pymongo documentation:
Cursors in MongoDB can timeout on the server if they've been open for a long time without any operations being performed on them. This can lead to an CursorNotFound exception being raised when attempting to iterate the cursor.
When you call the collection.find method, it queries a collection and returns a cursor to the documents. To get the documents, you iterate the cursor. As you iterate, the driver makes requests to the MongoDB server to fetch more data. The amount of data returned in each request is set by the batch_size() method.
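To make the batching concrete, here is a toy model of what happens during iteration. It is not PyMongo code: `iterate_in_batches` and its arguments are hypothetical names, and a plain list stands in for the server. The only point is that documents are handed out one at a time while the data arrives in chunks of `batch_size`.

```python
def iterate_in_batches(documents, batch_size):
    """Toy stand-in for a cursor (not PyMongo): yields (document, fetch_number)."""
    fetch_number = 0
    for start in range(0, len(documents), batch_size):
        # One simulated round trip to the "server" per batch.
        batch = documents[start:start + batch_size]
        fetch_number += 1
        for doc in batch:
            yield doc, fetch_number


docs = [{"n": i} for i in range(7)]
fetches = [fetch for _, fetch in iterate_in_batches(docs, batch_size=3)]
print(fetches)  # [1, 1, 1, 2, 2, 2, 3]
```

In the real driver the server-side cursor has to survive between those fetches, which is why slow processing of a single batch can end in CursorNotFound.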
From the documentation:
Limits the number of documents returned in one batch. Each batch requires a round trip to the server. It can be adjusted to optimize performance and limit data transfer.
Setting the batch_size to a lower value will help you with the timeout errors, but it will increase the number of times you have to access the MongoDB server to get all the documents.
The default batch size:
For most queries, the first batch returns 101 documents or just enough documents to exceed 1 megabyte. Batch size will not exceed the maximum BSON document size (16 MB).
There is no universal "right" batch size. You should test with different values and see which is appropriate for your use case, i.e. how many documents you can process in a 10-minute window.
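As a starting point for that testing, the arithmetic can be sketched as follows. This is only a back-of-the-envelope estimate: the function names, the measured `seconds_per_document`, and the `safety_factor` of 0.5 are assumptions made for illustration, and the round-trip count is approximate.

```python
import math

CURSOR_IDLE_TIMEOUT_SECONDS = 10 * 60  # the 10-minute window mentioned above


def largest_safe_batch_size(seconds_per_document, safety_factor=0.5):
    """Largest batch that fits in a fraction of the idle timeout (assumed margin)."""
    if seconds_per_document <= 0:
        raise ValueError("seconds_per_document must be positive")
    budget = CURSOR_IDLE_TIMEOUT_SECONDS * safety_factor
    return max(1, math.floor(budget / seconds_per_document))


def approximate_round_trips(total_documents, batch_size):
    """Rough number of batches needed to read total_documents."""
    return math.ceil(total_documents / batch_size)


batch = largest_safe_batch_size(seconds_per_document=2.0)
print(batch)                                  # 150
print(approximate_round_trips(10000, batch))  # 67
```

A smaller batch leaves more headroom before the timeout but costs more round trips, which is the trade-off described above.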
As a last resort, you can set no_cursor_timeout=True. But you need to be sure that the cursor is closed after you finish processing the data.
How to avoid it without try/except:
cursor = collection.find(
    {"x": 1},
    no_cursor_timeout=True
)
for doc in cursor:
    # do something with doc
    pass
cursor.close()
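Note that in the snippet above cursor.close() is skipped if the loop body raises. One way to guarantee the close, still without writing try/except yourself, is `contextlib.closing`. The sketch below assumes only that the cursor has a close() method, as PyMongo cursors do; `process_all` and `handle` are hypothetical names, and `collection` is expected to be a PyMongo collection.

```python
from contextlib import closing


def process_all(collection, query, handle):
    """Call handle(doc) for every match and always close the cursor."""
    # no_cursor_timeout=True is the find() option discussed above.
    with closing(collection.find(query, no_cursor_timeout=True)) as cursor:
        for doc in cursor:
            handle(doc)
```

It could be used as `process_all(collection, {"x": 1}, print)`; the cursor is closed when the block exits, whether normally or through an exception.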
Answer by Mani
You can prevent the cursor from timing out by using no_cursor_timeout=True, like this:
cursor = db.images.find({}, {'id': 1, 'image_path': 1, '_id': 0}, no_cursor_timeout=True)
for i in cursor:
    # .....
    # .....
    pass
cursor.close()  # close it, otherwise the cursor stays open and keeps using server resources
Earlier this option was called timeout, which has been replaced as per the docs. For the methods that support no_cursor_timeout, refer to the pymongo docs.
Answer by HISI
You were using the cursor for longer than the timeout (about 10 minutes), so the cursor no longer exists on the server.
You should choose a low value of batch_size to fix the issue:
(with Pymongo for example)
col.find({}).batch_size(10)
or
set the timeout to false with col.find(timeout=False) and don't forget to close the cursor at the end. (timeout is the older name of this option; as noted in the answer above, it has since been replaced by no_cursor_timeout=True.)