MongoDB - Error: getMore command failed: Cursor not found
Original question: http://stackoverflow.com/questions/44248108/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Chava Sobreyra
I need to create a new field sid on each document in a collection of about 500K documents. Each sid is unique and based on that record's existing roundedDate and stream fields.
I'm doing so with the following code:
var cursor = db.getCollection('snapshots').find();
var iterated = 0;
var updated = 0;

while (cursor.hasNext()) {
    var doc = cursor.next();

    if (doc.stream && doc.roundedDate && !doc.sid) {
        db.getCollection('snapshots').update({ "_id": doc['_id'] }, {
            $set: {
                sid: doc.stream.valueOf() + '-' + doc.roundedDate,
            }
        });

        updated++;
    }

    iterated++;
};

print('total ' + cursor.count() + ' iterated through ' + iterated + ' updated ' + updated);
It works well at first, but after a few hours and about 100K records it errors out with:
Error: getMore command failed: {
    "ok" : 0,
    "errmsg": "Cursor not found, cursor id: ###",
    "code": 43,
}: ...
Answered by Danziger
EDIT - Query performance:
As @NeilLunn pointed out in his comments, you should not be filtering the documents manually, but use .find(...) for that instead:
db.snapshots.find({
    roundedDate: { $exists: true },
    stream: { $exists: true },
    sid: { $exists: false }
})
Also, using .bulkWrite(), available from MongoDB 3.2, will be far more performant than doing individual updates.
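For illustration, a sketch of how that filtered query and .bulkWrite() could be combined in the mongo shell (the chunk size of 1000 operations is an arbitrary choice):

var ops = [];

db.snapshots.find({
    roundedDate: { $exists: true },
    stream: { $exists: true },
    sid: { $exists: false }
}).forEach(function (doc) {
    // Queue one updateOne operation per matching document.
    ops.push({
        updateOne: {
            filter: { _id: doc._id },
            update: { $set: { sid: doc.stream.valueOf() + '-' + doc.roundedDate } }
        }
    });

    // Flush periodically so a single bulkWrite() doesn't grow unbounded.
    if (ops.length === 1000) {
        db.snapshots.bulkWrite(ops, { ordered: false });
        ops = [];
    }
});

if (ops.length > 0) {
    db.snapshots.bulkWrite(ops, { ordered: false });
}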
It is possible that, with that, you are able to execute your query within the 10-minute lifetime of the cursor. If it still takes longer than that, your cursor will expire and you will have the same problem anyway, which is explained below:
What is going on here:
Error: getMore command failed may be due to a cursor timeout, which is related to two cursor attributes:
Timeout limit, which is 10 minutes by default. From the docs:
By default, the server will automatically close the cursor after 10 minutes of inactivity, or if client has exhausted the cursor.
Batch size, which is 101 documents or 16 MB for the first batch, and 16 MB, regardless of the number of documents, for subsequent batches (as of MongoDB 3.4). From the docs:
find() and aggregate() operations have an initial batch size of 101 documents by default. Subsequent getMore operations issued against the resulting cursor have no default batch size, so they are limited only by the 16 megabyte message size.
Probably you are consuming those initial 101 documents and then getting a 16 MB batch, which is the maximum, with a lot more documents. As it is taking more than 10 minutes to process them, the cursor on the server times out and, by the time you are done processing the documents in the second batch and request a new one, the cursor is already closed:
As you iterate through the cursor and reach the end of the returned batch, if there are more results, cursor.next() will perform a getMore operation to retrieve the next batch.
Possible solutions:
I see 5 possible ways to solve this: 3 good ones, with their pros and cons, and 2 bad ones:
Reducing the batch size to keep the cursor alive.
Remove the timeout from the cursor.
Retry when the cursor expires.
Query the results in batches manually.
Get all the documents before the cursor expires.
Note they are not numbered following any specific criteria. Read through them and decide which one works best for your particular case.
1. Reducing the batch size to keep the cursor alive
One way to solve that is to use cursor.batchSize to set the batch size on the cursor returned by your find query to match what you can process within those 10 minutes:
const cursor = db.collection.find()
    .batchSize(NUMBER_OF_DOCUMENTS_IN_BATCH);
However, keep in mind that setting a very conservative (small) batch size will probably work, but will also be slower, as now you need to access the server more times.
On the other hand, setting it to a value too close to the number of documents you can process in 10 minutes means that, if some iterations take a bit longer to process for any reason (other processes may be consuming more resources), the cursor will expire anyway and you will get the same error again.
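As a concrete sketch, combining the filtered query from above with an explicit batch size (the value 1000 is only an assumed starting point, to be tuned against your actual processing speed):

const cursor = db.snapshots.find({
    roundedDate: { $exists: true },
    stream: { $exists: true },
    sid: { $exists: false }
}).batchSize(1000);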
2. Remove the timeout from the cursor
Another option is to use cursor.noCursorTimeout to prevent the cursor from timing out:
const cursor = db.collection.find().noCursorTimeout();
This is considered a bad practice as you would need to close the cursor manually or exhaust all its results so that it is automatically closed:
After setting the noCursorTimeout option, you must either close the cursor manually with cursor.close() or by exhausting the cursor's results.
As you want to process all the documents in the cursor, you wouldn't need to close it manually, but it is still possible that something else goes wrong in your code and an error is thrown before you are done, thus leaving the cursor open.
If you still want to use this approach, use a try-catch to make sure you close the cursor if anything goes wrong before you consume all its documents.
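A minimal sketch of that defensive pattern, using try...finally so the cursor is released even if processing throws (the processing body is just a placeholder):

const cursor = db.snapshots.find().noCursorTimeout();

try {
    while (cursor.hasNext()) {
        const doc = cursor.next();
        // ... process doc ...
    }
} finally {
    // Always release the server-side cursor, even when an error is thrown.
    cursor.close();
}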
Note I don't consider this a bad solution, as even though it is considered a bad practice...:
It is a feature supported by the driver. If it were so bad, given that there are alternative ways to get around timeout issues, as explained in the other solutions, it wouldn't be supported.
There are ways to use it safely, it's just a matter of being extra cautious with it.
I assume you are not running this kind of query regularly, so the chances that you start leaving open cursors everywhere are low. If this is not the case, and you really need to deal with these situations all the time, then it does make sense not to use noCursorTimeout.
3. Retry when the cursor expires
Basically, you put your code in a try-catch and, when you get the error, you get a new cursor skipping the documents that you have already processed:
let processed = 0;
let updated = 0;

while (true) {
    const cursor = db.snapshots.find().sort({ _id: 1 }).skip(processed);

    try {
        while (cursor.hasNext()) {
            const doc = cursor.next();

            ++processed;

            if (doc.stream && doc.roundedDate && !doc.sid) {
                db.snapshots.update({
                    _id: doc._id
                }, { $set: {
                    sid: `${ doc.stream.valueOf() }-${ doc.roundedDate }`
                }});

                ++updated;
            }
        }

        break; // Done processing all, exit outer loop
    } catch (err) {
        if (err.code !== 43) {
            // Something else than a timeout went wrong. Abort loop.
            throw err;
        }
    }
}
Note you need to sort the results for this solution to work.
With this approach, you are minimizing the number of requests to the server by using the maximum possible batch size of 16 MB, without having to guess how many documents you will be able to process in 10 minutes beforehand. Therefore, it is also more robust than the previous approach.
4. Query the results in batches manually
Basically, you use skip(), limit() and sort() to do multiple queries with a number of documents you think you can process in 10 minutes.
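For completeness, a sketch of what that manual paging could look like (BATCH is an assumed page size, not a value from the original question):

const BATCH = 1000; // assumed number of documents you can process in 10 minutes
let skipped = 0;

while (true) {
    // Fetch the next page; sorting is required for skip() to be stable.
    const docs = db.snapshots.find().sort({ _id: 1 }).skip(skipped).limit(BATCH).toArray();

    if (docs.length === 0) break;

    docs.forEach(function (doc) {
        // ... process doc ...
    });

    skipped += docs.length;
}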
I consider this a bad solution because the driver already has the option to set the batch size, so there's no reason to do this manually; just use solution 1 and don't reinvent the wheel.
Also, it is worth mentioning that it has the same drawbacks as solution 1.
5. Get all the documents before the cursor expires
Probably your code is taking some time to execute due to results processing, so you could retrieve all the documents first and then process them:
const results = db.snapshots.find().toArray();
This will retrieve all the batches one after another and close the cursor. Then, you can loop through all the documents inside results and do what you need to do.
However, if you are having timeout issues, chances are that your result set is quite large, so pulling everything into memory may not be the most advisable thing to do.
Note about snapshot mode and duplicate documents
It is possible that some documents are returned multiple times if intervening write operations move them due to a growth in document size. To solve this, use cursor.snapshot(). From the docs:
Append the snapshot() method to a cursor to toggle the “snapshot” mode. This ensures that the query will not return a document multiple times, even if intervening write operations result in a move of the document due to the growth in document size.
However, keep in mind its limitations:
It doesn't work with sharded collections.
It doesn't work with sort() or hint(), so it will not work with solutions 3 and 4.
It doesn't guarantee isolation from insertions or deletions.
Note that with solution 5 the time window for a document move that may cause duplicate document retrieval is narrower than with the other solutions, so you may not need snapshot().
In your particular case, as the collection is called snapshot, it is probably not likely to change, so you probably don't need snapshot(). Moreover, you are doing updates on documents based on their data and, once the update is done, that same document will not be updated again even if it is retrieved multiple times, as the if condition will skip it.
Note about open cursors
To see a count of open cursors, use db.serverStatus().metrics.cursor.
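For example, in the mongo shell (the exact set of fields depends on the server version):

// Currently open cursors and cursors that have timed out since startup
db.serverStatus().metrics.cursor.open.total
db.serverStatus().metrics.cursor.timedOut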
Answered by Vladimir Ishenko
It's a bug in MongoDB server session management. A fix is currently in progress and should ship in 4.0+:
SERVER-34810: Session cache refresh can erroneously kill cursors that are still in use
(reproduced in MongoDB 3.6.5)
Adding collection.find().batchSize(20) helped me, with only a tiny reduction in performance.
Answered by SimonSimCity
I also ran into this problem, but for me it was caused by a bug in the MongoDB driver.
It happened in version 3.0.x of the npm package mongodb, which is e.g. used in Meteor 1.7.0.x, where I also recorded this issue. It's further described in this comment, and the thread contains a sample project which confirms the bug: https://github.com/meteor/meteor/issues/9944#issuecomment-420542042
Updating the npm package to 3.1.x fixed it for me, because I had already taken into account the good advice given here by @Danziger.
Answered by user1240792
When using the Java v3 driver, noCursorTimeout should be set in the FindOptions.
DBCollectionFindOptions options = new DBCollectionFindOptions()
    .maxTime(90, TimeUnit.MINUTES)
    .noCursorTimeout(true)
    .batchSize(batchSize)
    .projection(projectionQuery);

cursor = collection.find(filterQuery, options);


