MongoDB sort order on _id

Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/12098815/

Time: 2020-09-09 12:48:08  Source: igfitidea

mongodb

Asked by Sam

I wonder how MongoDB compares the "_id" field when running a query like the following:

db.data.find({"_id":{$gt:ObjectId("502aa46c0674d23e3cee6152")}}).sort({"_id":1}).limit(10);

Is it purely based on timestamp portion of the id?

Answered by Adam Comerford

To expand slightly on what Andre said:

Since the ObjectID timestamp has only one-second resolution, two (or more) ObjectIDs could easily be created with the same value for the timestamp (the first 4 bytes). If these were created on the same machine (machine ID, the next 3 bytes) by the same process (PID, the next 2 bytes), then the only thing to differentiate them would be the "inc" field, the last 3 bytes.
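
As a rough illustration of the layout described above, the sketch below splits a 24-character hex ObjectID into those four fields. The function name `splitLegacyObjectId` and the field names are made up for this example. It assumes the legacy layout from this answer (4-byte timestamp, 3-byte machine ID, 2-byte PID, 3-byte counter), so it would mislabel the middle bytes of IDs generated under a newer spec.

```javascript
// Hypothetical helper: split an ObjectID hex string using the legacy layout.
function splitLegacyObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  return {
    timestamp: parseInt(hex.slice(0, 8), 16), // bytes 0-3: seconds since the Unix epoch
    machine: hex.slice(8, 14),                // bytes 4-6: machine ID (kept as hex)
    pid: hex.slice(14, 18),                   // bytes 7-8: process ID (kept as hex)
    inc: parseInt(hex.slice(18, 24), 16),     // bytes 9-11: "inc" counter
  };
}

// The ObjectID from the question.
const parts = splitLegacyObjectId("502aa46c0674d23e3cee6152");
console.log(parts);
// The timestamp is in seconds, so convert to milliseconds for Date.
console.log(new Date(parts.timestamp * 1000).toISOString());
```

Two IDs created in the same second by the same process would share every field here except `inc`.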

Update: Jan 2020

This answer continues to be popular so it is worth updating a little. The ObjectID spec has evolved since this answer was written 8 years ago and the 5 bytes after the timestamp are now simply random, which will greatly decrease the likelihood of any collisions. The last three bytes are still incremental, but initialised at a random value to start, again making collisions less likely. The ObjectID now contains less context (you can't easily tell where it was generated and by what process) but I would guess that the information was not being used in any meaningful way and has been deprecated in favor of better randomisation of the ID.
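
To make the updated layout concrete, here is a minimal sketch of a generator that follows the description in this update: a 4-byte timestamp, 5 random bytes, and a 3-byte counter that starts at a random value. The name `makeObjectIdGenerator` is hypothetical, and the assumption that the 5 random bytes are picked once per generator (rather than per ID) is mine. This is not how any particular driver is implemented.

```javascript
const crypto = require("crypto");

// Hypothetical generator following the layout described in the update.
function makeObjectIdGenerator() {
  // Assumption: the 5 random bytes are chosen once and reused for every ID.
  const random5 = crypto.randomBytes(5);
  // The counter starts at a random 3-byte value.
  let counter = crypto.randomBytes(3).readUIntBE(0, 3);

  return function nextId(nowMs = Date.now()) {
    const buf = Buffer.alloc(12);
    buf.writeUInt32BE(Math.floor(nowMs / 1000), 0); // bytes 0-3: seconds
    random5.copy(buf, 4);                           // bytes 4-8: random value
    buf.writeUIntBE(counter, 9, 3);                 // bytes 9-11: counter
    counter = (counter + 1) % 0x1000000;            // wrap around at 3 bytes
    return buf.toString("hex");
  };
}

// Two IDs generated within the same second differ only in the counter bytes.
const nextId = makeObjectIdGenerator();
const first = nextId(1344971884000);
const second = nextId(1344971884000);
console.log(first, second);
```

Because the counter still increments, IDs from one generator within one second keep their creation order (except at wrap-around), while IDs from different generators in that second are ordered by their random bytes.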

End Update

See here for the full spec:

https://docs.mongodb.com/manual/reference/method/ObjectId/#ObjectIDs-BSONObjectIDSpecification

That "inc" field is either an ever incrementing field (then you can reasonably expect the sort to be in the insert/create order) or a random value (then likely unique, but not ordered), assuming the spec is implemented correctly of course. Note that the ObjectIDs may be generated by the driver, or the application (or indeed manually) rather than by MongoDB itself, so unless you have full control over how they are generated, then any or all of the above may apply.

Answered by Andre de Frere

In a way you are correct: if you sort by the _id, you will sort by the insertion time. This does not mean that the comparison is done only on the timestamp portion. ObjectIDs are a BSON type in their own right, so they can be compared directly with each other. As they start with a timestamp, it follows logically that those created in the past will be less than those created in the future.
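
The idea can be sketched outside the database by comparing all 12 bytes from left to right, which I believe matches how the server orders ObjectIDs, though that is an assumption here rather than something stated in this answer. The function name `compareObjectIds` and the second and third IDs below are invented for illustration; only the first ID comes from the question.

```javascript
// Hypothetical comparator: byte-wise comparison of two ObjectID hex strings.
// Buffer.compare returns -1, 0, or 1.
function compareObjectIds(a, b) {
  return Buffer.compare(Buffer.from(a, "hex"), Buffer.from(b, "hex"));
}

const fromQuestion = "502aa46c0674d23e3cee6152";
// Made-up ID: one second later, but with a smaller counter.
const oneSecondLater = "502aa46d0674d23e3cee6101";
// Made-up ID: same second, counter one higher.
const sameSecond = "502aa46c0674d23e3cee6153";

// The timestamp bytes come first, so they decide the order when they differ.
console.log(compareObjectIds(fromQuestion, oneSecondLater));
// With equal timestamps, the remaining bytes break the tie.
console.log(compareObjectIds(fromQuestion, sameSecond));
```

So the timestamp dominates the ordering, but the trailing bytes still take part in the comparison whenever the timestamps are equal.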

You can find more detail in the documentation

Answered by andreyro

Copied from the MongoDB documentation: https://docs.mongodb.com/manual/reference/bson-types/#objectid

The relationship between the order of ObjectId values and generation time is not strict within a single second. If multiple systems, or multiple processes or threads on a single system generate values, within a single second; ObjectId values do not represent a strict insertion order. Clock skew between clients can also result in non-strict ordering even for values, because client drivers generate ObjectId values, not the mongod process.
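
A small simulation may help show what the quoted passage means in practice. The IDs below are hand-built, not generated by a real driver: `buildId` is a hypothetical helper, the two 5-byte "client" values are arbitrary, and the fourth ID pretends to come from a client whose clock runs one second behind.

```javascript
// Hypothetical helper: assemble a 24-character hex ID from its three parts.
function buildId(seconds, clientHex, counter) {
  return (
    seconds.toString(16).padStart(8, "0") + // 4-byte timestamp
    clientHex +                             // 5 bytes identifying the client
    counter.toString(16).padStart(6, "0")   // 3-byte counter
  );
}

const second = 0x502aa46c; // timestamp taken from the ID in the question
const clientA = "0000000000"; // arbitrary values standing in for two clients
const clientB = "ffffffffff";

// createdOrder records the real creation order across all clients.
const created = [
  { createdOrder: 1, id: buildId(second, clientB, 1) },
  { createdOrder: 2, id: buildId(second, clientA, 2) },
  { createdOrder: 3, id: buildId(second, clientB, 2) },
  { createdOrder: 4, id: buildId(second - 1, clientA, 3) }, // skewed clock
];

// Equal-length lowercase hex strings sort the same way as their bytes.
const sorted = [...created].sort((x, y) => (x.id < y.id ? -1 : x.id > y.id ? 1 : 0));
console.log(sorted.map((entry) => entry.createdOrder));
```

In this made-up data set, sorting by ID puts the clock-skewed ID first and reorders the IDs from different clients within the same second, while IDs from a single client keep their relative order.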
