Limits on the number of collections in a MongoDB database

Notice: this page is based on a translated copy of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/9858393/

Time: 2020-09-09 12:33:43  Source: igfitidea

limits of number of collections in databases

mongodb multi-tenant

Asked by Oleg

Can anyone say whether there are any practical limits on the number of collections in MongoDB? They write here: https://docs.mongodb.com/manual/core/data-model-operations/#large-number-of-collections

Generally, having a large number of collections has no significant performance penalty, and results in very good performance.

But for some reason MongoDB sets a limit of 24000 on the number of namespaces in the database. It looks like the limit can be increased, but I wonder why the default configuration has a limit at all if having many collections in the database doesn't cause any performance penalty?

Does this mean it is viable to have a practically unlimited number of collections in one database? For example, a multi-tenant application could keep each account's data in its own collection, giving hundreds of thousands of collections in the database. If having a very large number of collections, one per tenant, is a viable solution, what are its benefits compared with, for example, keeping the documents of every tenant in one collection? Thank you very much for your answers.
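
For illustration, the two layouts being compared can be sketched in plain JavaScript. This is only a sketch: the names `tenantCollectionName`, `tenantScopedFilter`, the `account_<id>_data` naming scheme, and the `tenantId` field are all made up for the example and do not come from MongoDB or from the question.

```javascript
// Layout A: one collection per tenant. The tenant id becomes part of the
// collection name, so the number of collections grows with the number of tenants.
// (Hypothetical naming scheme, chosen only for this example.)
function tenantCollectionName(tenantId) {
  return `account_${tenantId}_data`;
}

// Layout B: one shared collection. Every query is scoped by a tenantId field
// (hypothetical field name); tenantId is written last so a caller-supplied
// filter cannot override it.
function tenantScopedFilter(tenantId, filter = {}) {
  return { ...filter, tenantId };
}

console.log(tenantCollectionName(42));
console.log(tenantScopedFilter(42, { status: "open" }));
```

In the mongo shell, layout A would then read something like `db.getCollection(tenantCollectionName(42)).find({ status: "open" })`, while layout B would query a single collection with `tenantScopedFilter(42, { status: "open" })`.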

Accepted answer by Sammaye

This answer is late; however, the other answers seem a bit... weak in terms of reliability and factual information, so I will attempt to remedy that a little.

But for some reason MongoDB sets a limit of 24000 on the number of namespaces in the database,

That is merely the default setting. Yes, there is a default setting.

The limits page does say that 24000 is the limit (http://docs.mongodb.org/manual/reference/limits/#Number%20of%20Namespaces), as though there were no way to raise it, but there is.

However, there is a maximum limit on how big a namespace file can be (http://docs.mongodb.org/manual/reference/limits/#Size%20of%20Namespace%20File), which is 2GB. That gives you roughly 3 million namespaces to play with in most cases, which is quite impressive, and I doubt many people will hit that limit quickly.

You can raise the namespace file size above the 16MB default with the nssize parameter, either in the configuration file (http://docs.mongodb.org/manual/reference/configuration-options/#nssize) or on the command line used to start MongoDB (http://docs.mongodb.org/manual/reference/mongod/#cmdoption-mongod--nssize).

As far as I know, there is no real reason why MongoDB uses 16MB as the default nssize. I have never heard of a motto of "not bothering the user with every single detail", so I don't buy that one.

In my opinion, the main reason MongoDB hides this is that, even though the documentation states:

Distinct collections are very important for high-throughput batch processing.

using multiple collections as a means to scale vertically, rather than horizontally through a cluster as MongoDB is designed to, is (quite often) considered bad practice for large-scale websites. As such, 12K collections is normally considered a number that people will never, and should never, reach.

Answer by user3413723

No More Limits!

As other answers have stated, this is determined by the size of the namespace file. This was previously an issue because the file had a default size of 16MB and a maximum of 2GB. However, with the release of MongoDB 3.0 and the WiredTiger storage engine, it looks like this limit has been removed. WiredTiger seems to be better in almost every way, so I see little reason for anyone to use the old engine except for legacy support. From the site:

For the MMAPv1 storage engine, namespace files can be no larger than 2047 megabytes.

By default namespace files are 16 megabytes. You can configure the size using the nsSize option.

The WiredTiger storage engine is not subject to this limitation.

http://docs.mongodb.org/manual/reference/limits/
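
To check which case applies to a given server, one option is to look at the storage engine it reports. The sketch below assumes that `status` is the document returned by `db.serverStatus()` on MongoDB 3.0 or later, where `storageEngine.name` identifies the engine; the helper name `nsFileLimitApplies` is made up for this example, and treating a missing field as the old memory-mapped engine is an assumption about pre-3.0 servers.

```javascript
// Assumption: `status` has the shape of db.serverStatus() output on MongoDB 3.0+,
// e.g. { storageEngine: { name: "wiredTiger" } }.
function nsFileLimitApplies(status) {
  // Assumption: servers that do not report a storage engine predate 3.0
  // and use the memory-mapped engine, so default to "mmapv1".
  const engine =
    (status && status.storageEngine && status.storageEngine.name) || "mmapv1";
  return engine === "mmapv1";
}

console.log(nsFileLimitApplies({ storageEngine: { name: "wiredTiger" } })); // false
console.log(nsFileLimitApplies({ storageEngine: { name: "mmapv1" } }));     // true
```

In the mongo shell this would be called as `nsFileLimitApplies(db.serverStatus())`.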

Answer by Sid

A little background:

Every time mongo creates a database, it creates a namespace file (db.ns) for it. The namespace (or collections, as you might want to call it) file holds the metadata about the database's collections. By default the namespace file is 16MB in size, though you can increase the size manually. The metadata for each collection is 648 bytes plus some overhead bytes. Divide 16MB by that and you get approximately 24000 namespaces per database. You can start mongo with a larger namespace file, which will let you create more collections per database.
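
The arithmetic can be written out as a small sketch. It uses only the 648-byte figure quoted above and ignores the "overhead bytes", so it gives an upper bound rather than the exact count; the function name `estimateNamespaces` is made up for this example.

```javascript
// Per-namespace metadata size quoted above. Assumption: overhead is ignored,
// so the result is an upper bound on the real number of namespaces.
const ENTRY_BYTES = 648;

function estimateNamespaces(nsFileMB, entryBytes = ENTRY_BYTES) {
  const fileBytes = nsFileMB * 1024 * 1024;
  return Math.floor(fileBytes / entryBytes);
}

console.log(estimateNamespaces(16));   // default 16MB file  -> 25890
console.log(estimateNamespaces(2047)); // largest file size  -> 3312399
```

The 16MB result is somewhat above the documented figure of about 24000, which is consistent with the extra per-entry overhead, and the 2047MB result lines up with the "roughly 3 million" mentioned in the accepted answer.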

The idea behind any default configuration is to not bother the user with every single detail (and configurable knob) and choose one that generally works for most people. Also, viability does go hand in hand with best/good design practices. As Chris said, consider the shape of your data and decide accordingly.

Answer by Matt Connolly

As others mention, the default namespace file size is 16MB and you can get about 24000 namespace entries. Actually, my 64-bit instance on Ubuntu topped out at 23684 using the default 16MB namespace file.

One important thing that isn't mentioned in the FAQ is that indexes also use namespace slots.

You can count the namespace entries with:

db.system.namespaces.count()

And it's also interesting to actually take a look at what's in there:

db.system.namespaces.find()

Set your limit higher than what you think you need because once a database is created, the namespace file cannot be extended (as far as I understand - if there is a way, please tell me!!!).
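
Since the file cannot be grown later, it may help to estimate the size up front. The following is a rough sketch, assuming the 648 bytes per namespace entry mentioned in another answer, one namespace per collection plus one per index (including the default _id index), and an arbitrary safety factor; `requiredNsSizeMB` and `headroom` are names made up for this example.

```javascript
// Hypothetical helper: estimate the namespace file size (in MB) to request
// for a planned layout. Assumptions: 648 bytes per namespace entry, each
// collection and each index uses one namespace, and `headroom` is an
// arbitrary safety multiplier.
function requiredNsSizeMB(collections, indexesPerCollection, headroom = 2) {
  const namespaces = collections * (1 + indexesPerCollection);
  const bytes = namespaces * 648 * headroom;
  return Math.ceil(bytes / (1024 * 1024));
}

// 100,000 collections, each with the _id index plus one secondary index
console.log(requiredNsSizeMB(100000, 2)); // 371
```

A result above 2047 would mean the layout does not fit within the maximum namespace file size at all.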

Answer by Nicolas78

There seems to be a massive overhead for maintaining collections. I've just reduced a database which had around 1.5 million documents in 11000 collections to one with the same number of documents in around 300 collections; this has reduced the size of the database from 8GB to 1GB. I'm not familiar with the inner workings of MongoDB, so this may be obvious, but I thought it might be worth noting in this context.

Answer by Christopher WJ Rueber

Practically, I have never run across a maximum. But I've definitely never gone beyond the 24,000 collection limit. I'm pretty sure I've never hit more than 200, other than when I was performance testing the thing. I have to admit, I think it sounds like an awful lot of chaos to have that many collections in a single database, rather than grouping like data into their own collections.

Consider the shape of your data and business rules. If your data needs to be laid out such that you must have the data separated into different logical groupings for your multi-tenant app, then you probably should consider other data stores. Because while Mongo is great, the fact that they put a limit on the number of collections at all tells me that they know there is some theoretical limit where performance is affected.

Perhaps you should consider a store that would match the data shape? Riak, for example, has an unlimited number of 'buckets' (without theoretical maximum) that you can have in your application. One bucket per account is perfectly doable, but you sacrifice some queryability by going in that direction.

Otherwise, you may want to follow a more relational model of grouping like with like. In my view, Mongo feels like a half-way point between relational databases and key-value stores. That means it's easier to conceptualize if you are coming from a relational database world.
