Why Is MongoDB So Fast

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/5186707/

Time: 2020-09-09 12:00:15  Source: igfitidea

mongodb

Asked by Justin

I was showing my co-worker performance benchmarks of MongoDB vs SQL 2008, and while he believes MongoDB is faster, he doesn't understand how it's possible. His logic was that SQL has been around for decades and has some of the smartest people working on it, so how can MongoDB, a relatively new kid on the block, be so superior in performance? I wasn't able to provide a solid, technical answer, and I was hoping you guys could assist.

Accepted answer by Yichaoz

MongoDB isn't like a traditional relational database. It's NoSQL, or document-based; it provides weak consistency guarantees, and it doesn't have to guarantee consistency the way SQL does.

Answer by Will

MongoDB is fast because it's web scale!

It's a fun video and well worth everyone watching, but it does answer your question: most of the NoSQL engines like MongoDB are not robust and not resilient to crashes and other outages. This security is what they sacrifice to gain speed.

Answer by Martin Beckett

SQL has to do quite a lot; Mongo just has to drop bits onto disk (almost).

Answer by Mike M.

As has been mentioned, MongoDB wasn't created to be, and shouldn't be used as, a SQL database. SQL (and other relational databases) store relational data; that is, data in table X can be set up to have direct relations to information in table Y. MongoDB doesn't have this ability, and can therefore drop a lot of overhead. That is why MongoDB is usually used to store lists, not relations.

Add in the fact that it isn't quite ACID compliant yet (though it has taken large strides since it was first introduced), and that's the bulk of the speed difference.

Here are the differences, as outlined on the actual site, between a full transactional model and their model.

In practice, the non-transactional model of MongoDB has the following implications:

  • No rollbacks. Your code must function without rollbacks. Check all programmatic conditions before performing the first database write operation. Order your write operations such that the most important operation occurs last.
  • Explicit locking. Your code may explicitly lock objects when performing operations. Thus, the application programmer has the capability to ensure "serializability" when required. Locking functionality will be available in late alpha / early beta release of MongoDB.
  • Database check on startup. Should the database terminate abnormally (rare), a database check procedure will automatically run on startup (similar to fschk).

Answer by maxdec

While the other answers are interesting, I would add that one of the reasons MongoDB is "so fast", at least in benchmarks, is the write concern.

You can read more about the different write concerns in the MongoDB documentation, but basically you can define the level of "security" you want when writing data.

The default level used to be unacknowledged, which means the write operation is just triggered but the driver does not check whether it was performed successfully. It is faster, but way less reliable.

They changed it about one year ago to acknowledged. But I guess most of the benchmarks out there still use the "unacknowledged" mode for better results.
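
The difference between the two modes can be illustrated with a toy model. This is only a sketch: `ToyServer` and `ToyDriver` are hypothetical stand-ins, not MongoDB or its driver API, and the model assumes the only difference between the modes is whether the caller waits for the server's reply.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyServer:
    """Hypothetical stand-in for a database server (not MongoDB itself)."""

    def __init__(self):
        self.docs = {}

    def insert(self, doc):
        # Reject duplicate primary keys, as a real server would.
        if doc["_id"] in self.docs:
            raise ValueError(f"duplicate _id {doc['_id']!r}")
        self.docs[doc["_id"]] = doc


class ToyDriver:
    """Hypothetical driver that sends writes to the server on a worker thread."""

    def __init__(self, server):
        self.server = server
        self.pool = ThreadPoolExecutor(max_workers=1)

    def insert(self, doc, acknowledged):
        future = self.pool.submit(self.server.insert, doc)
        if acknowledged:
            # Wait for the server's reply; a failure is raised to the caller.
            future.result()
        # Unacknowledged: return immediately; any failure goes unnoticed.

    def close(self):
        self.pool.shutdown(wait=True)


server = ToyServer()
driver = ToyDriver(server)
driver.insert({"_id": 1, "title": "first"}, acknowledged=True)
driver.insert({"_id": 1, "title": "lost"}, acknowledged=False)  # fails silently
try:
    driver.insert({"_id": 1, "title": "again"}, acknowledged=True)
    error_seen = False
except ValueError:
    error_seen = True
driver.close()
print(error_seen, server.docs[1]["title"])
```

The unacknowledged call returns without waiting, which is where the speed comes from, but the duplicate-key failure is never reported to the caller. The acknowledged call pays for the round trip and gets the error.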

If you want to see the difference in terms of performance, you can check this article (a bit old, but it still gives an idea).

Answer by Amit Tripathi

MongoDB is fast because:

  1. It is not ACID, and availability is given preference over consistency.
  2. Asynchronous insert and update: this means MongoDB doesn't write data to the database as soon as the insert query is processed. The same is true for updates.
  3. No join overhead: when they say MongoDB is a document database, they mean a database whose data is self-sufficient, with all the information embedded like a real document.

Answer by Eugene Bosikov

MongoDB is faster because: 1. No transactions; 2. No relations between tables.

If you try to apply exactly the same logic on SQL Server, for example: 1. Do not use Select with locks; 2. No relations between tables; then there will not be such a big gap in speed between SQL Server and MongoDB. Only one area will definitely be faster: writing and updating records, because SQL Server performs inserts and updates in a queue and within a transaction, whereas on MongoDB they happen asynchronously.

In my projects I could not see any big difference in speed between SQL Server and MongoDB, because the business logic was very similar between the two projects. The real speed gain with MongoDB comes on analytical projects with big data, or on big content management engines such as newspapers, online stores, etc. Again, no optimization on MongoDB and good optimization on SQL Server can make these databases almost equal.

Answer by cody.tv.weber

I will also add that another difference, less about speed and more about conceptualization (although I believe it might help with speed because there is less room for join issues), is that document-based storage is very similar to an object-oriented mindset.

Document-based storage might not be perfectly ACID, but I believe that with MongoDB it is easier to get what you want by just fetching the whole document rather than messing with all the joins of a SQL DB and risking some bad joins as well.

Apologies to any SQL die-hard fans.

Answer by Marc B

Mongo's not ACID compliant, so it doesn't have to deal with nearly as much "cruft" to make sure that what you try to put into the DB can come back out again later.

If you don't mind losing some functionality and possibly losing data in exchange for speed, then Mongo's good. If you absolutely need to guarantee data integrity and/or have complex join requirements, then avoid Mongo-type systems like the plague.

Answer by Lord

According to MongoDB's website, MongoDB is a document database with the scalability and flexibility that you want and with querying and indexing that you need.

Let's try to understand what this actually means. As we know, MongoDB is a document-based database, so it stores data in documents, which are field-value paired data structures like JSON. So again, it stores data in these documents instead of rows in a table as in traditional relational databases. It's therefore a NoSQL database and not a relational one.

Also, MongoDB has built-in scalability, making it very easy to distribute data across multiple machines as your apps get more and more users and start generating a ton of data. So whatever you do, MongoDB will make it very easy for you to grow.

Another big feature of MongoDB is its great flexibility. There is no need to define a document data schema before filling it with data, meaning that each document can have a different number and type of fields. And we can also change these fields all the time. All this is really in line with some real-world business situations, therefore it can become pretty useful.

MongoDB is also a very performant database system, thanks to features like embedded data models, indexing, sharding, flexible documents, native replication, and much more. And it is a free and open-source database, published under the SSPL license.

In summary, we can say that MongoDB is a great database system for building many types of modern, scalable, and flexible web applications. In fact, Mongo is probably the most used database with Node.js.

Now let's look a bit deeper at these documents using a blog post example. Here is how that exact same data could look as a row in a relational database like MySQL, or even in an Excel spreadsheet.

[Image: blog post example; not available]

MongoDB uses a data format similar to JSON for data storage, called BSON. It looks basically the same as JSON, but it is typed, meaning that all values have a data type such as String, Boolean, Date, Double, Object, and more. So all MongoDB documents are actually typed, which is different from JSON.
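
A small way to see why typing matters, using only Python's standard `json` module. BSON itself is not used here, so this is an analogy rather than MongoDB's actual encoding, and the `post` document is made up for illustration: plain JSON has no date type, so a date has to be stored as text and comes back as text.

```python
import json
from datetime import datetime, timezone

# Hypothetical blog post with a real date value.
post = {"title": "Hello", "published": datetime(2020, 9, 9, tzinfo=timezone.utc)}

# JSON has no date type, so the standard encoder refuses a datetime value.
try:
    json.dumps(post)
    encodable = True
except TypeError:
    encodable = False

# The usual workaround is to store the date as text ...
as_json = json.dumps({**post, "published": post["published"].isoformat()})

# ... but the type information is lost on the way back: it is now a str.
decoded = json.loads(as_json)
print(encodable, type(decoded["published"]).__name__)
```

A typed format can carry the date as a date, so the reader does not have to guess which strings were originally timestamps.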

Now, just like JSON, these BSON documents also have fields, and data is stored in key-value pairs. In a relational database, on the other hand, each field is called a column, and the database arranges data in table structures, while our JSON data is much more flexible.

Take for example the tags field in the above picture, where we actually have an array, so we basically have multiple values for one field. In relational databases that's not really allowed: we cannot have multiple values in one field. So we would have to find a workaround for this in a relational database, which could involve more work and more overall complication.
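
One common workaround is a separate table with one row per tag. The sketch below uses Python's built-in `sqlite3` as the relational side; the table names, column names, and sample post are assumptions made for illustration, not taken from the original picture.

```python
import sqlite3

# Document style: the tags are simply an array inside the post.
post_document = {
    "_id": 1,
    "title": "Why is MongoDB fast?",
    "tags": ["mongodb", "performance"],
}

# Relational style: one value per field, so the tags move to their own table.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE post_tags (
        post_id INTEGER NOT NULL REFERENCES posts(id),
        tag TEXT NOT NULL,
        PRIMARY KEY (post_id, tag)
    );
    """
)
db.execute(
    "INSERT INTO posts (id, title) VALUES (?, ?)",
    (post_document["_id"], post_document["title"]),
)
db.executemany(
    "INSERT INTO post_tags (post_id, tag) VALUES (?, ?)",
    [(post_document["_id"], tag) for tag in post_document["tags"]],
)

# Reading the tags back takes a second query (or a join).
rows = db.execute(
    "SELECT tag FROM post_tags WHERE post_id = ? ORDER BY tag", (1,)
).fetchall()
relational_tags = [tag for (tag,) in rows]
print(relational_tags)
db.close()
```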

Another extremely important feature in MongoDB is the concept of embedded documents, which is something not present in relational databases. In our comments field here we have an array that contains three objects, one for each comment. Imagine we had a comments collection containing a bunch of comment documents. Each of them could look exactly like this, with an author and the comment text. Instead of doing that, we include these comments right in the blog post document. In other words, we embed the comment documents in the post document. This is the process of embedding, or de-normalizing, which basically means including related data in one single document.

In the above example the comments are related to the post, and so they are included in the same document. This makes a database more performant in some situations, because it can be easier to read all the data we need at once.

Now the opposite of embedding or de-normalizing is normalizing, and that's how data is usually modeled in a relational database. In the example above, the data can't be embedded in a relational system; the solution is to create a whole new table for the comments and then join the tables through an ID field that links each comment to its post.
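
The two models can be compared side by side. This is a minimal sketch with assumed names and data: a plain Python dict stands in for the embedded document, and the built-in `sqlite3` module plays the relational database.

```python
import sqlite3

# Embedded (de-normalized): everything about the post lives in one document.
post_document = {
    "_id": 1,
    "title": "Why is MongoDB fast?",
    "comments": [
        {"author": "ann", "text": "Nice post"},
        {"author": "bob", "text": "Thanks"},
    ],
}
embedded_comments = post_document["comments"]  # one lookup, no join

# Normalized: comments live in their own table and point back at the post.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),
        author TEXT NOT NULL,
        text TEXT NOT NULL
    );
    """
)
db.execute("INSERT INTO posts (id, title) VALUES (1, 'Why is MongoDB fast?')")
db.executemany(
    "INSERT INTO comments (post_id, author, text) VALUES (1, ?, ?)",
    [(c["author"], c["text"]) for c in embedded_comments],
)

# Getting the same data back requires joining the two tables.
rows = db.execute(
    """
    SELECT c.author, c.text
    FROM posts AS p
    JOIN comments AS c ON c.post_id = p.id
    WHERE p.id = ?
    ORDER BY c.id
    """,
    (1,),
).fetchall()
joined_comments = [{"author": author, "text": text} for author, text in rows]
print(joined_comments == embedded_comments)
db.close()
```

Both sides end up with the same comments; the difference is that the embedded form reads them in a single fetch, while the normalized form has to combine rows from two tables.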

Two things about BSON documents you need to know:

First, the maximum size for each document is currently 16 MB.

Second, each document contains a unique ID, which acts as the primary key of that document. It is generated automatically with the ObjectId data type each time a new document is created, so you don't have to worry about it.
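
For a sense of what such an ID contains: an ObjectId is 12 bytes, made of a 4-byte creation timestamp, a 5-byte random value, and a 3-byte counter. The sketch below imitates that layout with the standard library only; `toy_object_id` and `creation_time` are hypothetical helpers for illustration, not the driver's real generator.

```python
import itertools
import os
import struct
import time

_PROCESS_RANDOM = os.urandom(5)  # 5 random bytes, chosen once per process
_COUNTER = itertools.count(int.from_bytes(os.urandom(3), "big"))  # random start


def toy_object_id(now=None):
    """Return 24 hex characters laid out like an ObjectId (illustration only)."""
    seconds = int(time.time() if now is None else now)
    counter = next(_COUNTER) % (1 << 24)  # keep the counter within 3 bytes
    raw = struct.pack(">I", seconds) + _PROCESS_RANDOM + counter.to_bytes(3, "big")
    return raw.hex()


def creation_time(object_id_hex):
    """Recover the creation timestamp stored in the first 4 bytes."""
    return struct.unpack(">I", bytes.fromhex(object_id_hex)[:4])[0]


first = toy_object_id(now=1599652815)
second = toy_object_id(now=1599652815)
print(len(first), first != second, creation_time(first))
```

Because the timestamp comes first, IDs generated later sort after earlier ones, and the counter keeps two IDs created in the same second from colliding.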
