DynamoDB vs MongoDB NoSQL

Notice: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/17931073/

Date: 2020-09-09 13:19:44  Source: igfitidea

mongodb amazon-web-services nosql amazon-dynamodb

Asked by Hyman.the.ripper

I'm trying to figure out what to use for a future project. We plan to store about 500k records per month in the first year, and maybe more in the following years. This is a vertical application, so there's no need to use a relational database for it; that's why I decided to choose a NoSQL data store.

The first option that came to mind was MongoDB, since it is a very mature product with a lot of community support. On the other hand, we have a brand new product (DynamoDB) that offers a managed service with top performance. I'll develop this application, but there's no maintenance plan (at least for now), so I think a managed service would be a huge advantage, since Amazon provides an elastic way to scale.

My major concern is the query structure. I haven't looked at DynamoDB's query capabilities yet, but since it is a key/value data store, I feel it could be more limited than MongoDB.

If anyone has experience moving a project from MongoDB to DynamoDB, any advice would be greatly appreciated.

Accepted answer by Mason Zhang

I recently migrated my MongoDB to DynamoDB and wrote three blog posts sharing my experience and some data on performance and cost.

Migrate from MongoDB to AWS DynamoDB + SimpleDB

7 Reasons You Should Use MongoDB over DynamoDB

3 Reasons You Should Use DynamoDB over MongoDB

Answer by CargoMeister

I know this is old, but it still comes up when you search for the comparison. We were using Mongo, have moved almost entirely to Dynamo, which is our first choice now. Not because it has more features, it doesn't. Mongo has a better query language, you can index within a structure, there's lots of little things. The superiority of Dynamo is in what the OP stated in his comment: it's easy. You don't have to take care of any servers. When you start to set up a Mongo sharded solution, it gets complicated. You can go to one of the hosting companies, but that's not cheap either. With Dynamo, if you need more throughput, you just click a button. You can write scripts to scale automatically. When it's time to upgrade Dynamo, it's done for you. That is all a lot of precious stress and time not spent. If you don't have dedicated ops people, Dynamo is excellent.

So we now go with Dynamo by default. Mongo maybe, if the data structure is complicated enough to warrant it, but then we'd probably go back to a SQL database. Dynamo is obtuse: you really need to think about how you're going to build it, and you'll likely use Redis in ElastiCache to make it work for complex stuff. But it sure is nice not to have to take care of it. You code. That's it.

Answer by Derick

With 500k documents, there is no reason to scale whatsoever. A typical laptop with an SSD and 8 GB of RAM can easily handle tens of millions of records, so if you are choosing on the basis of scaling, your choice doesn't really matter. I would suggest you pick what you like the most, and perhaps whatever has the most online support.
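
To put the question's figure in perspective, here is a rough back-of-the-envelope sketch. It assumes a 30-day month and writes spread evenly over time; neither assumption comes from the question, so treat the result as an order of magnitude only.

```python
RECORDS_PER_MONTH = 500_000  # figure stated in the question

# Yearly volume if the monthly rate stays flat.
records_per_year = RECORDS_PER_MONTH * 12

# Average write rate, assuming a 30-day month and evenly spread writes.
seconds_per_month = 30 * 24 * 60 * 60
avg_writes_per_second = RECORDS_PER_MONTH / seconds_per_month

print(records_per_year)                 # 6000000
print(round(avg_writes_per_second, 2))  # 0.19
```

At roughly one write every five seconds on average, raw volume is unlikely to be the deciding factor between the two databases.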

Answer by AnneTheAgile

For quick overview comparisons, I really like this website, which has many comparison pages, e.g. AWS DynamoDB vs MongoDB: http://db-engines.com/en/system/Amazon+DynamoDB%3BMongoDB

Answer by Deemoe

Short answer: Start with SQL and add NoSQL only when/if needed. (unless you don't need anything beyond very simple queries)

My personal experience: I haven't used MongoDB for queries, but as of April 2015 DynamoDB is still very crippled when it comes to anything beyond the most basic key/value queries. I love it for the basic stuff, but if you want a query language then look to a real SQL database solution.

In DynamoDB you can query on a hash key or on a hash and range key, and you can have multiple global secondary indexes. I'm doing queries on a single table with 4 possible filter parameters and sorting the results; this is supported (barely) through global secondary indexes with filter expressions. The problem comes when you try to get the total results matching the filter: you can't just ask for the first 10 items matching the filter. Instead, DynamoDB checks 10 items and you may get 0 valid results, forcing you to keep re-querying from the continuation key (LastEvaluatedKey in the API). That is a pain in the neck and consumes too much of your table's read quota for a simple scenario.

To be specific about the limit problem with filters in the query, this is from the docs (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html#ScanQueryLimit):

In a response, DynamoDB returns all the matching results within
the scope of the Limit value. For example, if you issue a Query 
or a Scan request with a Limit value of 6 and without a filter
expression, the operation returns the first six items in the 
table that match the request parameters. If you also supply a
FilterExpression, the operation returns the items within the 
first six items in the table that match the filter requirements.

My conclusion is that queries involving FilterExpressions are only usable on very rare occasions and are not scalable, because each query can easily read most or all of your table, which consumes far too many DynamoDB read units. Once you use too many read units you'll get throttled and see poor performance.
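
The cost of that behaviour can be illustrated with a small in-memory simulation. This is a sketch, not DynamoDB client code: `simulated_query` and `first_n_matching` are hypothetical helpers, the table is a plain Python list, and an integer position stands in for the real LastEvaluatedKey (which in the actual API is a key map). Only the response field names (Items, ScannedCount, LastEvaluatedKey) follow the real API.

```python
from typing import Callable, Optional


def simulated_query(table: list, limit: int,
                    filter_fn: Callable[[dict], bool],
                    exclusive_start_key: Optional[int] = None) -> dict:
    """Mimic the documented order: read up to `limit` items, THEN filter."""
    start = 0 if exclusive_start_key is None else exclusive_start_key + 1
    page = table[start:start + limit]
    response = {
        "Items": [item for item in page if filter_fn(item)],
        "ScannedCount": len(page),  # items read, which is what you pay for
    }
    last_index = start + len(page) - 1
    if page and last_index < len(table) - 1:
        # Simplification: an int position instead of a real key map.
        response["LastEvaluatedKey"] = last_index
    return response


def first_n_matching(table, n, filter_fn, page_limit=10):
    """Keep paging until n matches are collected or the table is exhausted."""
    matches, scanned, start_key = [], 0, None
    while len(matches) < n:
        resp = simulated_query(table, page_limit, filter_fn, start_key)
        matches.extend(resp["Items"])
        scanned += resp["ScannedCount"]
        start_key = resp.get("LastEvaluatedKey")
        if start_key is None:
            break
    return matches[:n], scanned


# 1000 items, of which only every 25th passes the filter.
table = [{"id": i, "status": "active" if i % 25 == 0 else "archived"}
         for i in range(1000)]
items, scanned = first_n_matching(table, 10, lambda it: it["status"] == "active")
print(len(items), scanned)  # 10 230
```

In this made-up data set, collecting 10 matching items takes 23 round trips and reads 230 items, which is the effect described above: the sparser the matches, the more of the table each "simple" query ends up reading.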

Expert opinion: At the AWS Summit on Apr 9, 2015, Brett Hollman, Manager, Solutions Architecture, AWS, in his talk on scaling to your first 10 million users, advocated starting with a SQL database and then using NoSQL only when and if it makes sense, because sooner or later you'll probably need a SQL server somewhere in your stack. His slides are here: http://www.slideshare.net/AmazonWebServices/deep-dive-scaling-up-to-your-first-10-million-users (see slide 28).

Answer by Steffan Perry

We chose a combination of Mongo/Dynamo for a healthcare product. Basically Mongo allows better searching, but the hosted Dynamo is great because it's HIPAA compliant without any extra work. So we host the Mongo portion, with no personal data, on a standard setup and let Amazon deal with the HIPAA portion in terms of infrastructure. We can query certain items from Mongo, which bring up documents with pointers (IDs) to the related Dynamo documents.

We chose to do this using Mongo, instead of hosting the entire application on Dynamo, for two reasons. First, we needed to perform location-based searches, which Mongo is great at and which Dynamo, at the time, was not, although it does have an option now.

Second, some documents were unstructured and we did not know ahead of time what the data would be. For example, let's say user A inputs a document in the "form" collection like this: {"username": "user1", "email": "[email protected]"}. And another user puts this in the same collection: {"phone": "813-555-3333", "location": [28.1234,-83.2342]}. With Mongo we can search any of these dynamic and unknown fields at any time. With Dynamo you could do this, but you would have to create an index every time a new field was added that you wanted to be searchable. So if you have never had a phone field in your Dynamo document before and then, all of a sudden, someone adds it, it's completely unsearchable.
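
The difference can be sketched with a toy in-memory model. Nothing here is MongoDB or DynamoDB client code: `adhoc_find` and `IndexedTable` are hypothetical stand-ins, the first for an ad hoc query on any field, the second for a store where only attributes with a declared index can be looked up efficiently. Indexed values are assumed to be hashable (strings here).

```python
forms = [
    {"id": 1, "username": "user1"},
    {"id": 2, "phone": "813-555-3333", "location": [28.1234, -83.2342]},
]


def adhoc_find(docs, field, value):
    """Ad hoc lookup on any field, known in advance or not."""
    return [doc for doc in docs if doc.get(field) == value]


class IndexedTable:
    """Toy table: lookups work only on attributes that have an index."""

    def __init__(self, docs, indexed_fields=()):
        self.docs = list(docs)
        self.indexes = {}
        for field in indexed_fields:
            self.add_index(field)

    def add_index(self, field):
        # Build a value -> documents map for one attribute.
        index = {}
        for doc in self.docs:
            if field in doc:
                index.setdefault(doc[field], []).append(doc)
        self.indexes[field] = index

    def lookup(self, field, value):
        if field not in self.indexes:
            raise LookupError(f"no index on {field!r}; create one first")
        return self.indexes[field].get(value, [])


table = IndexedTable(forms, indexed_fields=["username"])

# The ad hoc search finds the new "phone" field without any preparation.
print(adhoc_find(forms, "phone", "813-555-3333")[0]["id"])

# The indexed table cannot serve the same lookup until an index exists.
try:
    table.lookup("phone", "813-555-3333")
except LookupError as err:
    print("lookup failed:", err)

table.add_index("phone")
print(table.lookup("phone", "813-555-3333")[0]["id"])
```

In a real DynamoDB table the fallback without an index would be a full Scan with a filter rather than an error, which works but reads the whole table.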

This brings up another point that you mentioned. Choosing the right solution for the job does not always mean choosing the best product for the job. For example, you may have a client who needs and will use the system you created for 10+ years. Going with a SaaS/IaaS solution that is good enough to get the job done may be a better option, as you can rely on Amazon to keep their systems maintained over the long haul.

Answer by Rahul Kumar

I have worked with both and am something of a fan of both.

But you need to understand when to use what and for what purpose.

I don't think it's a great idea to move your whole database to DynamoDB: querying is difficult except on primary and secondary keys, indexing is limited, and scanning in DynamoDB is painful.

I would go for a hybrid setup, where data that needs extensive querying lives in MongoDB. With all its features, you would never feel constrained when providing enhancements or modifications.

DynamoDB is lightning fast (faster than MongoDB), so it is often used as a session store in scalable applications. DynamoDB best practices also suggest that if a lot of your data is rarely used, you should move it to a separate table.

So suppose you have articles or feeds. People are more likely to look for last week's stuff or this month's stuff; it is really rare for people to visit two-year-old data. For these purposes, the DynamoDB recommendation is to store data in different tables by month or year.
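
A minimal sketch of how an application might route reads and writes under such a per-month layout. The `articles_YYYY_MM` naming scheme and both helper functions are assumptions made for illustration; they are not a DynamoDB feature or a naming rule from its documentation.

```python
from datetime import date


def table_for(day: date, prefix: str = "articles") -> str:
    """Name of the (hypothetical) monthly table holding items for `day`."""
    return f"{prefix}_{day.year:04d}_{day.month:02d}"


def tables_between(start: date, end: date, prefix: str = "articles") -> list:
    """All monthly tables a query over [start, end] would need to touch."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}_{year:04d}_{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


print(table_for(date(2015, 4, 9)))
# articles_2015_04
print(tables_between(date(2014, 11, 20), date(2015, 2, 3)))
# ['articles_2014_11', 'articles_2014_12', 'articles_2015_01', 'articles_2015_02']
```

With a layout like this, recent tables can be given more throughput while old ones are scaled down, at the price of the application having to fan out queries that span several months.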

DynamoDB is seamlessly scalable, something you have to handle manually in MongoDB. However, you will lose out on DynamoDB's performance if you don't understand how throughput is partitioned and how scaling works behind the scenes.

DynamoDB should be used where speed is critical. MongoDB, on the other hand, has a great many tools and features, something DynamoDB lacks.

For example, you can configure a MongoDB replica set so that one of the replicas holds a copy of the data that is 8 (or however many) hours old. This is really useful if you mess something up big time in your DB and want to get the data back as it was before.
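
A sketch of what the configuration entry for such a delayed member could look like, built as a plain Python dict. The host name and the `delayed_member` helper are hypothetical. The delay field is, to the best of my knowledge, `secondaryDelaySecs` in MongoDB 5.0 and later and `slaveDelay` in earlier releases, so check the replica set configuration reference for your version before relying on it.

```python
def delayed_member(member_id: int, host: str, delay_hours: float,
                   legacy_field: bool = False) -> dict:
    """Build a replica set member entry for a delayed, hidden secondary."""
    # Assumption: field name depends on the server version (see lead-in).
    delay_key = "slaveDelay" if legacy_field else "secondaryDelaySecs"
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,   # a delayed member must never become primary
        "hidden": True,  # keep application reads away from stale data
        delay_key: int(delay_hours * 3600),
    }


# Hypothetical host; 8 hours matches the example in the text.
member = delayed_member(3, "mongo-delayed.example.net:27017", delay_hours=8)
print(member["secondaryDelaySecs"])  # 28800
```

The resulting dict would be added to the `members` array of the replica set configuration and applied with a reconfiguration, which is not shown here.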

That's my opinion though.

Answer by AndrewSouthpaw

Bear in mind, I've only experimented with MongoDB...

From what I've read, DynamoDB has come a long way in terms of features. It used to be a super-basic key-value store with extremely limited storage and querying capabilities. It has since grown, now supporting bigger document sizes + JSON support and global secondary indices. The gap between what DynamoDB and MongoDB offer in terms of features grows smaller with every month. The new features of DynamoDB are expanded on here.

Many of the MongoDB vs. DynamoDB comparisons are out of date due to the recent addition of DynamoDB features. However, this post offers some other convincing points for choosing DynamoDB, namely that it's simple, low maintenance, and often low cost. Another discussion here of database choices was interesting to read, though slightly old.

My takeaway: if you're doing serious database queries or working in languages not supported by DynamoDB, use MongoDB. Otherwise, stick with DynamoDB.
