Original source: http://stackoverflow.com/questions/1307100/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
When to use CouchDB vs RDBMS
Asked by Andrew Whitehouse
I am looking at CouchDB, which has a number of appealing features over relational databases including:
- intuitive REST/HTTP interface
- easy replication
- data stored as documents, rather than normalised tables
I appreciate that this is not a mature product so should be adopted with caution, but am wondering whether it is actually a viable replacement for an RDBMS (in spite of the intro page saying otherwise - http://couchdb.apache.org/docs/intro.html).
- Under what circumstances would CouchDB be a better choice of database than an RDBMS (e.g. MySQL), for instance in terms of scalability, design + development time, reliability and maintenance?
- Are there cases where an RDBMS is still clearly the right choice?
- Is this an either-or choice, or is a hybrid solution more likely to emerge as best practice?
Answered by Andrew Whitehouse
I recently attended the NoSQL conference in London and think I now have a better idea of how to answer the original question. I also wrote a blog post, and there are a couple of other good ones.
Key points:
- We have accumulated probably 30 years of knowledge of administering relational databases, so we shouldn't replace them without careful consideration; non-relational data stores are less mature than relational ones, and so are inherently more risky to adopt
- There are different types of non-relational data store; some are key-value stores, some are document stores, some are graph databases
- You could use a hybrid approach, e.g. a combination of RDBMS and graph data store for a social software site
- Document data stores (e.g. CouchDB and MongoDB) are probably the closest to relational databases and provide a JSON data structure with all the fields presented hierarchically which avoids having to do table joins, and (some might argue) is an improvement on the traditional object-relational mapping that most applications currently use
- Non-relational databases support replication (including master-master); relational databases support replication too but it may not be as comprehensive as the non-relational option
- Very large sites such as Twitter, Digg and Facebook use Cassandra, which is built from the ground up to support clustering
- Relational databases are probably suitable for 90% of cases
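The point about document stores avoiding table joins can be made concrete with a small sketch. It does not talk to CouchDB or MongoDB; the order data, the table names and the helper `load_order_relational` are all invented for illustration. It only contrasts reading one record from normalised tables with reading the same record as a single hierarchical JSON document.

```python
import json

# Relational shape (hypothetical data): three "tables" linked by keys.
customers = {1: {"name": "Ada"}}
orders = {10: {"customer_id": 1}}
order_lines = [
    {"order_id": 10, "sku": "A-1", "qty": 2},
    {"order_id": 10, "sku": "B-7", "qty": 1},
]


def load_order_relational(order_id):
    """Assemble one order by joining the three tables by hand."""
    order = orders[order_id]
    return {
        "_id": f"order-{order_id}",
        "customer": customers[order["customer_id"]],
        "lines": [
            {"sku": line["sku"], "qty": line["qty"]}
            for line in order_lines
            if line["order_id"] == order_id
        ],
    }


# Document shape: the same order stored as one self-contained JSON document,
# so a single lookup by id returns everything without a join.
order_doc = {
    "_id": "order-10",
    "customer": {"name": "Ada"},
    "lines": [{"sku": "A-1", "qty": 2}, {"sku": "B-7", "qty": 1}],
}

print(json.dumps(order_doc, indent=2))
```

The trade-off is visible in the document itself: the customer name is copied into every order, so changing it means updating many documents.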
In summary, consensus seems to be "proceed with caution".
Answered by Zed
Until someone gives a more in-depth answer, here are some pros and cons for CouchDB
Pros:
- you don't need to fit your data into one of those pesky higher-order normal forms
- you can change the "schema" of your data at any time
- your data will be indexed exactly for your queries, so you will get results in constant time.
Cons:
- you need to create a view for each and every query, i.e. ad-hoc queries (such as concatenating dynamic WHERE and ORDER BY clauses in SQL) are not available.
- you will either have redundant data, or you will end up implementing join and sort logic yourself on "client-side" (e.g. sorting a many-to-many relationship on multiple fields)
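To illustrate why every query needs its own view, here is a minimal in-memory sketch of the idea, not CouchDB's actual API. A view is modelled as a map function that emits (key, value) pairs, whose output is sorted by key and then searched. The documents and the names `by_city`, `by_age`, `build_view` and `query` are assumptions made up for this example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical documents.
docs = [
    {"_id": "u1", "type": "user", "city": "London", "age": 31},
    {"_id": "u2", "type": "user", "city": "Paris", "age": 25},
    {"_id": "u3", "type": "user", "city": "London", "age": 44},
]


def by_city(doc):
    """Map function for the 'users by city' query."""
    if doc.get("type") == "user":
        yield doc["city"], doc["_id"]


def by_age(doc):
    """A different question needs a different map function, hence another view."""
    if doc.get("type") == "user":
        yield doc["age"], doc["_id"]


def build_view(map_fn, documents):
    """Run the map function over every document and sort the rows by key."""
    return sorted((key, value) for doc in documents for key, value in map_fn(doc))


def query(view, key):
    """Binary-search the sorted rows for one key.

    The key list is rebuilt on every call for brevity; a real index would
    keep it around.
    """
    keys = [k for k, _ in view]
    lo, hi = bisect_left(keys, key), bisect_right(keys, key)
    return [value for _, value in view[lo:hi]]


city_view = build_view(by_city, docs)
age_view = build_view(by_age, docs)

print(query(city_view, "London"))
print(query(age_view, 25))
```

Asking for "users in London older than 30" fits neither view as written: it would need a third view keyed on both fields, or filtering in the client.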
Pros or Cons:
- creating your views is not as straightforward as in SQL; it's more like solving a puzzle. Whether this is a pro or a con depends on your type :)
Answered by Javier
CouchDB is one of several available 'key/value stores'; others include oldies like BDB, web-oriented ones like Persevere, MongoDB and CouchDB, new super-fast ones like memcached (RAM-only) and Tokyo Cabinet, and huge stores like Hadoop and Google's BigTable (MongoDB also claims to be in this space).
There's certainly space for both key/value stores and relational DBs. Traditionally, most RDBs are considered a layer above key/value. For example, MySQL used to use BDB as an optional backend for tables. In short, key/values know nothing about fields and relationships, which are the foundations of SQL.
Key/value stores are typically easier to scale, which makes them an attractive choice when growing explosively, like Twitter did. Of course, that means that any relationships between the stored values have to be managed in your code, instead of just declared in SQL. CouchDB's approach is to store big 'documents' in the value part, making them (mostly) self-contained, so you can get most of the needed data in a single query. Many use cases fit this idea, others don't.
The current theme I see is that after the "Rails doesn't scale!!" scare, many people are now realizing that it's not about your web framework, but about intelligent caching, to avoid hitting the database, and even the webapp when possible. The rising star there is memcached.
As always, it all depends on your needs.
Answered by Jeremy Wall
This one is a hard question to answer. So I'll try to highlight the areas where CouchDB might work against you.
The two greatest sources of difficulty that people report on the Couch Users and Dev mailing lists are:
- Complex Joins of Data.
- Multi-Step Map/Reduce.
Couch Views are pretty much islands unto themselves. If you need to aggregate/merge/intersect a set of views you pretty much have to do so in the application layer for now. There are some tricks you can do with view collation and complex keys to help with joins, but these only go so far for some types of data. This may or may not be livable for different applications. That being said, many times this problem can be reduced or eliminated by structuring your data differently.
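The view collation trick with complex keys can be sketched in plain Python. This is a simulation under stated assumptions, not CouchDB code: the blog post and comment documents are invented, and Python's list comparison stands in for CouchDB's collation rules, which differ in detail. The idea is that a post emits the key `[post_id, 0]` and each of its comments emits `[post_id, 1]`, so sorting by key places a post directly before its comments.

```python
# Hypothetical documents, deliberately stored out of order.
docs = [
    {"_id": "c2", "type": "comment", "post": "p1", "text": "Second"},
    {"_id": "p1", "type": "post", "title": "Hello"},
    {"_id": "c1", "type": "comment", "post": "p1", "text": "First"},
    {"_id": "p2", "type": "post", "title": "Other"},
]


def post_with_comments(doc):
    """Map function emitting a complex key: [post id, 0 for post / 1 for comment]."""
    if doc["type"] == "post":
        yield [doc["_id"], 0], doc
    elif doc["type"] == "comment":
        yield [doc["post"], 1], doc


# Sort rows by key, breaking ties by document id (an assumption about tie order).
rows = sorted(
    (row for doc in docs for row in post_with_comments(doc)),
    key=lambda row: (row[0], row[1]["_id"]),
)


def fetch_post(rows, post_id):
    """Return the post followed by its comments, read from one sorted view."""
    return [value for key, value in rows if key[0] == post_id]


for doc in fetch_post(rows, "p1"):
    print(doc["type"], doc["_id"])
```

This covers a one-to-many join from a single view. Intersecting the results of two different views still has to happen in the application.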
The comments of the other folks on this question demonstrate some of the different types of data that are well suited to CouchDB.
One other thing to keep in mind is that a lot of the time the data you might need to combine/merge/intersect is data that you would process offline in an RDBMS anyway, so you might not lose anything by doing the same in CouchDB.
Short Answer: I think eventually CouchDB will be able to handle any kind of problem you want to throw at it. But the comfort level you have using it may differ from developer to developer. It's somewhat subjective, I think. I happen to like using a Turing-complete language to query my data and keeping more logic in the application layer. Your mileage may vary.
Answered by Filippo
Sam, you have to take another approach with CouchDB, and in general with map- or document-based databases. You can't define a constraint such as unique, but you can query the data to check whether that email is used and whether that login is used too. That's the right approach; you have to change your mindset.
Answered by Sam
Correct me if I am wrong. CouchDB is useless for cases where you need to validate the uniqueness of docs over multiple fields. For example, it's impossible to enforce a validation rule like "both login and email are required to be unique" and keep the data in a consistent state. You can check that before saving the doc, but someone can push before you and the data becomes inconsistent.
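The race described here can be sketched with a toy in-memory store. It is not a CouchDB client: `TinyStore`, `Conflict` and `register` are hypothetical names, and the only property borrowed from CouchDB is that a document id is unique within one database, so creating an id that already exists is rejected. One field can be made unique by putting it in the id. Two fields need two documents, and nothing ties those two writes together.

```python
class Conflict(Exception):
    """Raised when a document id is already taken (stand-in for a conflict error)."""


class TinyStore:
    """Toy document store in which ids are unique (hypothetical, in-memory)."""

    def __init__(self):
        self.docs = {}

    def create(self, doc_id, body):
        if doc_id in self.docs:
            raise Conflict(doc_id)
        self.docs[doc_id] = body


def register(store, login, email):
    """Reserve login and email as two separate documents.

    The two writes are not atomic: between them, other clients can see a
    login whose email reservation has not happened yet.
    """
    store.create(f"login:{login}", {"email": email})
    try:
        store.create(f"email:{email}", {"login": login})
    except Conflict:
        # Compensate by hand; a crash here would leave an orphaned login doc.
        del store.docs[f"login:{login}"]
        raise


store = TinyStore()
register(store, "sam", "sam@example.com")
try:
    register(store, "samuel", "sam@example.com")
except Conflict as exc:
    print("rejected:", exc)
```

The manual rollback is the part an RDBMS would handle with two unique constraints inside one transaction.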
Answered by Dana the Sane
If you are working with tabular data where there is only a shallow data hierarchy, then an RDBMS is probably your best choice. This is the main use for RDBMS systems, and the documentation and tool support is very good.
For more nested data like XML, a document database should provide faster access to your data. Also, the storage model more closely resembles that of the data, so retrieval should be more straightforward.