Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1145726/

Retrieved: 2020-09-08 07:24:12. Source: igfitidea

What is NoSQL, how does it work, and what benefits does it provide?

Tags: database, nosql

Asked by Matt

I've been hearing things about NoSQL and that it may eventually replace SQL DB storage methods, because DB interaction is often a bottleneck for speed on the web.

So I just have a few questions:

  1. What exactly is it?

  2. How does it work?

  3. Why would it be better than using a SQL Database? And how much better is it?

  4. Is the technology too new to start implementing yet or is it worth taking a look into?

Accepted answer by Michael Borgwardt

  1. What exactly is it?

    On one hand, a specific system, but it has also become a generic word for a variety of new data storage backends that do not follow the relational DB model.

  2. How does it work?

    Each of the systems labelled with the generic name works differently, but the basic idea is to offer better scalability and performance by using DB models that don't support all the functionality of a generic RDBMS, but still enough functionality to be useful. In a way it's like MySQL, which at one time lacked support for transactions but, precisely because of that, managed to outperform other DB systems. If you could write your app in a way that didn't require transactions, it was great.

  3. Why would it be better than using a SQL Database? And how much better is it?

    It would be better when your site needs to scale so massively that the best RDBMS running on the best hardware you can afford and optimized as much as possible simply can't keep up with the load. How much better it is depends on the specific use case (lots of update activity combined with lots of joins is very hard on "traditional" RDBMSs) - could well be a factor of 1000 in extreme cases.

  4. Is the technology too new to start implementing yet or is it worth taking a look into?

    Depends mainly on what you're trying to achieve. It's certainly mature enough to use. But few applications really need to scale that massively. For most, a traditional RDBMS is sufficient. However, with internet usage becoming more ubiquitous all the time, it's quite likely that applications that do will become more common (though probably not dominant).

Answer by Philipp

There is no such thing as NoSQL!

NoSQL is a buzzword.

For decades, when people were talking about databases, they meant relational databases. And when people were talking about relational databases, they meant those you control with the Structured Query Language (SQL), which grew out of Edgar F. Codd's relational model. Storing data in some other way? Madness! Anything else is just flat files.

But in the past few years, people started to question this dogma. People wondered if tables with rows and columns are really the only way to represent data. People started thinking and coding, and came up with many new concepts for how data could be organized. And they started to create new database systems designed for these new ways of working with data.

The philosophies of all these databases were different. But one thing they all had in common was that the Structured Query Language was no longer a good fit for using them. So each database replaced SQL with its own query language. And so the term NoSQL was born, as a label for all database technologies which defy the classic relational database model.

So what do NoSQL databases have in common?

Actually, not much.

You often hear phrases like:

  • NoSQL is scalable!
  • NoSQL is for BigData!
  • NoSQL violates ACID!
  • NoSQL is a glorified key/value store!

Is that true? Well, some of these statements might be true for some databases commonly called NoSQL, but every single one is also false for at least one other. Actually, the only thing NoSQL databases have in common is that they are databases which do not use SQL. That's it. The only thing that defines them is what sets them apart from each other.

So what sets NoSQL databases apart?

So we have made clear that the databases commonly referred to as NoSQL are too different to be evaluated together. Each of them needs to be evaluated separately to decide whether it is a good fit for a specific problem. But where do we begin? Thankfully, NoSQL databases can be grouped into certain categories, which are suitable for different use cases:

Document-oriented

Examples: MongoDB, CouchDB

Strengths: Heterogeneous data, working in an object-oriented way, agile development

Their advantage is that they do not require a consistent data structure. They are useful when your requirements, and thus your database layout, change constantly, or when you are dealing with datasets which belong together but still look very different. When you have a lot of tables with two columns called "key" and "value", then these might be worth looking into.
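
To make the "no consistent data structure" point concrete, here is a minimal sketch in plain Python. It is not MongoDB's or CouchDB's API; the `products` collection and the `find` helper are hypothetical names made up for illustration. It only shows that documents in one collection can carry different fields and still be queried together.

```python
# Hypothetical in-memory "collection": each document is a dict, and
# documents in the same collection need not share the same fields.
products = [
    {"_id": 1, "type": "book", "title": "Dune", "pages": 612},
    {"_id": 2, "type": "shirt", "size": "M", "color": "blue"},
    {"_id": 3, "type": "book", "title": "Emma", "author": "Jane Austen"},
]


def find(collection, **criteria):
    """Return documents whose fields match every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Documents that lack a queried field simply do not match.
books = find(products, type="book")
titles = [doc["title"] for doc in books]
```

Adding a new kind of product with new fields needs no schema change here, which is roughly the flexibility described above.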

Graph databases

Examples: Neo4j, GiraffeDB.

Strengths: Data Mining

While most NoSQL databases abandon the concept of managing data relations, these databases embrace it even more than those so-called relational databases.

Their focus is on defining data by its relation to other data. When you have a lot of tables whose primary keys are the primary keys of two other tables (and maybe some data describing the relation between them), then these might be something for you.
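
As a rough illustration of data defined by its relations, the sketch below stores a small graph as an adjacency dict and walks it breadth-first. This is plain Python, not Neo4j or any real graph database; the `follows` graph and the `within_hops` function are hypothetical. In a relational schema the same question would typically need one self-join on a link table per hop.

```python
from collections import deque

# Hypothetical social graph: each key is a node, each value lists the
# nodes it is directly related to ("follows" edges in this sketch).
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}


def within_hops(graph, start, max_hops):
    """Breadth-first search: nodes reachable from start in <= max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    reached = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                reached.append(neighbour)
                queue.append((neighbour, hops + 1))
    return reached
```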

Key-Value Stores

Examples: Redis, Cassandra, MemcacheDB

Strengths: Fast lookup of values by known keys

They are very simplistic, but that makes them fast and easy to use. When you have no need for stored procedures, constraints, triggers and all those advanced database features and you just want fast storage and retrieval of your data, then those are for you.

Unfortunately they assume that you know exactly what you are looking for. You need the profile of User157641? No problem, will only take microseconds. But what about when you want the names of all users who are aged between 16 and 24, have "waffles" as their favorite food and logged in within the last 24 hours? Tough luck. When you don't have a definite and unique key for a specific result, you can't get it out of your K-V store that easily.
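
The difference between the two kinds of request can be sketched with a plain Python dict standing in for the store. The user records, field names and the `names_matching` helper are invented for illustration (the login-time condition is left out to keep it short); real key-value stores differ in many details.

```python
# Hypothetical key-value store: a plain dict keyed by user id.
users = {
    "User157641": {"name": "Ann", "age": 19, "favorite_food": "waffles"},
    "User157642": {"name": "Ben", "age": 31, "favorite_food": "waffles"},
    "User157643": {"name": "Cid", "age": 22, "favorite_food": "pizza"},
}

# Lookup by known key: a single hash lookup, independent of store size.
profile = users["User157641"]


def names_matching(store, min_age, max_age, food):
    """No key to go by, so every value has to be inspected."""
    return [
        user["name"] for user in store.values()
        if min_age <= user["age"] <= max_age and user["favorite_food"] == food
    ]


young_waffle_fans = names_matching(users, 16, 24, "waffles")
```

The second query works, but only by scanning the whole store, which is exactly what stops being practical once the store is large or distributed.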

Is SQL obsolete?

Some NoSQL proponents claim that their favorite NoSQL database is the new way of doing things, and SQL is a thing of the past.

Are they right?

No, of course they aren't. While there are problems SQL isn't suitable for, it still has its strengths. Lots of data models are simply best represented as a collection of tables which reference each other. This is especially so because most database programmers were trained for decades to think of data in a relational way, and trying to press this mindset onto a new technology which wasn't made for it rarely ends well.

NoSQL databases aren't a replacement for SQL - they are an alternative.

Most software ecosystems around the different NoSQL databases aren't as mature yet. While there are advances, you still haven't got supplemental tools which are as mature and powerful as those available for popular SQL databases.

Also, there is much more know-how for SQL around. Generations of computer scientists have spent decades of their careers on research focusing on relational databases, and it shows: the literature written about SQL databases and relational data modelling, both practical and theoretical, could fill multiple libraries full of books. How to build a relational database for your data is a topic so well-researched it's hard to find a corner case where there isn't a generally accepted by-the-book best practice.

Most NoSQL databases, on the other hand, are still in their infancy. We are still figuring out the best way to use them.

Answer by Carlo Strozzi

Since someone said that my previous post was off-topic, I'll try to compensate :-) NoSQL is not, and never was, intended to be a replacement for more mainstream SQL databases, but a couple of words are in order to get things in the right perspective.

At the very heart of the NoSQL philosophy lies the consideration that, possibly for commercial and portability reasons, SQL engines tend to disregard the tremendous power of the UNIX operating system and its derivatives.

With a filesystem-based database, you can take immediate advantage of the ever-increasing capabilities and power of the underlying operating system, which have been steadily increasing for many years now in accordance with Moore's law. With this approach, many operating-system commands automatically become "database operators" as well (think of "ls", "sort", "find" and the countless other UNIX shell utilities).

With this in mind, and a bit of creativity, you can indeed devise a filesystem-based database that is able to overcome the limitations of many common SQL engines, at least for specific usage patterns, which is the whole point behind NoSQL's philosophy, the way I see it.

I run hundreds of web sites and they all use NoSQL to a greater or lesser extent. In fact, they do not host huge amounts of data, but even if some of them did I could probably think of a creative use of NoSQL and the filesystem to overcome any bottlenecks. Something that would likely be more difficult with traditional SQL "jails". I urge you to google for "unix", "manis" and "shaffer" to understand what I mean.

Answer by CoderTao

If I recall correctly, it refers to types of databases that don't necessarily follow the relational form. Document databases come to mind, databases without a specific structure, and which don't use SQL as a specific query language.

It's generally better suited to web applications that rely on the performance of the database and don't need the more advanced features of relational database engines. For example, a Key->Value store providing a simple query-by-id interface might be 10-100x faster than the corresponding SQL server implementation, with a lower developer maintenance cost.

One example is this paper for an OLTP Tuple Store, which sacrificed transactions for single-threaded processing (no concurrency problems because no concurrency is allowed), and kept all data in memory, achieving 10-100x better performance compared to a similar RDBMS-driven system. Basically, it's moving away from the 'One Size Fits All' view of SQL and database systems.

Answer by Gopi Nathan

In practice, NoSQL is a database system which supports fast access to large binary objects (docs, jpgs etc) using a key-based access strategy. This is a departure from traditional SQL access, which is only good enough for alphanumeric values. Not only the internal storage and access strategy but also the syntax and the limitations on the display format restrict traditional SQL. BLOB implementations of traditional relational databases suffer from these restrictions too.

Behind the scenes, it is an indirect admission of the failure of the SQL model to support any form of OLTP or new data formats. "Support" means not just storage but full access capabilities, both programmatic and query-wise, using the standard model.

Relational enthusiasts were quick to modify the definition of NoSQL from Not-SQL to Not-Only-SQL to keep SQL in the picture! This is not good, especially when we see that most Java programs today resort to ORM mapping of the underlying relational model. A new concept must have a clear-cut definition. Else it will end up like SOA.

The basis of the NoSQL systems lies in the random key-value pair. But this is not new. Traditional database systems like IMS and IDMS did support hashed random keys (without making use of any index) and they still do. In fact IDMS already has a keyword NONSQL, used where they support SQL access to their older network database, which they termed NONSQL.

Answer by David Xu

NoSQL is a database system which doesn't use string based SQL queries to fetch data.

Instead, you build queries using an API the database provides. Amazon DynamoDB is a good example of a NoSQL database.
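
A small sketch of the contrast, under stated assumptions: the SQL half uses Python's standard `sqlite3` module, while the `Table` class and its `put_item`/`get_item` methods are hypothetical stand-ins written for this example and are not DynamoDB's (or any vendor's) actual API.

```python
import sqlite3

# SQL style: the query is a string that the database engine parses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("u1", "Ann"))
sql_row = conn.execute(
    "SELECT name FROM users WHERE id = ?", ("u1",)
).fetchone()


class Table:
    """Hypothetical API-style table: queries are method calls, not strings."""

    def __init__(self, key_name):
        self._key_name = key_name
        self._items = {}

    def put_item(self, item):
        # The item is stored under the value of its key attribute.
        self._items[item[self._key_name]] = item

    def get_item(self, key):
        # Returns None when no item has that key.
        return self._items.get(key)


users = Table(key_name="id")
users.put_item({"id": "u1", "name": "Ann"})
api_item = users.get_item("u1")
```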

NoSQL databases are better for large applications where scalability is important.

Answer by Matthew Flaschen

NoSQL, the actual program, appears to be a relational database implemented in awk using flat files on the backend. Though they profess, "NoSQL essentially has no arbitrary limits, and can work where other products can't. For example there is no limit on data field size, the number of columns, or file size", I don't think it is the large-scale database of the future.

As Joel says, massively scalable databases like BigTable or HBase are much more interesting. GQL is the query language associated with BigTable and App Engine. It's largely SQL tweaked to avoid features Google considers bottlenecks (like joins). However, I haven't heard this referred to as "NoSQL" before.

Answer by Joel Coehoorn

It's like Jacuzzi: both a brand and a generic name. It's not just a specific technology, but rather a specific type of technology, in this case referring to large-scale (often sparse) "databases" like Google's BigTable or CouchDB.

Answer by Arun C

Does NoSQL mean non-relational database?

Yes, NoSQL is different from RDBMS and OLAP. It uses looser consistency models than traditional relational databases.

Consistency models are used in distributed systems like distributed shared memory systems or distributed data store.

How does it work internally?

NoSQL database systems are often highly optimized for retrieval and appending operations and often offer little functionality beyond record storage (e.g. key-value stores). The reduced run-time flexibility compared to full SQL systems is compensated by marked gains in scalability and performance for certain data models.
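
One way to picture "optimized for retrieval and appending" is an append-only log with an in-memory index. The sketch below is only that, a sketch: `AppendOnlyStore` is a hypothetical class, the log is an in-memory buffer standing in for a file, and real systems add compaction, durability and concurrency handling that are omitted here.

```python
import io
import json


class AppendOnlyStore:
    """Hypothetical store: records are only ever appended, and an
    in-memory index maps each key to the offset of its latest record."""

    def __init__(self):
        self._log = io.BytesIO()   # stands in for a file on disk
        self._index = {}           # key -> byte offset of latest record

    def put(self, key, value):
        # Writes always go to the end of the log; nothing is updated in place.
        offset = self._log.seek(0, io.SEEK_END)
        line = json.dumps({"key": key, "value": value}) + "\n"
        self._log.write(line.encode("utf-8"))
        self._index[key] = offset

    def get(self, key):
        # One index lookup plus one read at a known offset.
        self._log.seek(self._index[key])
        return json.loads(self._log.readline())["value"]


store = AppendOnlyStore()
store.put("a", 1)
store.put("b", {"x": 2})
store.put("a", 3)  # supersedes the first record without rewriting it
```

Reads and writes stay cheap, but there is no way to ask anything other than "give me the value for this key", which matches the reduced flexibility mentioned above.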

It can work on structured and unstructured data. It uses collections instead of tables.

How do you query such a "database"?

Watch SQL vs NoSQL: Battle of the Backends; it explains it all.
