SQL: JOIN operations with NoSQL

Question

Asked by Sri

I have gone through some articles about Bigtable and NoSQL. It is very interesting that they avoid JOIN operations.

As a basic example, let's take Employee and Department tables and assume the data is spread across multiple tables/servers.

Just want to know, if data is spread across multiple servers, how do we do JOIN or UNION operations?

Answer 1

Answered by MarkR

When you have extremely large data, you probably want to avoid joins. This is because the overhead of an individual key lookup is relatively large (the service needs to figure out which node(s) to query, and query them in parallel and wait for responses). By overhead, I mean latency, not throughput limitation.

This makes joins suck really badly, as you'd need to do a lot of foreign key lookups, which would end up going to many, many different nodes (in many cases). So you'd want to avoid this as a pattern.
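
To make the lookup cost concrete, here is a rough sketch in Python. Everything in it is hypothetical (the `ShardedStore` class, the `emp:`/`dept:` key scheme, the sample records); it is a toy in-memory stand-in for a distributed store, not any real product's API. Each `get()` counts as one network round trip.

```python
import zlib


class ShardedStore:
    """Toy key-value store split across several in-memory 'nodes' (hypothetical)."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]
        self.round_trips = 0  # each get() stands in for one network round trip

    def _node_for(self, key):
        # Stable hash so the same key always lands on the same node.
        return zlib.crc32(key.encode("utf-8")) % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        self.round_trips += 1
        return self.nodes[self._node_for(key)].get(key)


def naive_join(store, employee_keys):
    """Join each employee to its department with one extra lookup per row."""
    rows = []
    for emp_key in employee_keys:
        emp = store.get(emp_key)                    # lookup 1: the employee
        dept = store.get("dept:" + emp["dept_id"])  # lookup 2: the foreign key
        rows.append((emp["name"], dept["name"]))
    return rows


store = ShardedStore(node_count=4)
store.put("dept:d1", {"name": "Engineering"})
store.put("dept:d2", {"name": "Sales"})
store.put("emp:1", {"name": "Ann", "dept_id": "d1"})
store.put("emp:2", {"name": "Bob", "dept_id": "d2"})
store.put("emp:3", {"name": "Cy", "dept_id": "d1"})

joined = naive_join(store, ["emp:1", "emp:2", "emp:3"])
```

Three employees cost six lookups here, and the count grows linearly with the number of rows. In a real cluster each of those lookups pays the latency described above.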

If it doesn't happen very often, you could probably take the hit, but if you're going to want to do a lot of them, it may be worth "denormalising" the data.

The kind of stuff which gets stored in NoSQL stores is typically pretty "abnormal" in the first place. It is not uncommon to duplicate the same data in all sorts of different places to make lookups easier.
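
A minimal sketch of what that duplication can look like, using plain Python dicts in place of a real store. The record layout and the `denormalise` / `rename_department` helpers are made up for illustration.

```python
# Normalised: the employee only points at the department.
departments = {"d1": {"name": "Engineering"}}
employees_normalised = {
    "e1": {"name": "Ann", "dept_id": "d1"},
    "e2": {"name": "Cy", "dept_id": "d1"},
}


def denormalise(employees, departments):
    """Copy the department name into every employee record."""
    result = {}
    for emp_id, emp in employees.items():
        copy = dict(emp)  # leave the normalised input untouched
        copy["dept_name"] = departments[emp["dept_id"]]["name"]
        result[emp_id] = copy
    return result


def rename_department(employees, dept_id, new_name):
    """The price of duplication: every copy has to be rewritten."""
    touched = 0
    for emp in employees.values():
        if emp["dept_id"] == dept_id:
            emp["dept_name"] = new_name
            touched += 1
    return touched


employees = denormalise(employees_normalised, departments)
```

Reading an employee together with its department name is now a single lookup. The trade-off shows up on writes: renaming one department means touching every employee record that carries a copy of the name.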

Additionally, most NoSQL stores don't (really) support secondary indexes either, which means you have to duplicate stuff if you want to query by any other criterion.
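
One common workaround is to maintain the index yourself as a second mapping that is updated on every write. The sketch below assumes a hypothetical `EmployeeStore` class backed by in-memory dicts; a real store would keep the second mapping under its own keys.

```python
from collections import defaultdict


class EmployeeStore:
    """Primary data keyed by employee id, plus a hand-maintained index by department."""

    def __init__(self):
        self.by_id = {}
        self.ids_by_dept = defaultdict(set)  # the "secondary index"

    def put(self, emp_id, record):
        old = self.by_id.get(emp_id)
        if old is not None:
            # Remove the stale index entry before writing the new one.
            self.ids_by_dept[old["dept_id"]].discard(emp_id)
        self.by_id[emp_id] = record
        self.ids_by_dept[record["dept_id"]].add(emp_id)

    def in_department(self, dept_id):
        # Sorted only to make the result order predictable.
        return [self.by_id[i] for i in sorted(self.ids_by_dept.get(dept_id, ()))]


emp_store = EmployeeStore()
emp_store.put("e1", {"name": "Ann", "dept_id": "d1"})
emp_store.put("e2", {"name": "Bob", "dept_id": "d2"})
emp_store.put("e3", {"name": "Cy", "dept_id": "d1"})
```

Every write now has to update two places, and in a distributed store those two updates are usually not atomic, so the index can briefly disagree with the data.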

If you're storing data such as employees and departments, you're really better off with a conventional database.

Answer 2

Answered by Kaleb Brasee

You would have to do multiple selects, and join the data manually in your application. See this SO post for more information. From that post:

Bigtable datasets can be queried from services like AppEngine using a language called GQL ("gee-kwal") which is based on a subset of SQL. Conspicuously missing from GQL is any sort of JOIN command. Because of the distributed nature of a Bigtable database, performing a join between two tables would be terribly inefficient. Instead, the programmer has to implement such logic in his application, or design his application so as to not need it.
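
As a rough illustration of "multiple selects, joined manually", here is a Python sketch. The two `fetch_*` functions are hypothetical stand-ins for whatever queries your datastore client offers, and the data is made up. The idea is to fetch one side, collect the foreign keys, fetch the other side in one batch, and match them up in memory.

```python
def fetch_employees():
    # Stand-in for the first "select" (hypothetical data).
    return [
        {"id": 1, "name": "Ann", "dept_id": 10},
        {"id": 2, "name": "Bob", "dept_id": 20},
        {"id": 3, "name": "Cy", "dept_id": 30},  # department 30 does not exist
    ]


def fetch_departments(dept_ids):
    # Stand-in for the second "select": one batched fetch by key.
    all_departments = {10: "Engineering", 20: "Sales"}
    return {d: all_departments[d] for d in dept_ids if d in all_departments}


def join_in_application():
    employees = fetch_employees()
    wanted = {e["dept_id"] for e in employees}  # de-duplicated foreign keys
    departments = fetch_departments(wanted)
    # Inner-join semantics: employees without a matching department are dropped.
    return [
        (e["name"], departments[e["dept_id"]])
        for e in employees
        if e["dept_id"] in departments
    ]


joined_rows = join_in_application()
```

Batching the second fetch keeps this at two round trips instead of one per employee. Whether a batched get is available depends on the store you use.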

Answer 3

Answered by Ken Fox

Kaleb's right. You write custom code with a NoSQL solution if your data doesn't fit well into a key-value store. Map-reduce/async processing and custom view caches are common. Brian Aker gave a very funny (and satirical and biased) presentation at the Nov 2009 OpenSQLCamp: http://www.youtube.com/watch?v=LhnGarRsKnA (skip ahead 40 seconds to hear about joins).
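
For a feel of how a map-reduce style join works, here is a single-process Python imitation. It is only a sketch of the idea (tag each record with its source, group by the join key, combine within each group) and does not use any real map-reduce framework; all names and data are hypothetical.

```python
from collections import defaultdict


def map_phase(employees, departments):
    """Emit (join_key, tagged_value) pairs from both inputs."""
    for emp in employees:
        yield emp["dept_id"], ("emp", emp["name"])
    for dept in departments:
        yield dept["id"], ("dept", dept["name"])


def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Within each key, pair every employee with every matching department."""
    for key in sorted(groups):
        dept_names = [v for tag, v in groups[key] if tag == "dept"]
        emp_names = [v for tag, v in groups[key] if tag == "emp"]
        for dept_name in dept_names:
            for emp_name in emp_names:
                yield emp_name, dept_name


employee_rows = [
    {"name": "Ann", "dept_id": "d1"},
    {"name": "Bob", "dept_id": "d2"},
    {"name": "Cy", "dept_id": "d1"},
]
department_rows = [
    {"id": "d1", "name": "Engineering"},
    {"id": "d2", "name": "Sales"},
]

mr_joined = list(reduce_phase(shuffle(map_phase(employee_rows, department_rows))))
```

In a real cluster the shuffle step is what moves matching records onto the same node, which is why this kind of join is run as a batch job and its result cached as a view, not computed per request.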
