What are the uses for a SQL cross join?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/219716/

Time: 2020-09-01 00:00:35  Source: igfitidea

What are the uses for Cross Join?

Tags: sql, database, join, relational-database

Asked by Llyle

A cross join performs a cartesian product on the tuples of the two sets.

SELECT *
FROM Table1
CROSS JOIN Table2

Which circumstances render such an SQL operation particularly useful?

Accepted answer by Dave DuPlantis

If you have a "grid" that you want to populate completely, like size and color information for a particular article of clothing:

select 
    size,
    color
from
    sizes CROSS JOIN colors
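
To see the grid being filled in, here is a minimal sketch that runs the query above against an in-memory SQLite database from Python. The `sizes` and `colors` table layouts and the sample values are assumptions made for illustration; they are not from the original answer.

```python
import sqlite3

# Hypothetical lookup tables; column names and sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sizes  (size  TEXT);
    CREATE TABLE colors (color TEXT);
    INSERT INTO sizes  VALUES ('S'), ('M'), ('L');
    INSERT INTO colors VALUES ('red'), ('blue');
""")

# Every size is paired with every color: 3 x 2 = 6 rows.
grid = conn.execute(
    "SELECT size, color FROM sizes CROSS JOIN colors"
).fetchall()

for size, color in grid:
    print(size, color)
```

The row count of a cross join is always the product of the two input row counts, which is why it suits a grid that must have no gaps.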

Maybe you want a table that contains a row for every minute in the day, and you want to use it to verify that a procedure has executed each minute, so you might cross two tables:

select
    hour,
    minute
from
    hours CROSS JOIN minutes
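
One way the "verify that a procedure has executed each minute" check might look is sketched below, again using SQLite from Python. The `run_log` table, its columns, and the two skipped minutes are hypothetical; the idea is simply to left join the full hour-by-minute grid to the log and keep the rows with no match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hours   (hour   INTEGER);
    CREATE TABLE minutes (minute INTEGER);
    CREATE TABLE run_log (hour INTEGER, minute INTEGER);  -- hypothetical log table
""")
conn.executemany("INSERT INTO hours VALUES (?)", [(h,) for h in range(24)])
conn.executemany("INSERT INTO minutes VALUES (?)", [(m,) for m in range(60)])

# Assumed sample data: the procedure ran every minute except 03:17 and 14:05.
skipped = {(3, 17), (14, 5)}
conn.executemany(
    "INSERT INTO run_log VALUES (?, ?)",
    [(h, m) for h in range(24) for m in range(60) if (h, m) not in skipped],
)

# The cross join yields one row per minute of the day: 24 x 60 = 1440.
all_minutes = conn.execute(
    "SELECT COUNT(*) FROM hours CROSS JOIN minutes"
).fetchone()[0]

# Minutes in the grid that have no matching log entry.
missing = conn.execute("""
    SELECT h.hour, m.minute
    FROM hours AS h
    CROSS JOIN minutes AS m
    LEFT JOIN run_log AS r
           ON r.hour = h.hour AND r.minute = m.minute
    WHERE r.hour IS NULL
    ORDER BY h.hour, m.minute
""").fetchall()

print(all_minutes, missing)
```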

Or you have a set of standard report specs that you want to apply to every month in the year:

select
    specId,
    month
from
    reports CROSS JOIN months

The problem with maintaining these as views is that in most cases, you don't want a complete product, particularly with respect to clothes. You can add MINUS logic to the query to remove certain combinations that you don't carry, but you might find it easier to populate a table some other way and not use a Cartesian product.
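
A rough sketch of removing combinations you don't carry follows. `MINUS` is Oracle's spelling of the set-difference operator; standard SQL and SQLite call it `EXCEPT`, which is what this example uses. The `not_carried` table and all sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sizes       (size  TEXT);
    CREATE TABLE colors      (color TEXT);
    CREATE TABLE not_carried (size TEXT, color TEXT);  -- hypothetical exclusion list
    INSERT INTO sizes       VALUES ('S'), ('M'), ('L');
    INSERT INTO colors      VALUES ('red'), ('blue');
    INSERT INTO not_carried VALUES ('L', 'red');
""")

# Full product minus the excluded combinations (EXCEPT here, MINUS in Oracle).
carried = conn.execute("""
    SELECT size, color FROM sizes CROSS JOIN colors
    EXCEPT
    SELECT size, color FROM not_carried
    ORDER BY size, color
""").fetchall()

print(carried)
```

Whether this beats maintaining an explicit table of carried combinations depends on how many exceptions there are; with many exclusions the explicit table is likely simpler.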

Also, you might end up trying the cross join on tables that have perhaps a few more rows than you thought, or perhaps your WHERE clause was partially or completely missing. In that case, your DBA will notify you promptly of the omission. Usually he or she will not be happy.

Answer by Ovidiu Pacurar

Generate data for testing.

Answer by Randy

You're typically not going to want a full Cartesian product for most database queries. The whole power of relational databases is that you can apply whatever restrictions you might be interested in to allow you to avoid pulling unnecessary rows from the db.

I suppose one contrived example where you might want that is if you have a table of employees and a table of jobs that need doing and want to see all possible assignments of one employee to one job.

Answer by James Curran

Ok, this probably won't answer the question, but, if it's true (and I'm not even sure of that) it's a fun bit of history.

In the early days of Oracle, one of the developers realized that he needed to duplicate every row in a table (for example, it's possible it was a table of events and he needed to change it to separate "start event" and "end event" entries). He realized that if he had a table with just two rows, he could do a cross join, selecting just the columns in the first table, and get exactly what he needed. So he created a simple table, which he naturally enough called "DUAL".

Later, he needed to do something which could only be done via a select from a table, even though the action itself had nothing to do with the table (perhaps he forgot his watch and wanted to read the time via SELECT SYSDATE FROM...). He realized that he still had his DUAL table lying around, and used that. After a while, he tired of seeing the time printed twice, so he eventually deleted one of the rows.

Others at Oracle started using his table, and eventually, it was decided to include it in the standard Oracle installation.

Which explains why a table whose only significance is that it has one row has a name which means "two".

Answer by Kevin Dostalek

The key is "show me all possible combinations". I've used these in conjunction with other calculated fields and then sorted/filtered those.

For example, say you are building an arbitrage (trading) application. You have sellers offering products at a price and buyers asking for products at a cost. You do a cross join on the product key (to match up the potential buyers and sellers), calculate the spread between cost and price, then sort on the spread in descending order to give you (the middleman) the most profitable trades to execute. Almost always you'll have other bounding filter criteria, of course.
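
As an illustration only, the matching step could be sketched like this in SQLite from Python. The `offers` and `bids` tables, their columns, and the sample figures are all hypothetical. Note that a cross join restricted by an equality condition on the product key returns the same rows as an inner join on that key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: sellers' asking prices and buyers' offered costs.
    CREATE TABLE offers (seller TEXT, product TEXT, price INTEGER);
    CREATE TABLE bids   (buyer  TEXT, product TEXT, cost  INTEGER);
    INSERT INTO offers VALUES
        ('alice', 'widget', 10), ('bob', 'widget', 12), ('carol', 'gadget', 30);
    INSERT INTO bids VALUES
        ('dave', 'widget', 15), ('erin', 'gadget', 28), ('frank', 'widget', 11);
""")

# Pair every offer with every bid, keep same-product pairs that make money,
# and list the widest spread first.
trades = conn.execute("""
    SELECT o.product, o.seller, b.buyer, b.cost - o.price AS spread
    FROM offers AS o
    CROSS JOIN bids AS b
    WHERE o.product = b.product
      AND b.cost > o.price
    ORDER BY spread DESC
""").fetchall()

for trade in trades:
    print(trade)
```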

Answer by Joel Coehoorn

Take something like a digits table, which has ten rows for the digits 0-9. You can cross join that table with itself a few times to get a result that has however many rows you need, with the results numbered appropriately. This has a number of uses. For example, you can combine it with a dateadd() function to get a row for every day in a given year.
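
The digits-table trick might be sketched as follows. The table name `digits`, the column `d`, and the year 2021 are assumptions. SQLite has no `dateadd()`, so this sketch swaps in SQLite's `date()` function with a `'+N days'` modifier to do the same job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE digits (d INTEGER)")  # hypothetical ten-row table
conn.executemany("INSERT INTO digits VALUES (?)", [(d,) for d in range(10)])

# Three copies of the table give 10 x 10 x 10 = 1000 rows, numbered 0..999.
numbers_sql = """
    SELECT hundreds.d * 100 + tens.d * 10 + ones.d AS n
    FROM digits AS ones
    CROSS JOIN digits AS tens
    CROSS JOIN digits AS hundreds
"""
numbers = [row[0] for row in conn.execute(numbers_sql + " ORDER BY n")]

# Turn the first 365 numbers into one date per day of 2021 (not a leap year).
days = [row[0] for row in conn.execute(f"""
    SELECT date('2021-01-01', '+' || n || ' days') AS day
    FROM ({numbers_sql})
    WHERE n < 365
    ORDER BY n
""")]

print(len(numbers), days[0], days[-1])
```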

Answer by Jeff Jones

This is an interesting way to use a cross join to create a crosstab report. I found it in Joe Celko's SQL For Smarties, and have used it several times. It does take a little setup, but has been worth the time invested.

Answer by HankerPL

You can use CROSS JOIN to:
  • generate data for testing purposes
  • combine all properties, when you need all possible combinations of e.g. blood groups (A, B, ...) with Rh-/+, etc. Tune it for your purposes ;) I'm not an expert in this area ;)

CREATE TABLE "HR"."BL_GRP_01" 
("GR_1" VARCHAR2(5 BYTE));
REM INSERTING into BL_GRP_01
SET DEFINE OFF;
Insert into BL_GRP_01 (GR_1) values ('A');
Insert into BL_GRP_01 (GR_1) values ('B');
Insert into BL_GRP_01 (GR_1) values ('O');
Insert into BL_GRP_01 (GR_1) values (NULL);

CREATE TABLE "HR"."BL_GRP_02" 
("GR_1" VARCHAR2(5 BYTE));

REM INSERTING into BL_GRP_02
SET DEFINE OFF;
Insert into BL_GRP_02 (GR_1) values ('A');
Insert into BL_GRP_02 (GR_1) values ('B');
Insert into BL_GRP_02 (GR_1) values ('O');
Insert into BL_GRP_02 (GR_1) values (NULL);

CREATE TABLE "HR"."RH_VAL_01" 
("RH_VAL" VARCHAR2(5 BYTE));
REM INSERTING into RH_VAL_01
SET DEFINE OFF;
Insert into RH_VAL_01 (RH_VAL) values ('+');
Insert into RH_VAL_01 (RH_VAL) values ('-');
Insert into RH_VAL_01 (RH_VAL) values (NULL);

select distinct a.GR_1 || b.GR_1 || c.RH_VAL as BL_GRP
from BL_GRP_01 a
cross join BL_GRP_02 b
cross join RH_VAL_01 c;

  • create a join for 2 tables without a common id and then group it using max(), etc. to find the highest possible combination

Answer by thoroughly

Imagine you had a series of queries you want to issue over a specific combination of items and dates (prices, availability, etc..). You could load the items and dates into separate temp tables and have your queries cross join the tables. This may be more convenient than the alternative of enumerating the items and dates in IN clauses, especially since some databases limit the number of elements in an IN clause.
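
A small sketch of that temp-table approach, run against SQLite from Python, is shown here. The `prices` table and the temp tables `wanted_items` and `wanted_days` are hypothetical names with made-up rows. The cross join builds every item/date combination once, and each query then joins against it instead of carrying two long IN lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical fact table to be queried.
    CREATE TABLE prices (item TEXT, day TEXT, price INTEGER);
    INSERT INTO prices VALUES
        ('apple', '2024-01-01', 3), ('apple', '2024-01-02', 4),
        ('pear',  '2024-01-01', 5), ('kiwi',  '2024-01-01', 9);

    -- Load the items and dates of interest into separate temp tables.
    CREATE TEMP TABLE wanted_items (item TEXT);
    CREATE TEMP TABLE wanted_days  (day  TEXT);
    INSERT INTO wanted_items VALUES ('apple'), ('pear');
    INSERT INTO wanted_days  VALUES ('2024-01-01'), ('2024-01-02');
""")

# Every wanted item/day pair, with its price when one exists (NULL otherwise).
rows = conn.execute("""
    SELECT w.item, w.day, p.price
    FROM (SELECT item, day FROM wanted_items CROSS JOIN wanted_days) AS w
    LEFT JOIN prices AS p
           ON p.item = w.item AND p.day = w.day
    ORDER BY w.item, w.day
""").fetchall()

for row in rows:
    print(row)
```

Using a left join keeps combinations that have no data, which makes gaps visible; an inner join would drop them.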
