When should I use cross apply over inner join?
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/1139160/
Asked by Jeff Meatball Yang
What is the main purpose of using CROSS APPLY?
I have read (vaguely, through posts on the Internet) that cross apply can be more efficient when selecting over large data sets if you are partitioning. (Paging comes to mind.)
I also know that CROSS APPLY doesn't require a UDF as the right-hand table.
In most INNER JOIN queries (one-to-many relationships), I could rewrite them to use CROSS APPLY, but they always give me equivalent execution plans.
Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?
Edit:
Here's a trivial example, where the execution plans are exactly the same. (Show me one where they differ and where cross apply is faster/more efficient.)
create table Company (
companyId int identity(1,1)
, companyName varchar(100)
, zipcode varchar(10)
, constraint PK_Company primary key (companyId)
)
GO
create table Person (
personId int identity(1,1)
, personName varchar(100)
, companyId int
, constraint FK_Person_CompanyId foreign key (companyId) references dbo.Company(companyId)
, constraint PK_Person primary key (personId)
)
GO
insert Company
select 'ABC Company', '19808' union
select 'XYZ Company', '08534' union
select '123 Company', '10016'
insert Person
select 'Alan', 1 union
select 'Bobby', 1 union
select 'Chris', 1 union
select 'Xavier', 2 union
select 'Yoshi', 2 union
select 'Zambrano', 2 union
select 'Player 1', 3 union
select 'Player 2', 3 union
select 'Player 3', 3
/* using CROSS APPLY */
select *
from Person p
cross apply (
select *
from Company c
where p.companyid = c.companyId
) Czip
/* the equivalent query using INNER JOIN */
select *
from Person p
inner join Company c on p.companyid = c.companyId
Accepted answer by Quassnoi
Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?
See the article in my blog for a detailed performance comparison.
CROSS APPLY works better on things that have no simple JOIN condition.
This one selects the 3 last records from t2 for each record from t1:
SELECT t1.*, t2o.*
FROM t1
CROSS APPLY
(
SELECT TOP 3 *
FROM t2
WHERE t2.t1_id = t1.id
ORDER BY
t2.rank DESC
) t2o
It cannot be easily formulated with an INNER JOIN condition.
You could probably do something like that using CTEs and a window function:
WITH t2o AS
(
SELECT t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank DESC) AS rn -- DESC so that rn <= 3 picks the same rows as TOP 3 ... ORDER BY t2.rank DESC
FROM t2
)
SELECT t1.*, t2o.*
FROM t1
INNER JOIN
t2o
ON t2o.t1_id = t1.id
AND t2o.rn <= 3
, but this is less readable and probably less efficient.
Update:
Just checked.
master is a table of about 20,000,000 records with a PRIMARY KEY on id.
This query:
WITH q AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
FROM master
),
t AS
(
SELECT 1 AS id
UNION ALL
SELECT 2
)
SELECT *
FROM t
JOIN q
ON q.rn <= t.id
runs for almost 30 seconds, while this one:
WITH t AS
(
SELECT 1 AS id
UNION ALL
SELECT 2
)
SELECT *
FROM t
CROSS APPLY
(
SELECT TOP (t.id) m.*
FROM master m
ORDER BY
id
) q
is instant.
Answer by nurettin
cross apply sometimes enables you to do things that you cannot do with inner join.
Example (a syntax error):
select F.* from sys.objects O
inner join dbo.myTableFun(O.name) F
on F.schema_id= O.schema_id
This is a syntax error because, when used with inner join, table functions can only take variables or constants as parameters. (I.e., the table function parameter cannot depend on another table's column.)
However:
select F.* from sys.objects O
cross apply ( select * from dbo.myTableFun(O.name) ) F
where F.schema_id= O.schema_id
This is legal.
Edit: Or alternatively, shorter syntax: (by ErikE)
select F.* from sys.objects O
cross apply dbo.myTableFun(O.name) F
where F.schema_id= O.schema_id
Edit:
Note: Informix 12.10 xC2+ has Lateral Derived Tables and PostgreSQL (9.3+) has Lateral Subqueries, which can be used to a similar effect.
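To make the PostgreSQL remark concrete, here is a rough sketch of the "top 3 per row" query from the accepted answer written with a lateral subquery. It is PostgreSQL syntax, not T-SQL, and the two CTEs with their sample values are assumptions added only so the statement can run on its own.

```sql
-- PostgreSQL 9.3+ sketch; t1/t2 and their sample rows are hypothetical stand-ins
WITH t1 (id) AS (
    VALUES (1), (2)
),
t2 (t1_id, rank) AS (
    VALUES (1, 10), (1, 20), (1, 30), (1, 40), (2, 5)
)
SELECT t1.id, t2o.rank
FROM t1
CROSS JOIN LATERAL (
    -- the subquery may reference t1 because of LATERAL,
    -- just as the right side of CROSS APPLY may in SQL Server
    SELECT t2.rank
    FROM t2
    WHERE t2.t1_id = t1.id
    ORDER BY t2.rank DESC
    LIMIT 3
) AS t2o;
```

CROSS JOIN LATERAL plays the role of CROSS APPLY here; LEFT JOIN LATERAL ... ON true would be the closest counterpart of OUTER APPLY.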
Answer by Sarath Avanavu
Consider you have two tables.
MASTER TABLE
x------x--------------------x
| Id | Name |
x------x--------------------x
| 1 | A |
| 2 | B |
| 3 | C |
x------x--------------------x
DETAILS TABLE
x------x--------------------x-------x
| Id | PERIOD | QTY |
x------x--------------------x-------x
| 1 | 2014-01-13 | 10 |
| 1 | 2014-01-11 | 15 |
| 1 | 2014-01-12 | 20 |
| 2 | 2014-01-06 | 30 |
| 2 | 2014-01-08 | 40 |
x------x--------------------x-------x
There are many situations where we need to replace INNER JOIN with CROSS APPLY.
1. Join two tables based on TOP n results
Consider that we need to select Id and Name from Master and the last two dates for each Id from the Details table.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
INNER JOIN
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
ORDER BY CAST(PERIOD AS DATE)DESC
)D
ON M.ID=D.ID
The above query generates the following result.
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
x------x---------x--------------x-------x
See, it took the last two dates across the whole Details table, together with their Id, and only then joined those records to Master on Id in the outer query, which is wrong. This should return both Ids 1 and 2, but it returned only 1 because Id 1 has the last two dates overall. To accomplish this, we need to use CROSS APPLY.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
CROSS APPLY
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
WHERE M.ID=D.ID
ORDER BY CAST(PERIOD AS DATE)DESC
)D
and forms the following result.
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-08 | 40 |
| 2 | B | 2014-01-06 | 30 |
x------x---------x--------------x-------x
Here's how it works. The query inside CROSS APPLY can reference the outer table, whereas INNER JOIN cannot do this (it throws a compile error). When finding the last two dates, the join is done inside CROSS APPLY, i.e., WHERE M.ID=D.ID.
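One thing the result above also shows: Master row 3 (C) is missing, because it has no rows in Details and CROSS APPLY, like INNER JOIN, drops outer rows with an empty right side. A minimal sketch of the OUTER APPLY variant follows, which keeps such rows much as a LEFT JOIN would. The two CTEs just inline the sample data so the statement runs on its own, and the inner alias DT is my own naming, not part of the original answer.

```sql
-- Sample data inlined as CTEs (an assumption for self-containment)
WITH MASTER (ID, NAME) AS (
    SELECT * FROM (VALUES (1, 'A'), (2, 'B'), (3, 'C')) AS v (ID, NAME)
),
DETAILS (ID, PERIOD, QTY) AS (
    SELECT * FROM (VALUES
        (1, '2014-01-13', 10), (1, '2014-01-11', 15), (1, '2014-01-12', 20),
        (2, '2014-01-06', 30), (2, '2014-01-08', 40)
    ) AS v (ID, PERIOD, QTY)
)
SELECT M.ID, M.NAME, D.PERIOD, D.QTY
FROM MASTER M
OUTER APPLY
(
    -- evaluated once per MASTER row; an empty result yields NULLs instead of dropping the row
    SELECT TOP 2 DT.ID, DT.PERIOD, DT.QTY
    FROM DETAILS DT
    WHERE DT.ID = M.ID
    ORDER BY CAST(DT.PERIOD AS DATE) DESC
) D
ORDER BY M.ID, D.PERIOD DESC;
```

With this data it should return the same four rows as the CROSS APPLY query plus one extra row for Id 3 (C) with NULL in PERIOD and QTY.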
2. When we need INNER JOIN functionality using functions.
CROSS APPLY can be used as a replacement for INNER JOIN when we need to get results from the Master table and a function.
SELECT M.ID,M.NAME,C.PERIOD,C.QTY
FROM MASTER M
CROSS APPLY dbo.FnGetQty(M.ID) C
And here is the function
CREATE FUNCTION FnGetQty
(
@Id INT
)
RETURNS TABLE
AS
RETURN
(
SELECT ID,PERIOD,QTY
FROM DETAILS
WHERE ID=@Id
)
which generated the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-11 | 15 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-06 | 30 |
| 2 | B | 2014-01-08 | 40 |
x------x---------x--------------x-------x
ADDITIONAL ADVANTAGE OF CROSS APPLY
APPLY can be used as a replacement for UNPIVOT. Either CROSS APPLY or OUTER APPLY can be used here; they are interchangeable.
Consider you have the table below (named MYTABLE).
x------x-------------x--------------x
| Id | FROMDATE | TODATE |
x------x-------------x--------------x
| 1 | 2014-01-11 | 2014-01-13 |
| 1 | 2014-02-23 | 2014-02-27 |
| 2 | 2014-05-06 | 2014-05-30 |
| 3 | NULL | NULL |
x------x-------------x--------------x
The query is below.
SELECT DISTINCT ID,DATES
FROM MYTABLE
CROSS APPLY(VALUES (FROMDATE),(TODATE))
COLUMNNAMES(DATES)
which brings you the result
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
| 3 | NULL |
x------x-------------x
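As a possible extension of the query above, the VALUES list can also carry a label saying which column each value came from, which is what UNPIVOT normally gives you. This is only a sketch: the table variable, the alias U and the column name SOURCECOLUMN are assumptions for illustration.

```sql
-- Hypothetical stand-in for MYTABLE so the example runs on its own
DECLARE @MYTABLE TABLE (ID int, FROMDATE date, TODATE date);
INSERT INTO @MYTABLE (ID, FROMDATE, TODATE) VALUES
    (1, '2014-01-11', '2014-01-13'),
    (3, NULL, NULL);

SELECT T.ID, U.SOURCECOLUMN, U.DATES
FROM @MYTABLE AS T
CROSS APPLY (VALUES ('FROMDATE', T.FROMDATE),
                    ('TODATE',   T.TODATE)) AS U (SOURCECOLUMN, DATES)
WHERE U.DATES IS NOT NULL;  -- optional filter: UNPIVOT itself omits NULL values
```

Dropping the WHERE clause keeps the NULL rows, as in the result shown above; that choice is one practical difference between the APPLY form and UNPIVOT.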
Answer by mtone
It seems to me that CROSS APPLY can fill a certain gap when working with calculated fields in complex/nested queries, and make them simpler and more readable.
Simple example: you have a DoB and you want to present multiple age-related fields that will also rely on other data sources (such as employment), like Age, AgeGroup, AgeAtHiring, MinimumRetirementDate, etc. for use in your end-user application (Excel PivotTables, for example).
Options are limited and rarely elegant:
- JOIN subqueries cannot introduce new values in the dataset based on data in the parent query (the subquery must stand on its own).
- UDFs are neat, but slow, as they tend to prevent parallel operations. And being a separate entity can be a good (less code) or a bad (where is the code?) thing.
- Junction tables. Sometimes they can work, but soon enough you're joining subqueries with tons of UNIONs. Big mess.
- Create yet another single-purpose view, assuming your calculations don't require data obtained mid-way through your main query.
- Intermediary tables. Yes... that usually works, and it is often a good option as they can be indexed and fast, but performance can also drop because UPDATE statements are not parallel and do not allow cascading formulas (reusing results) to update several fields within the same statement. And sometimes you'd just prefer to do things in one pass.
- Nesting queries. Yes, at any point you can put parentheses around your entire query and use it as a subquery upon which you can manipulate source data and calculated fields alike. But you can only do this so much before it gets ugly. Very ugly.
- Repeating code. What is the greatest value of 3 long (CASE...ELSE...END) statements? That's gonna be readable!
- Tell your clients to calculate the damn things themselves.
Did I miss something? Probably, so feel free to comment. But hey, CROSS APPLY is like a godsend in such situations: you just add a simple CROSS APPLY (select tbl.value + 1 as someFormula) as crossTbl and voilà! Your new field is now ready for use, practically as if it had always been there in your source data.
Values introduced through CROSS APPLY can...
- be used to create one or multiple calculated fields without adding performance, complexity or readability issues to the mix
- like with JOINs, several subsequent CROSS APPLY statements can refer to earlier ones:
CROSS APPLY (select crossTbl.someFormula + 1 as someMoreFormula) as crossTbl2
- be used in subsequent JOIN conditions
- As a bonus, there's the table-valued function aspect
Dang, there's nothing they can't do!
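To make the calculated-field idea above a bit more tangible, here is a small sketch of chained CROSS APPLY steps for the DoB example. The inline employee rows, the fixed reference date and the AgeGroup cut-off are all made up for illustration, and DATEDIFF(year, ...) only counts year boundaries, so the Age here is approximate.

```sql
SELECT e.Name, a.Age, g.AgeGroup
FROM (VALUES
        ('Ann', CAST('1980-05-17' AS date)),
        ('Bob', CAST('2001-11-02' AS date))
     ) AS e (Name, DoB)                       -- hypothetical source rows
CROSS APPLY (
    -- first calculated field; year-boundary difference, not an exact age
    SELECT DATEDIFF(year, e.DoB, '2024-01-01') AS Age
) AS a
CROSS APPLY (
    -- second calculated field reuses a.Age instead of repeating the formula
    SELECT CASE WHEN a.Age < 30 THEN 'Under 30' ELSE '30 and over' END AS AgeGroup
) AS g;
```

Each CROSS APPLY returns exactly one row per input row, so the row count does not change; the operator is used purely to name an expression once and reuse it further down.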
Answer by Chris
Cross apply works well with an XML field as well, if you wish to select node values in combination with other fields.
For example, if you have a table containing some XML:
<root> <subnode1> <some_node value="1" /> <some_node value="2" /> <some_node value="3" /> <some_node value="4" /> </subnode1> </root>
Using the query
SELECT
id as [xt_id]
,xmlfield.value('(/root/@attribute)[1]', 'varchar(50)') root_attribute_value
,node_attribute_value = [some_node].value('@value', 'int')
,lt.lt_name
FROM dbo.table_with_xml xt
CROSS APPLY xmlfield.nodes('/root/subnode1/some_node') as g ([some_node])
LEFT OUTER JOIN dbo.lookup_table lt
ON [some_node].value('@value', 'int') = lt.lt_id
Will return a result
xt_id root_attribute_value node_attribute_value lt_name
----------------------------------------------------------------------
1 test1 1 Benefits
1 test1 4 FINRPTCOMPANY
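For anyone who wants to try the XML case without the original tables, a cut-down sketch follows. The table variable and the attribute="test1" on the root element are assumptions (the sample XML above does not show that attribute, but the query reads it), and the lookup join is left out.

```sql
-- Hypothetical stand-in for dbo.table_with_xml
DECLARE @table_with_xml TABLE (id int, xmlfield xml);
INSERT INTO @table_with_xml (id, xmlfield) VALUES
    (1, N'<root attribute="test1"><subnode1><some_node value="1" /><some_node value="4" /></subnode1></root>');

SELECT
    xt.id AS xt_id,
    xt.xmlfield.value('(/root/@attribute)[1]', 'varchar(50)') AS root_attribute_value,
    g.some_node.value('@value', 'int') AS node_attribute_value
FROM @table_with_xml AS xt
-- nodes() shreds the XML of the current row into one row per some_node element
CROSS APPLY xt.xmlfield.nodes('/root/subnode1/some_node') AS g (some_node);
```

The point is that nodes() takes its input from the xt row on the left, which is exactly the kind of correlation APPLY allows.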
Answer by Apneal
This has already been answered very well technically, but let me give a concrete example of how it's extremely useful:
Let's say you have two tables, Customer and Order. Customers have many Orders.
I want to create a view that gives me details about customers and the most recent order they've made. With just JOINs, this would require some self-joins and aggregation, which isn't pretty. But with CROSS APPLY, it's super easy:
SELECT *
FROM Customer
CROSS APPLY (
SELECT TOP 1 *
FROM [Order] -- ORDER is a reserved word in T-SQL, so the table name must be bracketed
WHERE [Order].CustomerId = Customer.CustomerId
ORDER BY OrderDate DESC
) T
Answer by balaji dileep kumar
Cross apply can be used to replace subqueries where you need a column from the subquery.
Subquery:
select * from person p where
p.companyId in(select c.companyId from company c where c.companyname like '%yyy%')
Here I won't be able to select the columns of the company table, so I use cross apply:
select P.*,T.CompanyName
from Person p
cross apply (
select *
from Company C
where p.companyid = c.companyId and c.CompanyName like '%yyy%'
) T
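Worth noting, in line with the question's own observation: for this particular case a plain INNER JOIN also exposes the company columns, so CROSS APPLY is a matter of taste here rather than a necessity. A sketch with inlined sample rows (the CTE data and the '%XYZ%' pattern are assumptions so the statement runs on its own):

```sql
WITH Company (companyId, companyName) AS (
    SELECT * FROM (VALUES (1, 'ABC Company'), (2, 'XYZ Company')) AS v (companyId, companyName)
),
Person (personId, personName, companyId) AS (
    SELECT * FROM (VALUES (1, 'Alan', 1), (2, 'Xavier', 2)) AS v (personId, personName, companyId)
)
SELECT p.*, c.companyName
FROM Person p
INNER JOIN Company c
    ON c.companyId = p.companyId       -- same correlation as inside the CROSS APPLY
WHERE c.companyName LIKE '%XYZ%';      -- same filter, moved to the outer query
```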
Answer by shahkalpesh
I guess it should be readability ;)
CROSS APPLY stands out to people reading the query: it tells them that a UDF is being used, which will be applied to each row from the table on the left.
Of course, there are other situations, which others have posted above, where CROSS APPLY is a better fit than JOIN.
Answer by Shanid
Here is an article that explains it all, with their performance difference and usage over JOINs.
SQL Server CROSS APPLY and OUTER APPLY over JOINS
As suggested in this article, there is no performance difference between them for normal join operations (INNER and CROSS).
The usage difference arrives when you have to do a query like this:
CREATE FUNCTION dbo.fn_GetAllEmployeeOfADepartment(@DeptID AS INT)
RETURNS TABLE
AS
RETURN
(
SELECT * FROM Employee E
WHERE E.DepartmentID = @DeptID
)
GO
SELECT * FROM Department D
CROSS APPLY dbo.fn_GetAllEmployeeOfADepartment(D.DepartmentID)
That is, when you have to relate a table to a function. This cannot be done using INNER JOIN, which would give you the error "The multi-part identifier "D.DepartmentID" could not be bound." Here the value is passed to the function as each row is read. Sounds cool to me. :)
Answer by Raf
The essence of the APPLY operator is to allow correlation between the left and right sides of the operator in the FROM clause.
With JOIN, by contrast, correlation between the inputs is not allowed.
Speaking about correlation in the APPLY operator, I mean that on the right-hand side we can put:
- a derived table - as a correlated subquery with an alias
- a table-valued function - a conceptual view with parameters, where the parameter can refer to the left side
Both can return multiple columns and rows.