Getting distinct rows from a left outer join
Original question: http://stackoverflow.com/questions/788984/
Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Asked by Nazgul
I am building an application that dynamically generates SQL to search for rows of a particular table (this is the main domain class, such as Employee).
There are three tables: Table1, Table2 and Table1Table2Map. Table1 has a many-to-many relationship with Table2, mapped through the Table1Table2Map table. But since Table1 is my main table, the relationship is effectively one-to-many.
My app generates a SQL query that returns a result set containing rows from all these tables. The select clause and the joins don't change, whereas the where clause is generated based on user interaction. In any case, I don't want duplicate rows of Table1 in my result set, as it is the main table for result display. Right now the generated query looks like this:
select distinct Table1.Id as Id, Table1.Name, Table2.Description
from Table1
left outer join Table1Table2Map on (Table1Table2Map.Table1Id = Table1.Id)
left outer join Table2 on (Table2.Id = Table1Table2Map.Table2Id)
For simplicity I have left out the where clause. The problem is that when there are multiple rows in Table2 for one row in Table1, the result set contains duplicate rows of Table1 even though I asked for distinct Table1.Id, because the query has to return all the matching rows in Table2.
To elaborate, consider a row in Table1 with Id = 1 that has two rows in Table1Table2Map, (1, 1) and (1, 2), mapping it to the two rows in Table2 with ids 1 and 2. The query above returns duplicate rows in this case. I want the query to return the Table1 row with Id 1 only once, because only one row in Table2 acts as the active value for the corresponding entry in Table1 (this information is in the mapping table). Is there a way I can avoid getting duplicate rows of Table1?
I think there is some basic problem in the way I am trying to solve the problem, but I am not able to find out what it is. Thanks in advance.
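The behaviour described in the question can be reproduced with a small in-memory database. This is only a sketch: it uses Python's sqlite3 module rather than the asker's (unstated) database, and the sample names and descriptions are made up for illustration. It shows that DISTINCT applies to the whole selected row, so two different Description values still produce two rows for the same Table1.Id.

```python
import sqlite3

# Hypothetical sample data matching the scenario in the question:
# Table1 row 1 is mapped to Table2 rows 1 and 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE Table1Table2Map (Table1Id INTEGER, Table2Id INTEGER);
    INSERT INTO Table1 VALUES (1, 'Alice');
    INSERT INTO Table2 VALUES (1, 'Old value'), (2, 'Current value');
    INSERT INTO Table1Table2Map VALUES (1, 1), (1, 2);
""")

# The query from the question, plus an ORDER BY for a stable row order.
rows = conn.execute("""
    SELECT DISTINCT Table1.Id AS Id, Table1.Name, Table2.Description
    FROM Table1
    LEFT OUTER JOIN Table1Table2Map ON Table1Table2Map.Table1Id = Table1.Id
    LEFT OUTER JOIN Table2 ON Table2.Id = Table1Table2Map.Table2Id
    ORDER BY Table2.Description
""").fetchall()

# Table1 row 1 appears once per distinct Description value.
for row in rows:
    print(row)
```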
Answered by Lasse V. Karlsen
Try:
left outer join (select distinct YOUR_COLUMNS_HERE ...) SUBQUERY_ALIAS on ...
In other words, don't join directly against the table; join against a sub-query that limits the rows you join against.
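One way this suggestion could look in practice is sketched below with Python's sqlite3 module. It assumes the mapping table carries an IsActive flag (the question only says the "active" information lives in the mapping table, so the column name is a guess), and the sample rows are hypothetical. The sub-query keeps only the active mapping, so each Table1 row joins to at most one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, Description TEXT);
    -- IsActive is an assumed column name, not taken from the question.
    CREATE TABLE Table1Table2Map (Table1Id INTEGER, Table2Id INTEGER, IsActive INTEGER);
    INSERT INTO Table1 VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Table2 VALUES (1, 'Old value'), (2, 'Current value');
    INSERT INTO Table1Table2Map VALUES (1, 1, 0), (1, 2, 1);
""")

# Join against a sub-query that already limits the rows to the active mapping.
rows = conn.execute("""
    SELECT Table1.Id, Table1.Name, Active.Description
    FROM Table1
    LEFT OUTER JOIN (
        SELECT m.Table1Id, Table2.Description
        FROM Table1Table2Map AS m
        JOIN Table2 ON Table2.Id = m.Table2Id
        WHERE m.IsActive = 1
    ) AS Active ON Active.Table1Id = Table1.Id
    ORDER BY Table1.Id
""").fetchall()

for row in rows:
    print(row)
```

Bob has no mapping rows at all, and the left join still keeps him in the result with a NULL description.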
Answered by kommradHomer
You can use GROUP BY on Table1.Id, and that will get rid of the extra rows. You won't need to worry about any mechanics on the join side.
I came up with this solution in a huge query, and it didn't affect the query time much.
NOTE: I'm answering this question 3 years after it was asked, but I believe this may help someone.
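A minimal sketch of the GROUP BY approach, using Python's sqlite3 module with made-up sample data. Some engines (MySQL in its permissive mode, SQLite) accept a bare Table2.Description next to GROUP BY Table1.Id, but many others require every selected column to be grouped or aggregated, so this sketch groups by the Table1 columns and wraps Description in MIN(). The choice of MIN() is arbitrary and is an assumption for illustration; it decides which Table2 value survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE Table1Table2Map (Table1Id INTEGER, Table2Id INTEGER);
    INSERT INTO Table1 VALUES (1, 'Alice');
    INSERT INTO Table2 VALUES (1, 'Old value'), (2, 'Current value');
    INSERT INTO Table1Table2Map VALUES (1, 1), (1, 2);
""")

# One output row per Table1 row; MIN() picks a single Description per group.
rows = conn.execute("""
    SELECT Table1.Id, Table1.Name, MIN(Table2.Description) AS Description
    FROM Table1
    LEFT OUTER JOIN Table1Table2Map ON Table1Table2Map.Table1Id = Table1.Id
    LEFT OUTER JOIN Table2 ON Table2.Id = Table1Table2Map.Table2Id
    GROUP BY Table1.Id, Table1.Name
    ORDER BY Table1.Id
""").fetchall()

for row in rows:
    print(row)
```

The aggregate removes the duplicate Table1 rows, but the other Table2 values for that row are no longer visible in the result.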
Answered by John Gibb
You can rewrite your left joins as outer applies, so that you can use a top 1 and an order by, as follows:
select Table1.Id as Id, Table1.Name, t2.Description
from Table1
outer apply (
    -- pick at most one active mapping row for the current Table1 row
    select top 1 *
    from Table1Table2Map
    where (Table1Table2Map.Table1Id = Table1.Id) and Table1Table2Map.IsActive = 1
    order by somethingCol
) t1t2
outer apply (
    -- look up the Table2 row chosen by the mapping above
    select top 1 *
    from Table2
    where (Table2.Id = t1t2.Table2Id)
) t2;
Note that an outer apply without a "top" or an "order by" is exactly equivalent to a left outer join; it just gives you a little more control. (A cross apply is equivalent to an inner join.)
You can also do something similar using the row_number() function:
select * from (
    select distinct Table1.Id as Id, Table1.Name, Table2.Description,
        rowNum = row_number() over ( partition by table1.id order by something )
    from Table1
    left outer join Table1Table2Map on (Table1Table2Map.Table1Id = Table1.Id)
    left outer join Table2 on (Table2.Id = Table1Table2Map.Table2Id)
) x
where rowNum = 1;
Most of this doesn't apply if the IsActive flag can narrow down your other tables to one row, but these techniques might come in useful for you.
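The row_number() idea can be tried outside SQL Server too. The sketch below uses Python's sqlite3 module and assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added. The IsActive column, the sample rows, and the ordering rule (active mapping first, then lowest Table2.Id) are stand-ins for the "something" placeholder in the answer, not part of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, Description TEXT);
    -- IsActive is an assumed column name.
    CREATE TABLE Table1Table2Map (Table1Id INTEGER, Table2Id INTEGER, IsActive INTEGER);
    INSERT INTO Table1 VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Table2 VALUES (1, 'Old value'), (2, 'Current value');
    INSERT INTO Table1Table2Map VALUES (1, 1, 0), (1, 2, 1);
""")

# Number the joined rows within each Table1.Id and keep only the first one.
rows = conn.execute("""
    SELECT Id, Name, Description
    FROM (
        SELECT Table1.Id AS Id, Table1.Name AS Name,
               Table2.Description AS Description,
               ROW_NUMBER() OVER (
                   PARTITION BY Table1.Id
                   ORDER BY m.IsActive DESC, Table2.Id
               ) AS rowNum
        FROM Table1
        LEFT OUTER JOIN Table1Table2Map AS m ON m.Table1Id = Table1.Id
        LEFT OUTER JOIN Table2 ON Table2.Id = m.Table2Id
    ) AS x
    WHERE rowNum = 1
    ORDER BY Id
""").fetchall()

for row in rows:
    print(row)
```

Because the numbering restarts for every Table1.Id, each Table1 row comes back exactly once, including rows with no mapping.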
Answered by cletus
To elaborate on one point: you said that there is only one "active" row in Table2 per row in Table1. Is that row not marked as active, such that you could put it in the where clause? Or is there some magic in the dynamic conditions supplied by the user that determines what's active and what isn't?
If you don't need to select anything from Table2, the solution is relatively simple in that you can use an EXISTS predicate, but since you've put Table2.Description in the select clause I'll assume that's not the case.
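For the case where nothing from Table2 needs to be displayed, an EXISTS filter might look like the sketch below (Python's sqlite3 module, hypothetical sample data). The LIKE condition stands in for whatever user-generated condition the application would apply to Table2; it is an assumption for illustration. Since Table2 only appears inside the sub-query, it cannot multiply the Table1 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE Table1Table2Map (Table1Id INTEGER, Table2Id INTEGER);
    INSERT INTO Table1 VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Table2 VALUES (1, 'Old value'), (2, 'Current value');
    INSERT INTO Table1Table2Map VALUES (1, 1), (1, 2);
""")

# Table2 is only tested for existence, never joined into the output rows.
rows = conn.execute("""
    SELECT Table1.Id, Table1.Name
    FROM Table1
    WHERE EXISTS (
        SELECT 1
        FROM Table1Table2Map AS m
        JOIN Table2 ON Table2.Id = m.Table2Id
        WHERE m.Table1Id = Table1.Id
          AND Table2.Description LIKE '%value%'  -- hypothetical user filter
    )
    ORDER BY Table1.Id
""").fetchall()

for row in rows:
    print(row)
```

Alice matches two Table2 rows but is returned once; Bob has no mapping and is filtered out.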
Basically what separates the relevant rows in Table2 from the irrelevant ones? Is it an active flag or a dynamic condition? The first row? That's really how you should be removing duplicates.
DISTINCT clauses tend to be overused. That may not be the case here but it sounds like it's possible that you're trying to hack out the results you want with DISTINCT rather than solving the real problem, which is a fairly common problem.
Answered by Arvo
You have to include the activity condition in your join (and there is no need for distinct):
select Table1.Id as Id, Table1.Name, Table2.Description
from Table1
left outer join Table1Table2Map on (Table1Table2Map.Table1Id = Table1.Id) and Table1Table2Map.IsActive = 1
left outer join Table2 on (Table2.Id = Table1Table2Map.Table2Id)
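Where the IsActive condition is placed matters with an outer join. The sketch below (Python's sqlite3 module, hypothetical sample data, IsActive assumed to be a 0/1 column on the mapping table) runs the query once with the condition in the join's ON clause, as in this answer, and once with it moved to the WHERE clause for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE Table1Table2Map (Table1Id INTEGER, Table2Id INTEGER, IsActive INTEGER);
    INSERT INTO Table1 VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Table2 VALUES (1, 'Old value'), (2, 'Current value');
    INSERT INTO Table1Table2Map VALUES (1, 1, 0), (1, 2, 1);
""")

# Condition in the ON clause: Table1 rows without an active mapping are kept.
on_rows = conn.execute("""
    SELECT Table1.Id, Table1.Name, Table2.Description
    FROM Table1
    LEFT OUTER JOIN Table1Table2Map AS m
        ON m.Table1Id = Table1.Id AND m.IsActive = 1
    LEFT OUTER JOIN Table2 ON Table2.Id = m.Table2Id
    ORDER BY Table1.Id
""").fetchall()

# Condition in the WHERE clause: rows whose mapping is NULL are filtered out.
where_rows = conn.execute("""
    SELECT Table1.Id, Table1.Name, Table2.Description
    FROM Table1
    LEFT OUTER JOIN Table1Table2Map AS m ON m.Table1Id = Table1.Id
    LEFT OUTER JOIN Table2 ON Table2.Id = m.Table2Id
    WHERE m.IsActive = 1
    ORDER BY Table1.Id
""").fetchall()

print(on_rows)
print(where_rows)
```

Both variants return Alice only once, but the WHERE variant drops Bob, who has no mapping, which effectively turns the outer join into an inner join.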
Answered by Nathan Koop
If you want to display multiple rows from table2, you will have duplicate data from table1 displayed. If you wanted to, you could use an aggregate function (e.g. Max, Min) on table2; this would eliminate the duplicate rows from table1, but would also hide some of the data from table2.
See also my answer on question #70161 for additional explanation.