How to avoid duplicates in a SQL query across three joined tables

Question

Asked by Chain

I'm getting duplicates when I do two LEFT JOINs to get to the "event_name" in my example below. I get 112 cases with it set up this way. However, if I get rid of the 2 LEFT JOIN lines and run the query, I get the proper 100 records without duplicates. I tried DISTINCT with the code below, but I still get 112 with duplicates.

SELECT "cases"."id", "cases"."date", "cases"."name", "event"."event_name" 
FROM "cases"
LEFT JOIN "middle_table" ON "cases"."serial" = "middle_table"."m_serial"
LEFT JOIN "event" ON "middle_table"."e_serial" = "event"."ev_serial"
WHERE "cases"."date" BETWEEN '2012-12-11' AND '2012-12-13'

How can I specify that I only want the exact 100 cases from "cases", and that I don't want anything from the tables in the joins to produce any more rows?

Thanks!

Answer 1

Accepted answer by AndreKR

You need to extend your ON clauses to include a condition so that for each entry in cases there is only one entry in middle_table that matches the condition, and for each entry in middle_table there is only one entry in event:

LEFT JOIN middle_table ON cases.serial = middle_table.m_serial AND some_condition

You can of course use DISTINCT. If that doesn't work, it means the rows differ in at least one of the fields cases.id, cases.date, cases.name and event.event_name. Examine the results, decide which of the entries you want to throw away, and include that condition in your ON clause.
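
To decide which condition to add, it can help to first see which cases match more than one row. Below is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the question, but the sample rows are made up for illustration, and your database may need slightly different syntax.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not the asker's real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases (id INTEGER, serial TEXT, name TEXT);
    CREATE TABLE middle_table (m_serial TEXT, e_serial TEXT);
    CREATE TABLE event (ev_serial TEXT, event_name TEXT);
    INSERT INTO cases VALUES (1, 'A', 'first'), (2, 'B', 'second');
    -- case 'A' is linked to two events, case 'B' to only one
    INSERT INTO middle_table VALUES ('A', 'E1'), ('A', 'E2'), ('B', 'E1');
    INSERT INTO event VALUES ('E1', 'opened'), ('E2', 'closed');
""")

# Count joined rows per case; any count above 1 is a source of duplicates.
multi = conn.execute("""
    SELECT c.id, COUNT(*) AS matches
    FROM cases c
    LEFT JOIN middle_table m ON c.serial = m.m_serial
    LEFT JOIN event e ON m.e_serial = e.ev_serial
    GROUP BY c.id
    HAVING COUNT(*) > 1
""").fetchall()

print(multi)  # [(1, 2)]: case 1 matches two rows
```

Looking at the rows behind each reported id should show what distinguishes the extra matches, and that is the condition to add to the ON clause.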

Answer 2

Answered by JohnLBevan

The issue is you have multiple matches in the tables you're left joining with. Effectively your code says:

select *
from parent
left outer join child on parent.id = child.parentId

If a parent has two children, you get both; so the parent appears twice.
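
A small sketch of that effect, using Python's built-in sqlite3 module with made-up parent and child rows (the table layout is an assumption for illustration). It also shows why DISTINCT does not help when the child values differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER, name TEXT);
    CREATE TABLE child (id INTEGER, parentId INTEGER, name TEXT);
    INSERT INTO parent VALUES (1, 'Ann'), (2, 'Bob');
    -- parent 1 has two children, parent 2 has none
    INSERT INTO child VALUES (10, 1, 'x'), (11, 1, 'y');
""")

query = """
    SELECT {distinct} p.id, p.name, c.name
    FROM parent p
    LEFT OUTER JOIN child c ON p.id = c.parentId
    ORDER BY p.id, c.name
"""

rows = conn.execute(query.format(distinct="")).fetchall()
distinct_rows = conn.execute(query.format(distinct="DISTINCT")).fetchall()

print(rows)                # [(1, 'Ann', 'x'), (1, 'Ann', 'y'), (2, 'Bob', None)]
print(len(distinct_rows))  # 3: the two rows for parent 1 differ in c.name
```

Two parents produce three rows, and DISTINCT keeps all three because no two rows are identical.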

If you want to get the parent only once, you need to compromise; you can't have both children. Either apply an aggregate function to the columns from the child table and group by the columns from the parent table, or use row_number() over (partition by list,of,parent,columns order by list,of,child,columns) r in an inner statement and where r = 1 in an outer statement, as below:

select p.id, p.name, max(c.id), max(c.name) --nb: child id and name may come from different records
from parent p
left outer join child c on p.id = c.parentId
group by p.id, p.name

or

select *
from 
(
    -- child columns are aliased so the derived table has no duplicate column names
    select p.id, p.name, c.id as child_id, c.name as child_name
    , row_number() over (partition by p.id order by c.id desc) r
    from parent p
    left outer join child c on p.id = c.parentId
) x
where x.r = 1

UPDATE

As mentioned in the comments, if the child data is exactly the same you can do this:

select p.id, p.name, c.name
from parent p
left outer join 
(
    select distinct parentId, name
    from child
) c on p.id = c.parentId

or (if a few fields are different but you don't care which you get)

select p.id, p.name, c.id, c.name
from parent p
left outer join 
(
    select max(id) id, parentId, name
    from child
    group by parentId, name
) c on p.id = c.parentId

Answer 3

Answered by Michael Durrant

The duplicates are the result of having multiple rows in "middle_table" and "event" for a single row in "cases". You can limit the result to unique combinations of values by using the "GROUP BY" keyword (which is usually used with aggregate functions, such as COUNT and SUM), as follows:

SELECT "cases"."id", "cases"."date", "cases"."name", "event"."event_name" 
FROM "cases"
LEFT JOIN "middle_table" ON "cases"."serial" = "middle_table"."m_serial"
LEFT JOIN "event" ON "middle_table"."e_serial" = "event"."ev_serial"
WHERE "cases"."date" BETWEEN '2012-12-11' AND '2012-12-13'
GROUP BY  "cases"."id", "cases"."date", "cases"."name", "event"."event_name" 
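
One caveat worth checking: grouping by every selected column returns the same rows as DISTINCT, so a case linked to two different event names still appears twice. The sketch below, using Python's built-in sqlite3 module with made-up sample rows, compares that query with a variant that groups only by the "cases" columns and picks one event name with MIN. Using MIN is an arbitrary choice made here for illustration, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases (id INTEGER, serial TEXT, date TEXT, name TEXT);
    CREATE TABLE middle_table (m_serial TEXT, e_serial TEXT);
    CREATE TABLE event (ev_serial TEXT, event_name TEXT);
    -- hypothetical rows: case 1 has two events, case 2 has one
    INSERT INTO cases VALUES (1, 'A', '2012-12-11', 'first'),
                             (2, 'B', '2012-12-12', 'second');
    INSERT INTO middle_table VALUES ('A', 'E1'), ('A', 'E2'), ('B', 'E1');
    INSERT INTO event VALUES ('E1', 'opened'), ('E2', 'closed');
""")

# Shared FROM/JOIN/WHERE part of both queries.
JOINS = """
    FROM cases
    LEFT JOIN middle_table ON cases.serial = middle_table.m_serial
    LEFT JOIN event ON middle_table.e_serial = event.ev_serial
    WHERE cases.date BETWEEN '2012-12-11' AND '2012-12-13'
"""

# Grouping by all selected columns: behaves like DISTINCT.
by_all = conn.execute(
    "SELECT cases.id, cases.date, cases.name, event.event_name" + JOINS +
    "GROUP BY cases.id, cases.date, cases.name, event.event_name"
).fetchall()

# Grouping by the cases columns only: exactly one row per case.
per_case = conn.execute(
    "SELECT cases.id, cases.date, cases.name, MIN(event.event_name)" + JOINS +
    "GROUP BY cases.id, cases.date, cases.name ORDER BY cases.id"
).fetchall()

print(len(by_all), len(per_case))  # 3 2
```

With this data the first query still returns three rows for two cases, while the second returns one row per case at the cost of dropping one of the event names.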
