Original source: http://stackoverflow.com/questions/809056/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
SQL SELECT: combining and grouping data between three tables using subqueries
Asked by tputkonen
Sorry for a long question and not a very descriptive title, but my problem is very difficult to explain briefly.
I have three database tables:
TABLE A:
AID PK
STATUS VARCHAR
TABLE B:
BID PK
AID FK
CID FK
TABLE C:
CID PK
CREATIONTIME DATE
For each STATUS = 'OK' row in table A I want to find the corresponding row in C which has the latest creation time.
First I can fetch all rows from table A where STATUS = 'OK'.
Next I can fetch all corresponding rows from table B.
But how do I continue from there?
For example:
select AID, CID from B where AID in (select AID from A where STATUS = 'OK')
could return something like:
AID, CID
1 1
2 2
2 3
3 4
4 5
4 6
Let's say that CID 2 has later creation time than CID 3 and CID 6 is newer than CID 5. This means that the correct result would be rows 1, 2, 4 and 6 in table C.
Is there a way to express this with a query?
EDIT: Sorry that I wasn't specific enough. What I want to get is the CIDs from table C.
EDIT:
I counted returned rows with the different solutions. Results were very interesting - and diversified:
HAINSTECH: 298 473 rows
JMUCCHIELLO: 298 473 rows
RUSS CAM: 290 121 rows
CHRIS: 344 093 rows
TYRANNOSAURS: 290 119 rows
I have not yet had the time to analyse returned rows in depth, but I'd really appreciate views on which of the queries are "broken" and why.
Answered by Russ Cam
Something like this, if I've understood you correctly
SELECT
MAX(CREATIONTIME),
A.AID
FROM
A
INNER JOIN
B
ON
A.AID = B.AID
INNER JOIN
C
ON
B.CID = C.CID
WHERE
A.STATUS = 'OK'
GROUP BY
A.AID
EDIT:
I have now checked the following in SQL Server (I would expect the same outcome in Oracle), and it returns the CID for the C record with the maximum CREATIONTIME where the STATUS of the related record in A is 'OK'.
SELECT C.CID
FROM
C C
INNER JOIN
B B
ON
C.CID = B.CID
INNER JOIN
(
SELECT
MAX(C.CREATIONTIME) CREATIONTIME,
A.AID
FROM
A A
INNER JOIN
B B
ON
A.AID = B.AID
INNER JOIN
C C
ON
B.CID = C.CID
WHERE
A.STATUS = 'OK'
GROUP BY
A.AID
) ABC
ON B.AID = ABC.AID
AND C.CREATIONTIME = ABC.CREATIONTIME
Demonstrated with the following T-SQL
DECLARE @A TABLE(AID INT IDENTITY(1,1), STATUS VARCHAR(10))
DECLARE @B TABLE(BID INT IDENTITY(1,1), AID INT, CID INT)
DECLARE @C TABLE(CID INT IDENTITY(1,1), CREATIONTIME DATETIME)
INSERT INTO @A VALUES ('OK')
INSERT INTO @A VALUES ('OK')
INSERT INTO @A VALUES ('NOT OK')
INSERT INTO @A VALUES ('OK')
INSERT INTO @A VALUES ('NOT OK')
INSERT INTO @C VALUES ('10 MAR 2008')
INSERT INTO @C VALUES ('13 MAR 2008')
INSERT INTO @C VALUES ('15 MAR 2008')
INSERT INTO @C VALUES ('17 MAR 2008')
INSERT INTO @C VALUES ('21 MAR 2008')
INSERT INTO @B VALUES (1,1)
INSERT INTO @B VALUES (1,2)
INSERT INTO @B VALUES (1,3)
INSERT INTO @B VALUES (2,2)
INSERT INTO @B VALUES (2,3)
INSERT INTO @B VALUES (2,4)
INSERT INTO @B VALUES (3,3)
INSERT INTO @B VALUES (3,4)
INSERT INTO @B VALUES (3,5)
INSERT INTO @B VALUES (4,5)
INSERT INTO @B VALUES (4,1)
INSERT INTO @B VALUES (4,2)
SELECT C.CID
FROM
@C C
INNER JOIN
@B B
ON
C.CID = B.CID
INNER JOIN
(
SELECT
MAX(C.CREATIONTIME) CREATIONTIME,
A.AID
FROM
@A A
INNER JOIN
@B B
ON
A.AID = B.AID
INNER JOIN
@C C
ON
B.CID = C.CID
WHERE
A.STATUS = 'OK'
GROUP BY
A.AID
) ABC
ON B.AID = ABC.AID
AND C.CREATIONTIME = ABC.CREATIONTIME
Results in the following
CID
-----------
3
4
5
EDIT 2:
In response to your comment about each of the statements giving different results, I have run some of the different answers here through SQL Server 2005 using my test data above (I appreciate you are using Oracle). Here are the results:
--Expected results for CIDs would be
--CID
-----------
--3
--4
--5
--As indicated in the comments next to the insert statements
DECLARE @A TABLE(AID INT IDENTITY(1,1), STATUS VARCHAR(10))
DECLARE @B TABLE(BID INT IDENTITY(1,1), AID INT, CID INT)
DECLARE @C TABLE(CID INT IDENTITY(1,1), CREATIONTIME DATETIME)
INSERT INTO @A VALUES ('OK') -- AID 1
INSERT INTO @A VALUES ('OK') -- AID 2
INSERT INTO @A VALUES ('NOT OK')
INSERT INTO @A VALUES ('OK') -- AID 4
INSERT INTO @A VALUES ('NOT OK')
INSERT INTO @C VALUES ('10 MAR 2008')
INSERT INTO @C VALUES ('13 MAR 2008')
INSERT INTO @C VALUES ('15 MAR 2008')
INSERT INTO @C VALUES ('17 MAR 2008')
INSERT INTO @C VALUES ('21 MAR 2008')
INSERT INTO @B VALUES (1,1)
INSERT INTO @B VALUES (1,2)
INSERT INTO @B VALUES (1,3) -- Will be CID 3 For AID 1
INSERT INTO @B VALUES (2,2)
INSERT INTO @B VALUES (2,3)
INSERT INTO @B VALUES (2,4) -- Will be CID 4 For AID 2
INSERT INTO @B VALUES (3,3)
INSERT INTO @B VALUES (3,4)
INSERT INTO @B VALUES (3,5)
INSERT INTO @B VALUES (4,5) -- Will be CID 5 FOR AID 4
INSERT INTO @B VALUES (4,1)
INSERT INTO @B VALUES (4,2)
-- Russ Cam
SELECT C.CID, ABC.CREATIONTIME
FROM
@C C
INNER JOIN
@B B
ON
C.CID = B.CID
INNER JOIN
(
SELECT
MAX(C.CREATIONTIME) CREATIONTIME,
A.AID
FROM
@A A
INNER JOIN
@B B
ON
A.AID = B.AID
INNER JOIN
@C C
ON
B.CID = C.CID
WHERE
A.STATUS = 'OK'
GROUP BY
A.AID
) ABC
ON B.AID = ABC.AID
AND C.CREATIONTIME = ABC.CREATIONTIME
-- Tyrannosaurs
select A.AID,
max(AggC.CREATIONTIME)
from @A A,
@B B,
( select C.CID,
max(C.CREATIONTIME) CREATIONTIME
from @C C
group by CID
) AggC
where A.AID = B.AID
and B.CID = AggC.CID
and A.Status = 'OK'
group by A.AID
-- jmucchiello
SELECT c.cid, max(c.creationtime)
FROM @B b, @C c
WHERE b.cid = c.cid
AND b.aid IN (SELECT a.aid FROM @A a WHERE status = 'OK')
GROUP BY c.cid
-- hainstech
SELECT agg.aid, agg.cid
FROM (
SELECT a.aid
,c.cid
,max(c.creationtime) as maxcCreationTime
FROM @C c INNER JOIN @B b ON b.cid = c.cid
INNER JOIN @A a on a.aid = b.aid
WHERE a.status = 'OK'
GROUP BY a.aid, c.cid
) as agg
--chris
SELECT A.AID, C.CID, C.CREATIONTIME
FROM @A A, @B B, @C C
WHERE A.STATUS = 'OK'
AND A.AID = B.AID
AND B.CID = C.CID
AND C.CREATIONTIME =
(SELECT MAX(C2.CREATIONTIME)
FROM @C C2, @B B2
WHERE B2.AID = A.AID
AND C2.CID = B2.CID);
The results are as follows:
--Russ Cam - Correct CIDs (I have added in the CREATIONTIME for reference)
CID CREATIONTIME
----------- -----------------------
3 2008-03-15 00:00:00.000
4 2008-03-17 00:00:00.000
5 2008-03-21 00:00:00.000
--Tyrannosaurs - No CIDs in the resultset
AID
----------- -----------------------
1 2008-03-15 00:00:00.000
2 2008-03-17 00:00:00.000
4 2008-03-21 00:00:00.000
--jmucchiello - Incorrect CIDs in the resultset
cid
----------- -----------------------
1 2008-03-10 00:00:00.000
2 2008-03-13 00:00:00.000
3 2008-03-15 00:00:00.000
4 2008-03-17 00:00:00.000
5 2008-03-21 00:00:00.000
--hainstech - Too many CIDs in the resultset, which CID has the MAX(CREATIONTIME) for each AID?
aid cid
----------- -----------
1 1
1 2
1 3
2 2
2 3
2 4
4 1
4 2
4 5
--chris - Correct CIDs, it is the same SQL as mine
AID CID CREATIONTIME
----------- ----------- -----------------------
1 3 2008-03-15 00:00:00.000
2 4 2008-03-17 00:00:00.000
4 5 2008-03-21 00:00:00.000
I would recommend running each of the given answers against a smaller number of records, so that you can ascertain whether the resultset returned is the expected one.
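As a rough sketch of that advice, the harness below loads the small test data set from this answer into an in-memory SQLite database and runs two of the candidate queries against it. SQLite and Python are stand-ins chosen for portability (an assumption, not what the answers were tested on). IDs are inserted explicitly instead of via IDENTITY, dates are stored as ISO strings, and the names `queries` and `cids_by_query` are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server/Oracle (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (AID INTEGER PRIMARY KEY, STATUS TEXT);
    CREATE TABLE C (CID INTEGER PRIMARY KEY, CREATIONTIME TEXT);
    CREATE TABLE B (BID INTEGER PRIMARY KEY, AID INTEGER, CID INTEGER);
""")
conn.executemany("INSERT INTO A VALUES (?, ?)",
                 [(1, "OK"), (2, "OK"), (3, "NOT OK"), (4, "OK"), (5, "NOT OK")])
conn.executemany("INSERT INTO C VALUES (?, ?)",
                 [(1, "2008-03-10"), (2, "2008-03-13"), (3, "2008-03-15"),
                  (4, "2008-03-17"), (5, "2008-03-21")])
conn.executemany("INSERT INTO B (AID, CID) VALUES (?, ?)",
                 [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (2, 4),
                  (3, 3), (3, 4), (3, 5), (4, 5), (4, 1), (4, 2)])

queries = {
    # Join back on the per-AID maximum creation time.
    "join on per-AID max": """
        SELECT C.CID
        FROM C
        INNER JOIN B ON C.CID = B.CID
        INNER JOIN (
            SELECT MAX(C.CREATIONTIME) AS CREATIONTIME, A.AID AS AID
            FROM A
            INNER JOIN B ON A.AID = B.AID
            INNER JOIN C ON B.CID = C.CID
            WHERE A.STATUS = 'OK'
            GROUP BY A.AID
        ) ABC ON B.AID = ABC.AID AND C.CREATIONTIME = ABC.CREATIONTIME
        ORDER BY C.CID
    """,
    # Grouping by (aid, cid) keeps every pair, so MAX never narrows anything.
    "group by aid, cid": """
        SELECT agg.cid
        FROM (
            SELECT A.AID AS aid, C.CID AS cid, MAX(C.CREATIONTIME) AS maxtime
            FROM C
            INNER JOIN B ON B.CID = C.CID
            INNER JOIN A ON A.AID = B.AID
            WHERE A.STATUS = 'OK'
            GROUP BY A.AID, C.CID
        ) agg
        ORDER BY agg.cid
    """,
}

expected = [3, 4, 5]  # the CIDs marked in the comments next to the inserts
cids_by_query = {}
for name, sql in queries.items():
    cids = [row[0] for row in conn.execute(sql)]
    cids_by_query[name] = cids
    verdict = "matches" if cids == expected else "differs from"
    print(f"{name}: {cids} ({verdict} expected)")
```

With only a dozen rows in B, a query that returns more (or different) rows than the three expected CIDs is easy to spot, which is much harder with a few hundred thousand rows.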
Answered by Jon Hopkins
EDIT: My previous answer was nonsense. This is now a complete rewrite
This is actually a problem which has bugged me throughout my SQL life. The solution I'm going to give you is messy as hell but it works and I'd appreciate anyone either saying "yes this is messy as hell but it's the only way to do it" or say "no, do this...".
I think the unease comes from joining two dates. The way it happens here it's not an issue as they will be an exact match (they have exactly the same root data) but it still feels wrong...
Anyway, breaking this down, you need to do this in two stages.
1) The first is to return a result set [AID], [latest CreationTime], giving you the latest creation time for each AID.
2) You can then use LatestCreationTime to pull the CID you want.
So for part (1), I'd personally create a view to do it just to keep things neat. It allows you to test this part and get it working before you merge it with the other stuff.
create view LatestCreationTimes
as
select b.AID,
max(c.CreationTime) LatestCreationTime
from TableB b,
TableC c
where b.CID = c.CID
group by b.AID
Note, we've not taken into account the status at this point.
You then need to join that to TableA (to get the status) and TableB and TableC (to get the CID). You need to do all the obvious links (AID, CID) and also join the LatestCreationTime column in the view to the CreationTime column in TableC. Also don't forget to join the view on AID; otherwise, where two records have been created at the same time for different A records, you'll get issues.
select A.AID,
C.CID
from TableA a,
TableB b,
TableC c,
LatestCreationTimes lct
where a.AID = b.AID
and b.CID = c.CID
and a.AID = lct.AID
and c.CreationTime = lct.LatestCreationTime
and a.STATUS = 'OK'
I'm certain that works - I've tested it, tweaked data, retested it and it behaves. At least it does what I believe it's meant to do.
It doesn't, however, deal with the possibility of two identical CreationTimes in table C for the same record. I'm guessing that this shouldn't happen; however, unless you've written something that absolutely constrains it, it needs to be accounted for.
To do this I need to make an assumption about which one you'd prefer. In this case I'm going to say that if there are two CIDs which match, you'd rather have the higher one (it's most likely more up to date).
select A.AID,
max(C.CID) CID
from TableA a,
TableB b,
TableC c,
LatestCreationTimes lct
where a.AID = b.AID
and b.CID = c.CID
and a.AID = lct.AID
and c.CreationTime = lct.LatestCreationTime
and a.STATUS = 'OK'
group by A.AID
And that, I believe should work for you. If you want it as one query rather than with the view then:
select A.AID,
max(C.CID) CID
from TableA a,
TableB b,
TableC c,
(select b.AID,
max(c.CreationTime) LatestCreationTime
from TableB b,
TableC c
where b.CID = c.CID
group by b.AID) lct
where a.AID = b.AID
and b.CID = c.CID
and a.AID = lct.AID
and c.CreationTime = lct.LatestCreationTime
and a.STATUS = 'OK'
group by A.AID
(I've just embedded the view in the query; otherwise the principle is exactly the same.)
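To make the tie case concrete, here is a minimal sketch that builds the view and runs both variants (with and without the `max(C.CID)` tie-break) against a tiny data set in which two CIDs for the same AID share a creation time. It uses Python with an in-memory SQLite database purely as a convenient stand-in for Oracle; the sample rows and the variable names `plain_rows` and `tie_broken_rows` are invented for illustration.

```python
import sqlite3

# SQLite is an assumed stand-in for Oracle; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (AID INTEGER PRIMARY KEY, STATUS TEXT);
    CREATE TABLE TableC (CID INTEGER PRIMARY KEY, CreationTime TEXT);
    CREATE TABLE TableB (BID INTEGER PRIMARY KEY, AID INTEGER, CID INTEGER);

    INSERT INTO TableA VALUES (1, 'OK'), (2, 'OK');
    -- CIDs 10 and 11 share the same CreationTime: this is the tie case.
    INSERT INTO TableC VALUES (10, '2009-04-30'), (11, '2009-04-30'),
                              (12, '2009-05-01');
    INSERT INTO TableB VALUES (1, 1, 10), (2, 1, 11), (3, 2, 12);

    CREATE VIEW LatestCreationTimes AS
        SELECT b.AID AS AID, MAX(c.CreationTime) AS LatestCreationTime
        FROM TableB b, TableC c
        WHERE b.CID = c.CID
        GROUP BY b.AID;
""")

# Without a tie-break, both CIDs that share the latest time come back.
plain_rows = conn.execute("""
    SELECT a.AID, c.CID
    FROM TableA a, TableB b, TableC c, LatestCreationTimes lct
    WHERE a.AID = b.AID
      AND b.CID = c.CID
      AND a.AID = lct.AID
      AND c.CreationTime = lct.LatestCreationTime
      AND a.STATUS = 'OK'
    ORDER BY a.AID, c.CID
""").fetchall()

# With MAX(c.CID) and GROUP BY, each AID yields exactly one CID.
tie_broken_rows = conn.execute("""
    SELECT a.AID, MAX(c.CID) AS CID
    FROM TableA a, TableB b, TableC c, LatestCreationTimes lct
    WHERE a.AID = b.AID
      AND b.CID = c.CID
      AND a.AID = lct.AID
      AND c.CreationTime = lct.LatestCreationTime
      AND a.STATUS = 'OK'
    GROUP BY a.AID
    ORDER BY a.AID
""").fetchall()

print(plain_rows)       # AID 1 appears twice because of the tie
print(tie_broken_rows)  # one row per AID
```

Picking the higher CID is only the assumption stated in this answer; if a different rule decides which of the tied rows is "latest", the aggregate in the select list is the place to change it.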
Answered by Rob van Wijk
SQL> create table a (aid,status)
2 as
3 select 1, 'OK' from dual union all
4 select 2, 'OK' from dual union all
5 select 3, 'OK' from dual union all
6 select 4, 'OK' from dual union all
7 select 5, 'NOK' from dual
8 /
Table created.
SQL> create table c (cid,creationtime)
2 as
3 select 1, sysdate - 1 from dual union all
4 select 2, sysdate - 2 from dual union all
5 select 3, sysdate - 3 from dual union all
6 select 4, sysdate - 4 from dual union all
7 select 5, sysdate - 6 from dual union all
8 select 6, sysdate - 5 from dual
9 /
Table created.
SQL> create table b (bid,aid,cid)
2 as
3 select 1, 1, 1 from dual union all
4 select 2, 2, 2 from dual union all
5 select 3, 2, 3 from dual union all
6 select 4, 3, 4 from dual union all
7 select 5, 4, 5 from dual union all
8 select 6, 4, 6 from dual union all
9 select 7, 5, 6 from dual
10 /
Table created.
SQL> select a.aid
2 , max(c.cid) keep (dense_rank last order by c.creationtime) cid
3 , max(c.creationtime) creationtime
4 from a
5 , b
6 , c
7 where b.aid = a.aid
8 and b.cid = c.cid
9 and a.status = 'OK'
10 group by a.aid
11 /
AID CID CREATIONTIME
---------- ---------- -------------------
1 1 30-04-2009 09:26:00
2 2 29-04-2009 09:26:00
3 4 27-04-2009 09:26:00
4 6 26-04-2009 09:26:00
4 rows selected.
Answered by Chris
Select the field you are looking for using a join of all three tables, and then limit the results to the rows where the CREATIONTIME is the most recent for that AID.
SELECT A.AID, C.CID, C.CREATIONTIME
FROM A A, B B, C C
WHERE A.STATUS = 'OK'
AND A.AID = B.AID
AND B.CID = C.CID
AND C.CREATIONTIME =
(SELECT MAX(C2.CREATIONTIME)
FROM C C2, B B2
WHERE B2.AID = A.AID
AND C2.CID = B2.CID);
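As a quick check, the sketch below runs this correlated-subquery approach against the sample data from the question (AID/CID pairs 1-1, 2-2, 2-3, 3-4, 4-5, 4-6, with CID 2 later than CID 3 and CID 6 newer than CID 5). It uses Python with an in-memory SQLite database as an assumed stand-in for Oracle; the concrete dates and the name `latest_cids` are invented for illustration.

```python
import sqlite3

# SQLite is an assumed stand-in for Oracle; the dates are made up but keep
# the ordering described in the question (CID 2 after CID 3, CID 6 after CID 5).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (AID INTEGER PRIMARY KEY, STATUS TEXT);
    CREATE TABLE C (CID INTEGER PRIMARY KEY, CREATIONTIME TEXT);
    CREATE TABLE B (BID INTEGER PRIMARY KEY, AID INTEGER, CID INTEGER);
""")
conn.executemany("INSERT INTO A VALUES (?, ?)",
                 [(1, "OK"), (2, "OK"), (3, "OK"), (4, "OK"), (5, "NOK")])
conn.executemany("INSERT INTO C VALUES (?, ?)",
                 [(1, "2009-04-30"), (2, "2009-04-29"), (3, "2009-04-28"),
                  (4, "2009-04-27"), (5, "2009-04-25"), (6, "2009-04-26")])
conn.executemany("INSERT INTO B VALUES (?, ?, ?)",
                 [(1, 1, 1), (2, 2, 2), (3, 2, 3), (4, 3, 4),
                  (5, 4, 5), (6, 4, 6), (7, 5, 6)])

# For each OK row in A, keep only the C row whose creation time equals
# the maximum creation time among the C rows linked to that same AID.
rows = conn.execute("""
    SELECT A.AID, C.CID, C.CREATIONTIME
    FROM A, B, C
    WHERE A.STATUS = 'OK'
      AND A.AID = B.AID
      AND B.CID = C.CID
      AND C.CREATIONTIME =
          (SELECT MAX(C2.CREATIONTIME)
           FROM C C2, B B2
           WHERE B2.AID = A.AID
             AND C2.CID = B2.CID)
    ORDER BY A.AID
""").fetchall()

latest_cids = [cid for _aid, cid, _created in rows]
print(latest_cids)  # the question expects CIDs 1, 2, 4 and 6
```

Note that if two C rows for the same AID share the maximum creation time, this form returns both of them.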
Answered by jmucchiello
Am I missing something? What is wrong with:
EDIT: Okay, I see you actually want to group by aid.
SELECT c.cid FROM b, c,
(SELECT b.aid as aid, max(c.creationtime) as creationtime
FROM b, c
WHERE b.cid = c.cid
AND b.aid IN (SELECT a.aid FROM a WHERE status = 'OK')
GROUP BY b.aid) z
WHERE b.cid = c.cid
AND z.aid = b.aid
AND z.creationtime = c.creationtime
Answered by ahains
There is no need for a subquery, the aggregation to determine the latest cid creation time is straightforward:
SELECT a.aid
,c.cid
,max(c.creationtime) as maxcCreationTime
FROM c INNER JOIN b ON b.cid = c.cid
INNER JOIN a on a.aid = b.aid
WHERE a.status = 'OK'
GROUP BY a.aid, c.cid
If you really don't want the creationtime in your row set, you can just wrap it in a subquery and drop it from the projection:
SELECT agg.aid, agg.cid
FROM (
SELECT a.aid
,c.cid
,max(c.creationtime) as maxcCreationTime
FROM c INNER JOIN b ON b.cid = c.cid
INNER JOIN a on a.aid = b.aid
WHERE a.status = 'OK'
GROUP BY a.aid, c.cid
) agg
I'm coding this in the web page, so please excuse any syntax mistakes. Also, I'm an MSSQL guy, so I hope nothing is different in the Oracle world for this.
Note that the schema you have provided does not enforce uniqueness of CREATIONTIME per cid. If there are ever two cid values that map to a given aid value with the same creationtime, both will be output. If you rely on the pair (cid, creationtime) being unique, you should enforce it declaratively with a constraint.