SQL 如何将一个表中的最新行连接到另一个表?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/497535/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 00:58:21  来源:igfitidea点击:

How do I join the most recent row in one table to another table?

sqldatejoingreatest-n-per-group

提问by GloryFish

I have data that looks like this:

我有看起来像这样的数据:

entities
id         name
1          Apple
2          Orange
3          Banana

Periodically, a process will run and give a score to each entity. The process generates the data and adds it to a scores table like so:

一个进程将定期运行并为每个实体打分。该过程生成数据并将其添加到分数表中,如下所示:

scores 
id  entity_id    score   date_added
1    1            10       1/2/09
2    2            10       1/2/09
3    1            15       1/3/09
4    2            10       1/03/09
5    1            15       1/4/09
6    2            15       1/4/09
7    3            22       1/4/09

I want to be able to select all of the entities along with the most recent recorded score for each resulting in some data like this:

我希望能够选择所有实体以及每个实体的最新记录分数,从而产生如下数据:

entities
id name     score  date_added
1  Apple     15     1/4/09
2  Orange    15     1/4/09
3  Banana    15     1/4/09

I can get the data for a single entity using this query:

我可以使用此查询获取单个实体的数据:

SELECT entities.*, 
       scores.score, 
       scores.date_added 
FROM entities

INNER  JOIN scores
ON entities.id = scores.entity_id

WHERE entities.id = ?

ORDER BY scores.date_added DESC
LIMIT 1

But I'm at a loss for how to select the same for all entities. Perhaps it's staring me in the face?

但是我不知道如何为所有实体选择相同的。也许它正盯着我的脸?

Thank you very kindly for taking the time.

非常感谢您抽出时间。

Thanks for the great responses. I'll give it a few days to see if a preferred solution bubbles up then I'll select the answer.

感谢您的精彩回复。我会给它几天时间,看看是否有首选的解决方案出现,然后我会选择答案。

UPDATE: I've tried out several of the proposed solutions, the main issue I'm facing now is that if an entity does not yet have a generated score they don't appear in the list.

更新:我已经尝试了几个建议的解决方案,我现在面临的主要问题是,如果一个实体还没有生成的分数,它们就不会出现在列表中。

What would the SQL look like to ensure that all entities are returned, even if they don't have any score posted yet?

确保返回所有实体的 SQL 是什么样的,即使它们尚未发布任何分数?

UPDATE: Answer selected. Thanks everyone!

更新:已选择答案。谢谢大家!

回答by Bill Karwin

I do it this way:

我这样做:

SELECT e.*, s1.score, s1.date_added 
FROM entities e
  INNER JOIN scores s1
    ON (e.id = s1.entity_id)
  LEFT OUTER JOIN scores s2
    ON (e.id = s2.entity_id AND s1.id < s2.id)
WHERE s2.id IS NULL;

回答by Ray Hidayat

Just to add my variation on it:

只是为了添加我的变化:

SELECT e.*, s1.score
FROM entities e
INNER JOIN score s1 ON e.id = s1.entity_id
WHERE NOT EXISTS (
    SELECT 1 FROM score s2 WHERE s2.id > s1.id
)

回答by Michael Buen

approach 1

方法一

SELECT entities.*, 
       scores.score, 
       scores.date_added 
FROM entities

INNER  JOIN scores
ON entities.id = scores.entity_id

WHERE scores.date_added = 
  (SELECT max(date_added) FROM scores where entity_id = entities.id)

回答by Michael Buen

approach 2

方法二

query cost relative to batch:

相对于批处理的查询成本:



SELECT entities.*, 
       scores.score, 
       scores.date_added 
FROM entities

INNER  JOIN scores
ON entities.id = scores.entity_id

inner join 
    (
    SELECT 
           entity_id, max(date_added) as recent_date
    FROM scores
    group by entity_id
    ) as y on entities.id = y.entity_id and scores.date_added = y.recent_date

回答by Manatherin

I know this is a old question, just thought I'd add a approach no-one has mentioned yet, Cross Applyor Outer Apply. These are available in SQL Server 2005 (the database type is not tagged in this question) Or higher

我知道这是一个老问题,只是想我会添加一种没有人提到的方法,Cross Apply或者Outer Apply. 这些在 SQL Server 2005 中可用(此问题中未标记数据库类型)或更高版本

Using the temporary tables

使用临时表

DECLARE @Entities TABLE(Id INT PRIMARY KEY, name NVARCHAR(MAX))
INSERT INTO @Entities
VALUES (1, 'Apple'), (2, 'Orange'), (3, 'Banana'), (4, 'Cherry')

DECLARE @Scores TABLE(Id INT PRIMARY KEY, Entity_Id INT, Score INT, Date_Added DATE)
INSERT INTO @Scores
VALUES (1,1,10,'2009-02-01'),
(2,2,10,'2009-02-01'),
(3,1,15,'2009-02-01'),
(4,2,10,'2009-03-01'),
(5,1,15,'2009-04-01'),
(6,2,15,'2009-04-01'),
(7,3,22,'2009-04-01')

You could use

你可以用

SELECT E.Id, E.name, S.Score, S.Date_Added 
FROM @Entities E
CROSS APPLY
(
    SELECT TOP 1 * 
    FROM @Scores Sc 
    WHERE Sc.Entity_Id = E.Id  
    ORDER BY sc.Score DESC
) AS S

to get the desired results. The equivilent to allow entities without scores would be

以获得所需的结果。允许没有分数的实体的等价物是

SELECT E.Id, E.name, S.Score, S.Date_Added 
FROM @Entities E
OUTER APPLY
(
    SELECT TOP 1 * 
    FROM @Scores Sc 
    WHERE Sc.Entity_Id = E.Id  
    ORDER BY sc.Score DESC
) AS S

回答by Cristi S.

You can also do this today in most RDBMSs (Oracle, PostgreSQL, SQL Server) with a natural query using window functions such as ROW_NUMBER:

现在,您还可以在大多数 RDBMS(Oracle、PostgreSQL、SQL Server)中使用 ROW_NUMBER 等窗口函数进行自然查询:

SELECT id, name, score, date_added FROM (
 SELECT e.id, e.name, s.score, s.date_added,
 ROW_NUMBER() OVER (PARTITION BY e.id ORDER BY s.date_added DESC) rn
 FROM Entities e INNER JOIN Scores s ON e.id = s.entity_id
) tmp WHERE rn = 1;

SQL Fiddle

SQL小提琴

回答by Otávio Décio

SELECT entities.*, 
       scores.score, 
       scores.date_added 
FROM entities

INNER  JOIN scores
ON entities.id = scores.entity_id

WHERE entities.id in 
(select id from scores s2 where date_added = max(date_added) and s2.id = entities.id)

ORDER BY scores.date_added DESC
LIMIT 1