Original URL: http://stackoverflow.com/questions/24042359/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
How to join only one row in joined table with postgres?
Asked by Benjamin Crouzier
I have the following schema:
CREATE TABLE author (
id integer
, name varchar(255)
);
CREATE TABLE book (
id integer
, author_id integer
, title varchar(255)
, rating integer
);
I want each author together with their last book:
SELECT book.id, author.id, author.name, book.title as last_book
FROM author
JOIN book book ON book.author_id = author.id
GROUP BY author.id
ORDER BY book.id ASC
Apparently you can do that in MySQL (see "Join two tables in MySQL, returning just one row from the second table").
But postgres gives this error:
ERROR: column "book.id" must appear in the GROUP BY clause or be used in an aggregate function: SELECT book.id, author.id, author.name, book.title as last_book FROM author JOIN book book ON book.author_id = author.id GROUP BY author.id ORDER BY book.id ASC
This is because:
When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.
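To illustrate the rule quoted above, here is a minimal sketch of a grouped query that Postgres does accept against the author/book schema from this question. Every selected column is either listed in GROUP BY or wrapped in an aggregate. The output aliases (author_id, last_book_id, book_count) are my own naming, not part of the original question.

```sql
-- Accepted by Postgres: each selected column is grouped or aggregated.
SELECT author.id    AS author_id
     , author.name
     , max(book.id) AS last_book_id  -- aggregate: one value per author
     , count(*)     AS book_count    -- aggregate: number of joined books
FROM author
JOIN book ON book.author_id = author.id
GROUP BY author.id, author.name
ORDER BY author.id;
```

Note that this yields only the id of the last book. The matching title cannot be fetched with max(book.title), because that would return the alphabetically greatest title rather than the title of the row with the highest id.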
How can I tell postgres: "Give me only the last row in the joined table, when ordered by joined_table.id"?
Edit: With this data:
INSERT INTO author (id, name) VALUES
(1, 'Bob')
, (2, 'David')
, (3, 'John');
INSERT INTO book (id, author_id, title, rating) VALUES
(1, 1, '1st book from bob', 5)
, (2, 1, '2nd book from bob', 6)
, (3, 1, '3rd book from bob', 7)
, (4, 2, '1st book from David', 6)
, (5, 2, '2nd book from David', 6);
I should see:
book_id  author_id  name     last_book
3        1          "Bob"    "3rd book from bob"
5        2          "David"  "2nd book from David"
Answered by Clodoaldo Neto
select distinct on (author.id)
book.id, author.id, author.name, book.title as last_book
from
author
inner join
book on book.author_id = author.id
order by author.id, book.id desc
See the documentation for DISTINCT ON:
SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
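As a hedged illustration of how ORDER BY decides which row counts as "first", the sketch below picks each author's best-rated book instead of the last one, using the rating column from the question's schema. The aliases book_id and best_book are my own, and breaking ties by the highest book id is an assumption about what is wanted.

```sql
SELECT DISTINCT ON (book.author_id)
       book.author_id
     , book.id    AS book_id
     , book.title AS best_book
     , book.rating
FROM book
ORDER BY book.author_id    -- must lead with the DISTINCT ON expression
       , book.rating DESC  -- highest rating comes first within each author
       , book.id DESC;     -- tie-breaker: the higher book id wins
```

Without the tie-breaker, two books with the same rating (such as David's in the sample data) could come back in either order.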
With DISTINCT ON, the DISTINCT ON expressions must come first in the ORDER BY. If that is not the final order you want, wrap the query in a subquery and reorder the result:
select
*
from (
select distinct on (author.id)
book.id, author.id, author.name, book.title as last_book
from
author
inner join
book on book.author_id = author.id
order by author.id, book.id desc
) authors_with_last_book
order by authors_with_last_book.name
Another solution is to use a window function, as in Lennart's answer. Another, very generic, solution is this:
select
book.id, author.id, author.name, book.title as last_book
from
book
inner join
(
select author.id as author_id, max(book.id) as book_id
from
author
inner join
book on author.id = book.author_id
group by author.id
) s
on s.book_id = book.id
inner join
author on book.author_id = author.id
Answered by wildplasser
This may look archaic and overly simple, but it does not depend on window functions, CTEs, or aggregating subqueries. In most cases it is also the fastest.
SELECT bk.id, au.id, au.name, bk.title AS last_book
FROM author au
JOIN book bk ON bk.author_id = au.id
WHERE NOT EXISTS (
    -- keep bk only if the same author has no book with a higher id
    SELECT *
    FROM book nx
    WHERE nx.author_id = bk.author_id
    AND nx.id > bk.id
)
ORDER BY bk.id ASC
;
Answered by Lennart
Here is one way:
SELECT book_id, author_id, author_name, last_book
FROM (
SELECT b.id as book_id
, a.id as author_id
, a.name as author_name
, b.title as last_book
, row_number() over (partition by a.id
order by b.id desc) as rn
FROM author a
JOIN book b
ON b.author_id = a.id
) last_books
WHERE rn = 1;
Answered by Mirthe
I've done something similar for a chat system, where the room table holds the metadata and the list table contains the messages. I ended up using PostgreSQL's LATERAL JOIN, which worked like a charm.
SELECT MR.id AS room_id, MR.created_at AS room_created,
lastmess.content as lastmessage_content, lastmess.datetime as lastmessage_when
FROM message.room MR
LEFT JOIN LATERAL (
SELECT content, datetime
FROM message.list
WHERE room_id = MR.id
ORDER BY datetime DESC
LIMIT 1) lastmess ON true
ORDER BY lastmessage_when DESC NULLS LAST, MR.created_at DESC
For more info see https://heap.io/blog/engineering/postgresqls-powerful-new-join-type-lateral
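For comparison, here is a sketch of the same LATERAL approach applied to the author/book schema from the question. The aliases a, b and lb are my own, and LATERAL requires PostgreSQL 9.3 or later.

```sql
SELECT lb.id    AS book_id
     , a.id     AS author_id
     , a.name
     , lb.title AS last_book
FROM author a
LEFT JOIN LATERAL (
    -- runs once per author row; a.id refers to the outer query
    SELECT b.id, b.title
    FROM book b
    WHERE b.author_id = a.id
    ORDER BY b.id DESC
    LIMIT 1
) lb ON true
ORDER BY a.id;
```

Because this is a LEFT JOIN, authors without any book (John in the sample data) are returned with NULL book columns. Writing JOIN LATERAL ... ON true instead drops those authors, which matches the output the question asks for.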
Answered by Taochok
You could add a condition to the join that selects only one row. This worked for me.
Like this:
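SELECT
book.id,
auth1.id,
auth1.name,
book.title as last_book
FROM author auth1
-- the subquery keeps only the book with the highest id for this author,
-- so each author joins to a single row and no GROUP BY is needed
JOIN book book ON (book.author_id = auth1.id AND book.id = (select max(b.id) from book b where b.author_id = auth1.id))
ORDER BY book.id ASC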
This way you get the data from the book with the highest ID. If the table had a date column, you could do the same with max(date).
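The max(date) idea could look roughly like the sketch below. The column published_on is hypothetical: the book table in the question has no date column, so this assumes one has been added and populated.

```sql
-- Assumes a hypothetical, populated column: book.published_on date
SELECT b.id, a.id, a.name, b.title AS last_book
FROM author a
JOIN book b
  ON b.author_id = a.id
 AND b.published_on = (
         -- latest publication date among this author's books
         SELECT max(b2.published_on)
         FROM book b2
         WHERE b2.author_id = a.id
     )
ORDER BY b.id ASC;
```

Unlike ids, dates are not necessarily unique, so an author with two books published on the same latest date would appear twice. Rows whose published_on is NULL never match the equality and are left out.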
Answered by jobermark
As a slight variation on @wildplasser's suggestion, which still works across implementations, you can use max rather than NOT EXISTS. This reads better if you prefer short joins to long WHERE clauses:
select *
from author au
join (
    -- highest book id per author
    select max(id) as max_id, author_id
    from book
    group by author_id) as lb
  on lb.author_id = au.id
join book bk
  on bk.id = lb.max_id;
Or, to give the subquery a name, which clarifies things, use WITH:
with last_book as
    -- highest book id per author
    (select max(id) as max_id, author_id
     from book
     group by author_id)
select *
from author au
join last_book lb
  on au.id = lb.author_id
join book bk
  on bk.id = lb.max_id;
Answered by Bobburi Madhu
-- number each author's books, highest id first
create temp table book_1 as
SELECT
id
,title
,author_id
,row_number() OVER (PARTITION BY author_id ORDER BY id DESC) as rownum
FROM
book;
-- rownum = 1 is each author's last book;
-- the left join also keeps authors without books (book columns are NULL)
select b.id, author.id, author.name, b.title as last_book
from
author
left join
(select * from book_1 where rownum = 1 ) b on b.author_id = author.id
order by author.id