How to do a PostgreSQL subquery in the SELECT clause with a join in the FROM clause, like SQL Server?
Notice: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/3004887/
Asked by Ricardo
I am trying to write the following query on PostgreSQL:
select name, author_id, count(1),
(select count(1)
from names as n2
where n2.id = n1.id
and n2.author_id = n1.author_id
)
from names as n1
group by name, author_id
This would certainly work on Microsoft SQL Server, but it does not work at all on PostgreSQL. I read the documentation a bit, and it seems I could rewrite it as:
select name, author_id, count(1), total
from names as n1, (select count(1) as total
from names as n2
where n2.id = n1.id
and n2.author_id = n1.author_id
) as total
group by name, author_id
But that returns the following error on PostgreSQL: "subquery in FROM cannot refer to other relations of same query level". So I'm stuck. Does anyone know how I can achieve that?
Thanks
Answered by Bob Jarvis - Reinstate Monica
I'm not sure I understand your intent perfectly, but perhaps the following would be close to what you want:
select n1.name, n1.author_id, count_1, total_count
from (select id, name, author_id, count(1) as count_1
from names
group by id, name, author_id) n1
inner join (select id, author_id, count(1) as total_count
from names
group by id, author_id) n2
on (n2.id = n1.id and n2.author_id = n1.author_id)
Unfortunately this adds the requirement of grouping the first subquery by id as well as name and author_id, which I don't think was wanted. I'm not sure how to work around that, though, as you need to have id available to join in the second subquery. Perhaps someone else will come up with a better solution.
Share and enjoy.
Answered by Ricardo
Here is the formatted version of the final SQL I needed, based on Bob Jarvis's answer, as posted in my comment above:
select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
from (select id, name, author_id, count(1) as count_1
from names
group by id, name, author_id) n1
inner join (select author_id, count(1) as total_count
from names
group by author_id) n2
on (n2.author_id = n1.author_id)
Answered by deFreitas
Complementing @Bob Jarvis's and @dmikam's answers: Postgres does not produce a good plan when you don't use LATERAL. In the simulation below, both queries return the same data, but their estimated costs are very different.
Table structure
CREATE TABLE ITEMS (
N INTEGER NOT NULL,
S TEXT NOT NULL
);
INSERT INTO ITEMS
SELECT
(random()*1000000)::integer AS n,
md5(random()::text) AS s
FROM
generate_series(1,1000000);
CREATE INDEX N_INDEX ON ITEMS(N);
Performing a JOIN with GROUP BY in a subquery without LATERAL
EXPLAIN
SELECT
I.*
FROM ITEMS I
INNER JOIN (
SELECT
COUNT(1), n
FROM ITEMS
GROUP BY N
) I2 ON I2.N = I.N
WHERE I.N IN (243477, 997947);
The results
Merge Join (cost=0.87..637500.40 rows=23 width=37)
Merge Cond: (i.n = items.n)
-> Index Scan using n_index on items i (cost=0.43..101.28 rows=23 width=37)
Index Cond: (n = ANY ('{243477,997947}'::integer[]))
-> GroupAggregate (cost=0.43..626631.11 rows=861418 width=12)
Group Key: items.n
-> Index Only Scan using n_index on items (cost=0.43..593016.93 rows=10000000 width=4)
Using LATERAL
EXPLAIN
SELECT
I.*
FROM ITEMS I
INNER JOIN LATERAL (
SELECT
COUNT(1), n
FROM ITEMS
WHERE N = I.N
GROUP BY N
) I2 ON 1=1 --I2.N = I.N
WHERE I.N IN (243477, 997947);
Results
Nested Loop (cost=9.49..1319.97 rows=276 width=37)
-> Bitmap Heap Scan on items i (cost=9.06..100.20 rows=23 width=37)
Recheck Cond: (n = ANY ('{243477,997947}'::integer[]))
-> Bitmap Index Scan on n_index (cost=0.00..9.05 rows=23 width=0)
Index Cond: (n = ANY ('{243477,997947}'::integer[]))
-> GroupAggregate (cost=0.43..52.79 rows=12 width=12)
Group Key: items.n
-> Index Only Scan using n_index on items (cost=0.43..52.64 rows=12 width=4)
Index Cond: (n = i.n)
My Postgres version is PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1)
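The plans above show the planner's estimated costs only. To compare what actually happens at run time, one option is to rerun each statement with EXPLAIN (ANALYZE, BUFFERS), which executes the query and reports real timings and buffer usage. A minimal sketch for the LATERAL variant, assuming the same ITEMS table and N_INDEX index created above (ON true is used here as an equivalent spelling of ON 1=1):

```sql
-- Executes the query and prints actual row counts, timings and buffer hits
-- next to the planner's estimates. ANALYZE really runs the statement.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    I.*
FROM ITEMS I
INNER JOIN LATERAL (
    SELECT
        COUNT(1), N
    FROM ITEMS
    WHERE N = I.N      -- correlated to the outer row, so the index on N can be used
    GROUP BY N
) I2 ON true
WHERE I.N IN (243477, 997947);
```

Running the non-LATERAL query the same way gives a like-for-like comparison of actual times rather than estimates.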
Answered by dmikam
I know this is old, but since PostgreSQL 9.3 you can use the LATERAL keyword to write correlated subqueries inside JOINs, so the query from the question would look like:
SELECT
name, author_id, count(*), t.total
FROM
names as n1
INNER JOIN LATERAL (
SELECT
count(*) as total
FROM
names as n2
WHERE
n2.id = n1.id
AND n2.author_id = n1.author_id
) as t ON 1=1
GROUP BY
n1.name, n1.author_id, t.total
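Because the LATERAL subquery is an aggregate without GROUP BY, it always returns exactly one row per outer row, so the ON 1=1 condition is only a placeholder. A sketch of the same idea written with CROSS JOIN LATERAL, assuming the names table from the question; it lists t.total in the GROUP BY because PostgreSQL requires every non-aggregated output column to appear there:

```sql
SELECT
    n1.name, n1.author_id, count(*), t.total
FROM
    names AS n1
    CROSS JOIN LATERAL (
        -- evaluated once per row of n1; may reference n1's columns
        SELECT count(*) AS total
        FROM names AS n2
        WHERE n2.id = n1.id
          AND n2.author_id = n1.author_id
    ) AS t
GROUP BY
    n1.name, n1.author_id, t.total;
```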
Answered by Zahid Gani
select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
from (select id, name, author_id, count(1) as count_1
from names
group by id, name, author_id) n1
inner join (select distinct(author_id), count(1) as total_count
from names) n2
on (n2.author_id = n1.author_id)
Where true
I used DISTINCT here because, when there are more inner joins, grouping inside each joined subquery makes performance slow.
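PostgreSQL rejects a select list that mixes a plain column with an aggregate when there is no GROUP BY, so the n2 subquery above may not run as written. One way to keep the DISTINCT idea, offered as a sketch and not as the answerer's tested code, is to compute the per-author count as a window function and then de-duplicate the rows:

```sql
select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
from (select id, name, author_id, count(1) as count_1
      from names
      group by id, name, author_id) n1
inner join (select distinct author_id,
                   -- window count: rows per author, repeated on every row,
                   -- then collapsed to one row per author by DISTINCT
                   count(1) over (partition by author_id) as total_count
            from names) n2
  on (n2.author_id = n1.author_id);
```

Whether this is faster than the GROUP BY version depends on the data and the plan, so it is worth checking with EXPLAIN before relying on it.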