How to do a PostgreSQL subquery in the SELECT clause with a join in the FROM clause, like SQL Server?

Question

Asked by Ricardo

I am trying to write the following query on PostgreSQL:

select name, author_id, count(1), 
    (select count(1)
    from names as n2
    where n2.id = n1.id
        and n2.author_id = n1.author_id
    )               
from names as n1
group by name, author_id

This would certainly work on Microsoft SQL Server, but it does not work at all on PostgreSQL. I read the documentation a bit, and it seems I could rewrite it as:

select name, author_id, count(1), total                     
from names as n1, (select count(1) as total
    from names as n2
    where n2.id = n1.id
        and n2.author_id = n1.author_id
    ) as total
group by name, author_id

But that returns the following error on PostgreSQL: "subquery in FROM cannot refer to other relations of same query level". So I'm stuck. Does anyone know how I can achieve this?

Thanks

Answer 1

Answered by Bob Jarvis - Reinstate Monica

I'm not sure I understand your intent perfectly, but perhaps the following would be close to what you want:

select n1.name, n1.author_id, count_1, total_count
  from (select id, name, author_id, count(1) as count_1
          from names
          group by id, name, author_id) n1
inner join (select id, author_id, count(1) as total_count
              from names
              group by id, author_id) n2
  on (n2.id = n1.id and n2.author_id = n1.author_id)

Unfortunately this adds the requirement of grouping the first subquery by id as well as name and author_id, which I don't think was wanted. I'm not sure how to work around that, though, as you need to have id available to join in the second subquery. Perhaps someone else will come up with a better solution.

Share and enjoy.

Answer 2

Answered by Ricardo

I am just posting the formatted version of the final SQL I needed, based on Bob Jarvis's answer, as mentioned in my comment above:

select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
  from (select id, name, author_id, count(1) as count_1
          from names
          group by id, name, author_id) n1
inner join (select author_id, count(1) as total_count
              from names
              group by author_id) n2
  on (n2.author_id = n1.author_id)
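
To see what this query computes, here is a minimal sketch that runs the same join against a few inline rows. The sample rows and the `share` alias are hypothetical and only illustrate the shape of the result; the column names follow the question.

```sql
-- Hypothetical sample data standing in for the real "names" table.
WITH names (id, name, author_id) AS (
    VALUES (1, 'alpha', 10),
           (1, 'alpha', 10),
           (2, 'beta',  10),
           (3, 'gamma', 20)
)
-- For each (id, name, author_id) group, divide its row count
-- by the total number of rows belonging to the same author.
SELECT n1.name, n1.author_id,
       cast(n1.count_1 AS numeric) / n2.total_count AS share
  FROM (SELECT id, name, author_id, count(*) AS count_1
          FROM names
         GROUP BY id, name, author_id) n1
 INNER JOIN (SELECT author_id, count(*) AS total_count
               FROM names
              GROUP BY author_id) n2
    ON n2.author_id = n1.author_id
 ORDER BY n1.author_id, n1.name;
```

With these rows, 'alpha' accounts for two of author 10's three rows, 'beta' for one of three, and 'gamma' for the only row of author 20.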

Answer 3

Answered by deFreitas

Complementing @Bob Jarvis's and @dmikam's answers: Postgres does not produce a good plan when you don't use LATERAL. Below is a simulation. In both cases the query results are the same, but the costs are very different.

Table structure

CREATE TABLE ITEMS (
    N INTEGER NOT NULL,
    S TEXT NOT NULL
);

INSERT INTO ITEMS
  SELECT
    (random()*1000000)::integer AS n,
    md5(random()::text) AS s
  FROM
    generate_series(1,1000000);

CREATE INDEX N_INDEX ON ITEMS(N);

Performing a JOIN with GROUP BY in a subquery, without LATERAL

EXPLAIN 
SELECT 
    I.*
FROM ITEMS I
INNER JOIN (
    SELECT 
        COUNT(1), n
    FROM ITEMS
    GROUP BY N
) I2 ON I2.N = I.N
WHERE I.N IN (243477, 997947);

The results

Merge Join  (cost=0.87..637500.40 rows=23 width=37)
  Merge Cond: (i.n = items.n)
  ->  Index Scan using n_index on items i  (cost=0.43..101.28 rows=23 width=37)
        Index Cond: (n = ANY ('{243477,997947}'::integer[]))
  ->  GroupAggregate  (cost=0.43..626631.11 rows=861418 width=12)
        Group Key: items.n
        ->  Index Only Scan using n_index on items  (cost=0.43..593016.93 rows=10000000 width=4)

Using LATERAL

EXPLAIN 
SELECT 
    I.*
FROM ITEMS I
INNER JOIN LATERAL (
    SELECT 
        COUNT(1), n
    FROM ITEMS
    WHERE N = I.N
    GROUP BY N
) I2 ON 1=1 --I2.N = I.N
WHERE I.N IN (243477, 997947);

Results

Nested Loop  (cost=9.49..1319.97 rows=276 width=37)
  ->  Bitmap Heap Scan on items i  (cost=9.06..100.20 rows=23 width=37)
        Recheck Cond: (n = ANY ('{243477,997947}'::integer[]))
        ->  Bitmap Index Scan on n_index  (cost=0.00..9.05 rows=23 width=0)
              Index Cond: (n = ANY ('{243477,997947}'::integer[]))
  ->  GroupAggregate  (cost=0.43..52.79 rows=12 width=12)
        Group Key: items.n
        ->  Index Only Scan using n_index on items  (cost=0.43..52.64 rows=12 width=4)
              Index Cond: (n = i.n)
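
The plans above report estimated costs only. As a rough sketch, assuming the ITEMS table and N_INDEX index created earlier are still in place, `EXPLAIN (ANALYZE, BUFFERS)` actually executes each query and reports real timings and buffer usage, which is one way to check whether the estimated gap holds on your own data:

```sql
-- Without LATERAL: the subquery aggregates over the whole table before joining.
EXPLAIN (ANALYZE, BUFFERS)
SELECT I.*
FROM ITEMS I
INNER JOIN (
    SELECT COUNT(1), N
    FROM ITEMS
    GROUP BY N
) I2 ON I2.N = I.N
WHERE I.N IN (243477, 997947);

-- With LATERAL: the subquery is evaluated per outer row and can use the index on N.
EXPLAIN (ANALYZE, BUFFERS)
SELECT I.*
FROM ITEMS I
INNER JOIN LATERAL (
    SELECT COUNT(1), N
    FROM ITEMS
    WHERE N = I.N
    GROUP BY N
) I2 ON true
WHERE I.N IN (243477, 997947);
```

Because ANALYZE runs the statements for real, the first query may take noticeably longer on a large table.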

My Postgres version is PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1)

Answer 4

Answered by dmikam

I know this is old, but since PostgreSQL 9.3 you can use the keyword LATERAL to put correlated subqueries inside joins, so the query from the question would look like:

SELECT 
    name, author_id, count(*), t.total
FROM
    names as n1
    INNER JOIN LATERAL (
        SELECT 
            count(*) as total
        FROM 
            names as n2
        WHERE 
            n2.id = n1.id
            AND n2.author_id = n1.author_id
    ) as t ON 1=1
GROUP BY 
    n1.name, n1.author_id, t.total
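
For comparison, here is a minimal sketch of the per-author ratio from the earlier answer written with LATERAL. The inline `names` rows and the `share` alias are hypothetical. Every non-aggregated output column, including the lateral subquery's `total`, is listed in GROUP BY so that PostgreSQL accepts the query.

```sql
-- Hypothetical sample data standing in for the real "names" table.
WITH names (id, name, author_id) AS (
    VALUES (1, 'alpha', 10),
           (1, 'alpha', 10),
           (2, 'beta',  10),
           (3, 'gamma', 20)
)
SELECT n1.name, n1.author_id,
       cast(count(*) AS numeric) / t.total AS share
FROM names AS n1
CROSS JOIN LATERAL (
    -- Correlated subquery: counts all rows of the current row's author.
    SELECT count(*) AS total
    FROM names AS n2
    WHERE n2.author_id = n1.author_id
) AS t
GROUP BY n1.name, n1.author_id, t.total
ORDER BY n1.author_id, n1.name;
```

`CROSS JOIN LATERAL` behaves like `INNER JOIN LATERAL ... ON 1=1` here, since the aggregate subquery always returns exactly one row.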

Answer 5

Answered by Zahid Gani

select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
  from (select id, name, author_id, count(1) as count_1
          from names
          group by id, name, author_id) n1
inner join (select distinct(author_id), count(1) as total_count
              from names) n2
  on (n2.author_id = n1.author_id)
Where true

I used DISTINCT here; consider it when the query has more inner joins, because grouping across many joins performs slowly.
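
PostgreSQL rejects a plain column next to an aggregate when there is no GROUP BY, so one way to keep the DISTINCT idea is to swap the aggregate for a window function. This is a sketch under that assumption, not the answer's original query; the inline `names` rows and the `share` alias are hypothetical.

```sql
-- Hypothetical sample data standing in for the real "names" table.
WITH names (id, name, author_id) AS (
    VALUES (1, 'alpha', 10),
           (1, 'alpha', 10),
           (2, 'beta',  10),
           (3, 'gamma', 20)
)
SELECT n1.name, n1.author_id,
       cast(n1.count_1 AS numeric) / n2.total_count AS share
  FROM (SELECT id, name, author_id, count(*) AS count_1
          FROM names
         GROUP BY id, name, author_id) n1
 INNER JOIN (
        -- The window count is computed per author on every row;
        -- DISTINCT then collapses the duplicates to one row per author.
        SELECT DISTINCT author_id,
               count(*) OVER (PARTITION BY author_id) AS total_count
          FROM names) n2
    ON n2.author_id = n1.author_id
 ORDER BY n1.author_id, n1.name;
```

Whether this is faster than GROUP BY depends on the data and the plan, so it is worth comparing both with EXPLAIN.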
