PostgreSQL: limit join to one row

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me), citing the original address: StackOverflow, http://stackoverflow.com/questions/12650381/

Date: 2020-10-21 00:21:58  Source: igfitidea

Limit join to one row

Tags: sql, postgresql, join, greatest-n-per-group, sql-limit

Asked by user1175817

I have the following query:

SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as     
    "creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".* 
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id") 
    JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId") 
    JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" ) 
    JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId") 
    JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null 
GROUP BY "Clients".id, "Rmas".id;

The problem is that the table "EsnsSalesOrderItems" can have the same "EsnId" in different entries. I want to restrict the query to pull only the last entry in "EsnsSalesOrderItems" for each "EsnId".

By "last" entry I mean the following:

The one that appears last in the table "EsnsSalesOrderItems". For example, if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively, it should only give me the entry from '2012-07-19'.

Answer by Erwin Brandstetter

SELECT (count(*) * sum(s."price")) AS amount
     , 'rma'       AS "creditType"
     , c."company" AS "client"
     , c.id        AS "ClientId"
     , r.* 
FROM   "Rmas"            r
JOIN   "EsnsRmas"        er ON er."RmaId" = r."id"
JOIN   "Esns"            e  ON e.id = er."EsnId"
JOIN  (
   SELECT DISTINCT ON ("EsnId") *
   FROM   "EsnsSalesOrderItems"
   ORDER  BY "EsnId", "createdAt" DESC
   )                     es ON es."EsnId" = e."id"
JOIN   "SalesOrderItems" s  ON s."id" = es."SalesOrderItemId"
JOIN   "Clients"         c  ON c."id" = r."ClientId"
WHERE  r."credited" = FALSE
AND    r."verifyStatus" IS NOT NULL 
GROUP  BY c.id, r.id;

Your query in the question has an illegal aggregate over another aggregate:

sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount

Simplified and converted to legal syntax:

(count(*) * sum(s."price")) AS amount

But do you really want to multiply with the count per group?
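
To make that question concrete, here is a minimal sketch with made-up data (the `items` list and its prices are assumptions, not taken from the question). It shows what each expression evaluates to for a single group of two rows:

```sql
-- Hypothetical group with two items priced 10 and 20.
SELECT count(*)              AS item_count   -- 2
     , sum(price)            AS total_price  -- 30
     , count(*) * sum(price) AS amount       -- 60
FROM  (VALUES (10), (20)) AS items(price);
```

If the intent is simply the total price per group, sum(price) alone would probably be enough; multiplying by the count doubles the figure in this example.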

I retrieve the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:

I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case, you could get rid of all the double quotes clouding the view.
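
The DISTINCT ON step can be tried in isolation. The sketch below is self-contained: the CTE stands in for the real "EsnsSalesOrderItems" table, and its rows and "SalesOrderItemId" values are invented for illustration, reusing the dates from the question.

```sql
-- Stand-in for the real table; the rows are hypothetical sample data.
WITH "EsnsSalesOrderItems"("EsnId", "SalesOrderItemId", "createdAt") AS (
   VALUES (6, 100, DATE '2012-06-19')
        , (6, 101, DATE '2012-07-19')
        , (7, 102, DATE '2012-05-01')
)
-- Keep one row per "EsnId": the first in sort order, i.e. the newest.
SELECT DISTINCT ON ("EsnId") *
FROM   "EsnsSalesOrderItems"
ORDER  BY "EsnId", "createdAt" DESC;
```

For "EsnId" 6 this keeps only the row from 2012-07-19. If two rows can share the same "createdAt", adding a further column to ORDER BY as a tiebreaker would make the pick deterministic.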

Answer by a_horse_with_no_name

Something like:

join (
  select "EsnId", 
         row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
  from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1

This will select the latest row per "EsnId" from "EsnsSalesOrderItems" based on the column "createdAt". As you didn't post the structure of your tables, I had to "invent" a column name. You can use any column that allows you to define an order on the rows that suits you.

But remember that the concept of the "last row" is only valid if you specify an order for the rows. A table as such is not ordered, nor is the result of a query unless you specify an ORDER BY.
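
As a rough, self-contained sketch of the row_number() approach, the query below numbers the rows within each "EsnId" from newest to oldest and keeps number 1. The CTE and its rows are hypothetical stand-ins for the real table; "SalesOrderItemId" is carried along because the later join in the question needs it.

```sql
-- Stand-in for the real table; the rows are hypothetical sample data.
WITH "EsnsSalesOrderItems"("EsnId", "SalesOrderItemId", "createdAt") AS (
   VALUES (6, 100, DATE '2012-06-19')
        , (6, 101, DATE '2012-07-19')
        , (7, 102, DATE '2012-05-01')
)
SELECT "EsnId", "SalesOrderItemId", "createdAt"
FROM  (
   SELECT *
          -- rn = 1 marks the newest row within each "EsnId"
        , row_number() OVER (PARTITION BY "EsnId"
                             ORDER BY "createdAt" DESC) AS rn
   FROM   "EsnsSalesOrderItems"
   ) t
WHERE  rn = 1
ORDER  BY "EsnId";
```

Without the ORDER BY inside the window definition, which row receives rn = 1 would be arbitrary.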

Answer by Stefan Steiger

Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3:

left | right | inner JOIN LATERAL

I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.

Now, if you have to query contacts and OUs over a time period (not a reporting date, but a date range), a plain left join could increase the record count N-fold.
So, to display the OU, you need to join only the first OU for each contact (what counts as "first" is an arbitrary criterion - taking the last value, for example, is just another way of saying the first value when sorted in descending date order).
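
The row multiplication is easy to reproduce with a small sketch. The table and column names below (contacts, contact_ou, ct_uid, ou, date_from) and the rows are simplified, hypothetical stand-ins for the T_Contacts tables used in this answer.

```sql
-- Hypothetical data: Alice has two OUs over time, Bob has none.
WITH contacts(ct_uid, name) AS (
   VALUES (1, 'Alice'), (2, 'Bob')
)
, contact_ou(ct_uid, ou, date_from) AS (
   VALUES (1, 'Sales',   DATE '2012-01-01')
        , (1, 'Support', DATE '2012-06-01')
)
-- A plain LEFT JOIN returns one row per matching OU.
SELECT c.name, m.ou, m.date_from
FROM   contacts c
LEFT   JOIN contact_ou m ON m.ct_uid = c.ct_uid
ORDER  BY c.name, m.date_from;
```

Alice appears twice, once per OU, while Bob appears once with NULLs in the OU columns.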

In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.

SELECT * FROM T_Contacts 

--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1 
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989

-- CROSS APPLY -- = INNER JOIN 
OUTER APPLY    -- = LEFT JOIN 
(
    SELECT TOP 1 
         --MAP_CTCOU_UID    
         MAP_CTCOU_CT_UID   
        ,MAP_CTCOU_COU_UID  
        ,MAP_CTCOU_DateFrom 
        ,MAP_CTCOU_DateTo   
   FROM T_MAP_Contacts_Ref_OrganisationalUnit 
   WHERE MAP_CTCOU_SoftDeleteStatus = 1 
   AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID 

    /*  
    AND 
    ( 
        (@in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo) 
        AND 
        (@in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom) 
    ) 
    */
   ORDER BY MAP_CTCOU_DateFrom 
) AS FirstOE 

In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:

SELECT * FROM T_Contacts 

--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1 
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989


LEFT JOIN LATERAL 
(
    SELECT 
         --MAP_CTCOU_UID    
         MAP_CTCOU_CT_UID   
        ,MAP_CTCOU_COU_UID  
        ,MAP_CTCOU_DateFrom 
        ,MAP_CTCOU_DateTo   
   FROM T_MAP_Contacts_Ref_OrganisationalUnit 
   WHERE MAP_CTCOU_SoftDeleteStatus = 1 
   AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID 

    /*  
    AND 
    ( 
        (__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo) 
        AND 
        (__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom) 
    ) 
    */
   ORDER BY MAP_CTCOU_DateFrom 
   LIMIT 1 
) AS FirstOE ON TRUE -- PostgreSQL requires a join condition for LEFT JOIN, even with LATERAL
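
Applied to the tables from the question, the same idea might look like the sketch below. The CTEs replace the real "Esns" and "EsnsSalesOrderItems" tables, and every row in them is made up; only the table and column names come from the question.

```sql
-- Stand-ins for the real tables; the rows are hypothetical sample data.
WITH "Esns"(id) AS (
   VALUES (6), (7), (8)
)
, "EsnsSalesOrderItems"("EsnId", "SalesOrderItemId", "createdAt") AS (
   VALUES (6, 100, DATE '2012-06-19')
        , (6, 101, DATE '2012-07-19')
        , (7, 102, DATE '2012-05-01')
)
SELECT e.id, es."SalesOrderItemId", es."createdAt"
FROM   "Esns" e
LEFT   JOIN LATERAL (
   -- Runs once per row of "Esns"; returns at most the newest entry.
   SELECT "SalesOrderItemId", "createdAt"
   FROM   "EsnsSalesOrderItems"
   WHERE  "EsnId" = e.id
   ORDER  BY "createdAt" DESC
   LIMIT  1
   ) es ON true
ORDER  BY e.id;
```

With LEFT JOIN, an "Esns" row that has no entries (id 8 here) is kept with NULLs; using JOIN LATERAL instead would drop it.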

Answer by Dondi Michael Stroma

Try using a subquery in your ON clause. An abstract example:

SELECT 
    *
FROM table1
JOIN table2 ON table2.id = (
    SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE 
    ...
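
A runnable variant of this pattern, as a sketch only: table1 and table2 are replaced by CTEs with invented rows, and a created_at column (an assumption, not part of the abstract example) is added so that an ORDER BY inside the subquery decides which row LIMIT 1 picks. Without an ORDER BY, the chosen row is arbitrary.

```sql
-- Hypothetical sample data standing in for table1 and table2.
WITH table1(id) AS (
   VALUES (1), (2)
)
, table2(id, table1_id, created_at) AS (
   VALUES (10, 1, DATE '2012-06-19')
        , (11, 1, DATE '2012-07-19')
        , (12, 2, DATE '2012-05-01')
)
SELECT t1.id AS table1_id, t2.id AS table2_id, t2.created_at
FROM   table1 t1
JOIN   table2 t2 ON t2.id = (
   -- Correlated subquery: the newest table2 row for this table1 row.
   SELECT x.id
   FROM   table2 x
   WHERE  x.table1_id = t1.id
   ORDER  BY x.created_at DESC
   LIMIT  1
)
ORDER  BY t1.id;
```

For table1 row 1 this joins only table2 row 11, the newer of its two candidates.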