SQL: PostgreSQL DISTINCT ON with different ORDER BY

Question

Asked by sl_bug

I want to run this query:

SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
FROM purchases
WHERE purchases.product_id = 1
ORDER BY purchases.purchased_at DESC

But I get this error:

PG::Error: ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions

Adding address_id as the first ORDER BY expression silences the error, but I really don't want to sort by address_id. Is it possible to do this without ordering by address_id?

Answer 1

Accepted answer by Mosty Mostacho

Documentation says:

DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. [...] Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. [...] The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).

Official documentation

So you'll have to add address_id to the ORDER BY.
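
One way to see why the leading ORDER BY expression has to be address_id is to model DISTINCT ON as "sort, then keep the first row of each run of equal keys". The sketch below is plain Python, not how PostgreSQL implements it; the sample rows and the helper name distinct_on are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows, standing in for purchases with product_id = 1.
purchases = [
    {"address_id": 1, "purchased_at": "2024-01-05"},
    {"address_id": 2, "purchased_at": "2024-03-01"},
    {"address_id": 1, "purchased_at": "2024-02-10"},
    {"address_id": 2, "purchased_at": "2024-01-20"},
]

def distinct_on(rows, key, order_key):
    """Keep the first row per `key` after sorting by (key, order_key DESC)."""
    # Sort by the secondary key descending, then stably by the group key.
    # Together this mirrors ORDER BY address_id, purchased_at DESC.
    rows = sorted(rows, key=itemgetter(order_key), reverse=True)
    rows = sorted(rows, key=itemgetter(key))
    # Equal keys are now adjacent, so "first row of each group" is well defined.
    return [next(group) for _, group in groupby(rows, key=itemgetter(key))]

latest = distinct_on(purchases, "address_id", "purchased_at")
# The final ORDER BY purchased_at DESC is a separate, second sort.
latest.sort(key=itemgetter("purchased_at"), reverse=True)
print(latest)
```

If the rows were sorted by purchased_at alone, rows with the same address_id would not be adjacent, and "the first row of each set" would have no clear meaning. That is roughly the situation the error message guards against.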

Alternatively, if you're looking for the full row that contains the most recent purchase for each address_id, with the result sorted by purchased_at, then you're trying to solve a greatest-N-per-group problem, which can be solved by the following approaches:

The general solution that should work in most DBMSs:

SELECT t1.* FROM purchases t1
JOIN (
    SELECT address_id, max(purchased_at) AS max_purchased_at
    FROM purchases
    WHERE product_id = 1
    GROUP BY address_id
) t2
ON t1.address_id = t2.address_id AND t1.purchased_at = t2.max_purchased_at
WHERE t1.product_id = 1  -- keep other products bought at the same address and time out of the result
ORDER BY t1.purchased_at DESC
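
Because this form uses only a join and GROUP BY, it can be tried outside PostgreSQL. Here is a minimal sketch using Python's built-in sqlite3 module with an in-memory table. The table layout and the sample rows are assumptions made for illustration, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, address_id INTEGER, "
    "product_id INTEGER, purchased_at TEXT)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO purchases (address_id, product_id, purchased_at) VALUES (?, ?, ?)",
    [
        (1, 1, "2024-01-05"),
        (1, 1, "2024-02-10"),
        (2, 1, "2024-03-01"),
        (2, 1, "2024-01-20"),
        (3, 2, "2024-04-01"),  # different product, must not appear
    ],
)

query = """
SELECT t1.address_id, t1.purchased_at
FROM purchases t1
JOIN (
    SELECT address_id, max(purchased_at) AS max_purchased_at
    FROM purchases
    WHERE product_id = 1
    GROUP BY address_id
) t2
ON t1.address_id = t2.address_id AND t1.purchased_at = t2.max_purchased_at
WHERE t1.product_id = 1  -- outer filter excludes rows of other products
ORDER BY t1.purchased_at DESC
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Note that if two purchases at the same address share the latest purchased_at, this approach returns both rows, whereas DISTINCT ON always returns exactly one row per address_id.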

A more PostgreSQL-oriented solution based on @hkf's answer:

SELECT * FROM (
  SELECT DISTINCT ON (address_id) *
  FROM purchases 
  WHERE product_id = 1
  ORDER BY address_id, purchased_at DESC
) t
ORDER BY purchased_at DESC

Problem clarified, extended and solved here: Selecting rows ordered by some column and distinct on another

Answer 2

Answered by hkf

You can order by address_id in a subquery, then order by what you want in an outer query.

SELECT * FROM
    (SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
    FROM "purchases"
    WHERE "purchases"."product_id" = 1
    ORDER BY address_id DESC) AS p  -- older PostgreSQL versions require an alias for a subquery in FROM
ORDER BY purchased_at DESC

Answer 3

Answered by Erwin Brandstetter

A subquery can solve it:

SELECT *
FROM  (
    SELECT DISTINCT ON (address_id) *
    FROM   purchases
    WHERE  product_id = 1
    ) p
ORDER  BY purchased_at DESC;

Leading expressions in ORDER BY have to agree with columns in DISTINCT ON, so you can't order by different columns in the same SELECT.

Only use an additional ORDER BY in the subquery if you want to pick a particular row from each set:

SELECT *
FROM  (
    SELECT DISTINCT ON (address_id) *
    FROM   purchases
    WHERE  product_id = 1
    ORDER  BY address_id, purchased_at DESC  -- get "latest" row per address_id
    ) p
ORDER  BY purchased_at DESC;

If purchased_at can be NULL, consider DESC NULLS LAST. But make sure to match your index if you intend to use it.

Answer 4

Answered by savenkov

A window function may solve that in one pass:

SELECT DISTINCT ON (address_id) 
   LAST_VALUE(purchases.address_id) OVER wnd AS address_id
FROM "purchases"
WHERE "purchases"."product_id" = 1
WINDOW wnd AS (
   PARTITION BY address_id ORDER BY purchases.purchased_at DESC
   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
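
A different window-function formulation, not the one used in this answer, is to number the rows within each address_id with ROW_NUMBER() and keep only row 1. The sketch below runs it through Python's built-in sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). The table layout and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases "
    "(address_id INTEGER, product_id INTEGER, purchased_at TEXT)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (1, 1, "2024-01-05"),
        (1, 1, "2024-02-10"),
        (2, 1, "2024-03-01"),
        (2, 1, "2024-01-20"),
    ],
)

query = """
SELECT address_id, purchased_at
FROM (
    SELECT address_id, purchased_at,
           ROW_NUMBER() OVER (
               PARTITION BY address_id ORDER BY purchased_at DESC
           ) AS rn  -- rn = 1 marks the latest purchase per address
    FROM purchases
    WHERE product_id = 1
) ranked
WHERE rn = 1
ORDER BY purchased_at DESC
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The same query text should also be valid in PostgreSQL, where it is a portable alternative to DISTINCT ON.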

Answer 5

Answered by reubano

For anyone using Flask-SQLAlchemy, this worked for me:

from app import db
from app.models import Purchases
from sqlalchemy.orm import aliased
from sqlalchemy import desc

stmt = Purchases.query.distinct(Purchases.address_id).subquery('purchases')
alias = aliased(Purchases, stmt)
# order_by() returns a new query object, so keep the result
distinct = db.session.query(alias).order_by(desc(alias.purchased_at))

Answer 6

Answered by vaishali

You can also do this by using a GROUP BY clause:

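SELECT purchases.address_id, purchases.purchased_at
FROM "purchases"
WHERE "purchases"."product_id" = 1
GROUP BY purchases.address_id, purchases.purchased_at
-- PostgreSQL only allows grouped columns or aggregates in the select list, so purchases.* is dropped
ORDER BY purchases.purchased_at DESC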
