Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/11689080/

SQL select elements where sum of field is less than N

Tags: sql, sqlite, postgresql, aggregate-functions, sql-limit

Asked by user1105595

Given that I've got a table with the following, very simple content:

# select * from messages;
  id | verbosity 
 ----+-----------
   1 |        20
   2 |        20
   3 |        20
   4 |        30
   5 |       100
 (5 rows)

I would like to select the first N messages whose total verbosity is lower than Y (for testing purposes, let's say Y is 70; the correct result is then the messages with id 1, 2, 3). It's really important to me that the solution be database independent (it should work at least on Postgres and SQLite).

I tried something like:

SELECT * FROM messages GROUP BY id HAVING SUM(verbosity) < 70;

However, it doesn't work as expected, because it doesn't actually sum the values of the verbosity column across rows.
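
The reason can be checked directly. Below is a minimal sketch using Python's built-in sqlite3 module with an in-memory copy of the sample table; the connection setup and the names `conn` and `per_row` are assumptions for illustration, not part of the question. Because id is unique, GROUP BY id puts every row in its own group, so SUM(verbosity) is just that row's own verbosity and HAVING filters rows one at a time.

```python
import sqlite3

# Hypothetical in-memory copy of the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, 20), (2, 20), (3, 20), (4, 30), (5, 100)],
)

# Same shape as the attempted query, with the per-group sum made visible.
per_row = conn.execute("""
    SELECT id, SUM(verbosity) AS group_sum
    FROM   messages
    GROUP  BY id
    HAVING SUM(verbosity) < 70
    ORDER  BY id
""").fetchall()

# Each group holds exactly one row, so group_sum equals that row's verbosity.
print(per_row)
```

Message 4 is returned even though the running total through id 4 is already 90; only message 5 is dropped, because its own verbosity is 100.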

I would be very grateful for any hints/help.

Answered by Erwin Brandstetter

SELECT m.id, sum(m1.verbosity) AS total
FROM   messages m
JOIN   messages m1 ON m1.id <= m.id
WHERE  m.verbosity < 70    -- optional, to avoid pointless evaluation
GROUP  BY m.id
HAVING SUM(m1.verbosity) < 70
ORDER  BY total DESC
LIMIT  1;

This assumes a unique, ascending id like you have in your example.
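
To see what the self-join in the query above computes, here is a small sketch that runs it on SQLite through Python's built-in sqlite3 module. The in-memory table and the names `running` and `cutoff` are assumptions for illustration. The first query drops HAVING and LIMIT to expose the running total per id; the second is the query as posted, which returns only the last id that still fits under the threshold.

```python
import sqlite3

# Hypothetical in-memory copy of the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, 20), (2, 20), (3, 20), (4, 30), (5, 100)],
)

# Running total per id: each row m is joined to every row with a smaller or equal id.
running = conn.execute("""
    SELECT m.id, SUM(m1.verbosity) AS total
    FROM   messages m
    JOIN   messages m1 ON m1.id <= m.id
    GROUP  BY m.id
    ORDER  BY m.id
""").fetchall()

# The posted query: keep totals below 70 and take the largest one.
cutoff = conn.execute("""
    SELECT m.id, SUM(m1.verbosity) AS total
    FROM   messages m
    JOIN   messages m1 ON m1.id <= m.id
    WHERE  m.verbosity < 70
    GROUP  BY m.id
    HAVING SUM(m1.verbosity) < 70
    ORDER  BY total DESC
    LIMIT  1
""").fetchone()

print(running)  # running totals 20, 40, 60, 90, 190
print(cutoff)   # last id whose running total is below 70
```

The result is a single cutoff row, so selecting the messages themselves still needs an outer filter such as id <= the returned id.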

In modern Postgres, or generally with modern standard SQL (but not in SQLite before version 3.25, which added window functions; SQLite also has no LATERAL, which the recursive variant below relies on):

Simple CTE

WITH cte AS (
   SELECT *, sum(verbosity) OVER (ORDER BY id) AS total
   FROM   messages
   )
SELECT *
FROM   cte
WHERE  total <= 70
ORDER  BY id;
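
As a quick check, the window-function query above can be run on SQLite as well, assuming the SQLite library bundled with Python is version 3.25 or later. This is only a sketch: the in-memory table and the name `rows` are assumptions for illustration.

```python
import sqlite3

# Assumption: the bundled SQLite is 3.25+ (window function support).
print(sqlite3.sqlite_version)

# Hypothetical in-memory copy of the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, 20), (2, 20), (3, 20), (4, 30), (5, 100)],
)

# sum(...) OVER (ORDER BY id) yields a running total in id order.
rows = conn.execute("""
    WITH cte AS (
       SELECT *, sum(verbosity) OVER (ORDER BY id) AS total
       FROM   messages
       )
    SELECT *
    FROM   cte
    WHERE  total <= 70
    ORDER  BY id
""").fetchall()

print(rows)  # (id, verbosity, total) for messages 1, 2, 3
```

Note that this variant uses <= 70, while the self-join query uses < 70; the two differ only when a running total hits the threshold exactly.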

Recursive CTE

Should be faster for big tables where you only retrieve a small set.

WITH RECURSIVE cte AS (
   (  -- parentheses required
   SELECT id, verbosity, verbosity AS total
   FROM   messages
   ORDER  BY id
   LIMIT  1
   )

   UNION ALL 
   SELECT c1.id, c1.verbosity, c.total + c1.verbosity 
   FROM   cte c
   JOIN   LATERAL (
      SELECT *
      FROM   messages
      WHERE  id > c.id
      ORDER  BY id
      LIMIT  1
      ) c1 ON  c1.verbosity <= 70 - c.total
   WHERE c.total <= 70
   )
SELECT *
FROM   cte
ORDER  BY id;

All standard features, except for LIMIT.
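
The recursive query above uses JOIN LATERAL, which SQLite does not offer as far as I know. The following is a hedged sketch of one possible SQLite-friendly rewrite, not the answer author's query: it swaps the LATERAL subquery for a correlated scalar subquery that looks up the next id, and wraps the ORDER BY ... LIMIT 1 of the first term in a derived table. The in-memory table and the name `picked` are assumptions for illustration.

```python
import sqlite3

# Hypothetical in-memory copy of the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, 20), (2, 20), (3, 20), (4, 30), (5, 100)],
)

# Walk the table one row at a time in id order and stop as soon as
# adding the next row would push the total above 70.
picked = conn.execute("""
    WITH RECURSIVE cte(id, verbosity, total) AS (
       SELECT id, verbosity, verbosity
       FROM  (SELECT id, verbosity FROM messages ORDER BY id LIMIT 1)

       UNION ALL
       SELECT m.id, m.verbosity, c.total + m.verbosity
       FROM   cte c
       JOIN   messages m
              ON m.id = (SELECT n.id FROM messages n
                         WHERE  n.id > c.id
                         ORDER  BY n.id
                         LIMIT  1)
       WHERE  c.total + m.verbosity <= 70
       )
    SELECT id, verbosity, total
    FROM   cte
    ORDER  BY id
""").fetchall()

print(picked)
```

Like the original, this never checks the first row against the threshold, and it reads only as many rows as it returns plus one lookup, which is the point of the recursive approach on large tables.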

Strictly speaking, there is no such thing as "database-independent". There are various SQL standards, but no RDBMS complies completely. LIMIT works for PostgreSQL and SQLite (and some others). Use TOP 1 for SQL Server, rownum for Oracle. Here's a comprehensive list on Wikipedia.

The SQL:2008 standard would be:

...
FETCH  FIRST 1 ROWS ONLY

... which PostgreSQL supports, but (at the time of writing) hardly any other RDBMS.

The pure alternative that works with more systems would be to wrap it in a subquery and

SELECT max(total) FROM <subquery>

But that is slow and unwieldy.
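
For concreteness, here is one way the placeholder <subquery> might be filled in, taking the self-join query from the top of this answer without its ORDER BY and LIMIT. This is an assumption about what was meant, sketched with Python's built-in sqlite3 module; the in-memory table and the name `best_total` are illustrative.

```python
import sqlite3

# Hypothetical in-memory copy of the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, 20), (2, 20), (3, 20), (4, 30), (5, 100)],
)

# max() over the qualifying running totals replaces ORDER BY ... LIMIT 1.
best_total = conn.execute("""
    SELECT max(total)
    FROM  (
       SELECT m.id, SUM(m1.verbosity) AS total
       FROM   messages m
       JOIN   messages m1 ON m1.id <= m.id
       GROUP  BY m.id
       HAVING SUM(m1.verbosity) < 70
       ) AS sub
""").fetchone()[0]

print(best_total)
```

This yields only the largest qualifying total, not the id it belongs to; getting the id back takes another join or subquery, which is part of why the approach is unwieldy.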

SQL Fiddle.

Answered by podiluska

This will work...

select * 
from messages
where id<=
(
    select MAX(id) from
    (
        select m2.id, SUM(m1.verbosity) sv 
        from messages m1
        inner join messages m2 on m1.id <=m2.id
        group by m2.id
    ) v
    where sv<70
)
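
A small sketch of the query above running on SQLite through Python's built-in sqlite3 module. The in-memory table, the helper name `select_under`, and turning the hard-coded 70 into a bound parameter are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical in-memory copy of the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, verbosity INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, 20), (2, 20), (3, 20), (4, 30), (5, 100)],
)


def select_under(connection, threshold):
    """Return messages up to the last id whose running total is below threshold."""
    return connection.execute("""
        select *
        from messages
        where id <=
        (
            select MAX(id) from
            (
                select m2.id, SUM(m1.verbosity) sv
                from messages m1
                inner join messages m2 on m1.id <= m2.id
                group by m2.id
            ) v
            where sv < ?
        )
        order by id
    """, (threshold,)).fetchall()


print(select_under(conn, 70))  # messages 1, 2, 3
print(select_under(conn, 10))  # nothing fits: MAX(id) is NULL, so no rows
```

When no prefix fits under the threshold, the inner MAX(id) is NULL and the comparison id <= NULL matches nothing, so the query returns an empty set rather than failing.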

However, you should understand that SQL is designed as a set-based language rather than an iterative one, so it is designed to treat data as a set rather than on a row-by-row basis.