在 SQL 中使用 WITH 子句的指南

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/8721503/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 13:49:04  来源:igfitidea点击:

Guidance on using the WITH clause in SQL

sqlpostgresqlsql-updatecommon-table-expression

提问by cc young

I understand how to use the WITHclause for recursive queries (!!), but I'm having problems understanding its general use / power.

我了解如何将WITH子句用于递归查询 (!!),但我在理解它的一般用途/功能时遇到了问题。

For example the following query updates one record whose id is determined by using a subquery returning the id of the first record by timestamp:

例如,以下查询更新一个记录,其 id 是通过使用按时间戳返回第一条记录的 id 的子查询确定的:

update global.prospect psp
set    status=status||'*'
where  psp.psp_id=(
           select  p2.psp_id
           from    global.prospect p2
           where   p2.status='new' or p2.status='reset'
           order   by p2.request_ts
           limit   1 )
returning psp.*;

Would this be a good candidate for using a WITHwrapper instead of the relatively ugly sub-query? If so, why?

这是否是使用WITH包装器而不是相对丑陋的子查询的好选择?如果是这样,为什么?

回答by Erwin Brandstetter

If there can be concurrent write accessto involved tables, there are race conditions in the above following queries. Consider:

如果可以对涉及的表进行并发写入访问,则上述以下查询中存在竞争条件。考虑:



Your example canuse a CTE (common table expression), but it will give you nothing a subquery couldn't do:

您的示例可以使用 CTE(公用表表达式),但它不会为您提供子查询无法执行的任何操作:

WITH x AS (
   SELECT  psp_id
   FROM    global.prospect
   WHERE   status IN ('new', 'reset')
   ORDER   BY request_ts
   LIMIT   1
   )
UPDATE global.prospect psp
SET    status = status || '*'
FROM   x
WHERE  psp.psp_id = x.psp_id
RETURNING psp.*;

BTW, the returned row will be the updatedversion.

顺便说一句,返回的行将是更新的版本。



Ifyou wanted to insert the returned row into another table, that's where a WITH clause becomes essential:

如果您想将返回的行插入到另一个表中,那么 WITH 子句就变得必不可少了:

WITH x AS (
   SELECT  psp_id
   FROM    global.prospect
   WHERE   status IN ('new', 'reset')
   ORDER   BY request_ts
   LIMIT   1
   ), y AS (
   UPDATE global.prospect psp
   SET    status = status || '*'
   FROM   x
   WHERE  psp.psp_id = x.psp_id
   RETURNING psp.*
   )
INSERT INTO z
SELECT *
FROM   y

Data modifying queries using CTE are possible with PostgreSQL 9.1 or later.
Read more in the excellent manual.

PostgreSQL 9.1 或更高版本可以使用 CTE 进行数据修改查询。在优秀的手册中
阅读更多内容

回答by Joey Adams

WITHlets you define "temporary tables" for use in a SELECTquery. For example, I recently wrote a query like this, to calculate changes between two sets:

WITH允许您定义用于SELECT查询的“临时表” 。例如,我最近写了一个这样的查询,来计算两个集合之间的变化:

-- Let o be the set of old things, and n be the set of new things.
WITH o AS (SELECT * FROM things(OLD)),
     n AS (SELECT * FROM things(NEW))

-- Select both the set of things whose value changed,
-- and the set of things in the old set but not in the new set.
SELECT o.key, n.value
    FROM o
    LEFT JOIN n ON o.key = n.key
    WHERE o.value IS DISTINCT FROM n.value

UNION ALL

-- Select the set of things in the new set but not in the old set.
SELECT n.key, n.value
    FROM o
    RIGHT JOIN n ON o.key = n.key
    WHERE o.key IS NULL;

By defining the "tables" oand nat the top, I was able to avoid repeating the expressions things(OLD)and things(NEW).

通过定义“表格”on顶部,我能够避免重复表达式things(OLD)things(NEW)

Sure, we could probably eliminate the UNION ALLusing a FULL JOIN, but I wasn't able to do that in my particular case.

当然,我们可能可以消除UNION ALL使用 a FULL JOIN,但在我的特殊情况下我无法做到这一点。



If I understand your query correctly, it does this:

如果我正确理解您的查询,它会执行以下操作:

  • Find the oldest row in global.prospect whose status is 'new' or 'reset'.

  • Mark it by adding an asterisk to its status

  • Return the row (including our tweak to status).

  • 在 global.prospect 中查找状态为“new”或“reset”的最旧行。

  • 通过在其状态中添加星号来标记它

  • 返回行(包括我们对 的调整status)。

I don't think WITHwill simplify anything in your case. It may be slightly more elegant to use a FROMclause, though:

我认为WITH在你的情况下不会简化任何事情。FROM不过,使用子句可能会稍微优雅一些:

update global.prospect psp
set    status = status || '*'
from   ( select psp_id
         from   global.prospect
         where  status = 'new' or status = 'reset'
         order  by request_ts
         limit  1
       ) p2
where  psp.psp_id = p2.psp_id
returning psp.*;

Untested. Let me know if it works.

未经测试。让我知道它是否有效。

It's pretty much exactly what you have already, except:

这几乎就是你已经拥有的,除了:

  • This can be easily extended to update multiple rows. In your version, which uses a subquery expression, the query would fail if the subquery were changed to yield multiple rows.

  • I did not alias global.prospectin the subquery, so it's a bit easier to read. Since this uses a FROMclause, you'll get an error if you accidentally reference the table being updated.

  • In your version, the subquery expression is encountered for every single item. Although PostgreSQL should optimize this and only evaluate the expression once, this optimization will go away if you accidentally reference a column in pspor add a volatile expression.

  • 这可以很容易地扩展到更新多行。在您使用子查询表达式的版本中,如果将子查询更改为产生多行,则查询将失败。

  • 我没有global.prospect在子查询中使用别名,所以它更容易阅读。由于这使用了一个FROM子句,如果您不小心引用了正在更新的表,您将得到一个错误。

  • 在您的版本中,每个项目都会遇到子查询表达式。尽管 PostgreSQL 应该对此进行优化并且只对表达式求值一次,但是如果您不小心在其中引用了列psp或添加了 volatile 表达式,则此优化将消失。