在 SQL 中使用 WITH 子句的指南
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/8721503/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Guidance on using the WITH clause in SQL
提问by cc young
I understand how to use the WITH
clause for recursive queries (!!), but I'm having problems understanding its general use / power.
我了解如何将WITH
子句用于递归查询 (!!),但我在理解它的一般用途/功能时遇到了问题。
For example the following query updates one record whose id is determined by using a subquery returning the id of the first record by timestamp:
例如,以下查询更新一个记录,其 id 是通过使用按时间戳返回第一条记录的 id 的子查询确定的:
update global.prospect psp
set status=status||'*'
where psp.psp_id=(
select p2.psp_id
from global.prospect p2
where p2.status='new' or p2.status='reset'
order by p2.request_ts
limit 1 )
returning psp.*;
Would this be a good candidate for using a WITH
wrapper instead of the relatively ugly sub-query? If so, why?
这是否是使用WITH
包装器而不是相对丑陋的子查询的好选择?如果是这样,为什么?
回答by Erwin Brandstetter
If there can be concurrent write accessto involved tables, there are race conditions in the above following queries. Consider:
如果可以对涉及的表进行并发写入访问,则上述以下查询中存在竞争条件。考虑:
Your example canuse a CTE (common table expression), but it will give you nothing a subquery couldn't do:
您的示例可以使用 CTE(公用表表达式),但它不会为您提供子查询无法执行的任何操作:
WITH x AS (
SELECT psp_id
FROM global.prospect
WHERE status IN ('new', 'reset')
ORDER BY request_ts
LIMIT 1
)
UPDATE global.prospect psp
SET status = status || '*'
FROM x
WHERE psp.psp_id = x.psp_id
RETURNING psp.*;
BTW, the returned row will be the updatedversion.
顺便说一句,返回的行将是更新的版本。
Ifyou wanted to insert the returned row into another table, that's where a WITH clause becomes essential:
如果您想将返回的行插入到另一个表中,那么 WITH 子句就变得必不可少了:
WITH x AS (
SELECT psp_id
FROM global.prospect
WHERE status IN ('new', 'reset')
ORDER BY request_ts
LIMIT 1
), y AS (
UPDATE global.prospect psp
SET status = status || '*'
FROM x
WHERE psp.psp_id = x.psp_id
RETURNING psp.*
)
INSERT INTO z
SELECT *
FROM y
Data modifying queries using CTE are possible with PostgreSQL 9.1 or later.
Read more in the excellent manual.
回答by Joey Adams
WITH
lets you define "temporary tables" for use in a SELECT
query. For example, I recently wrote a query like this, to calculate changes between two sets:
WITH
允许您定义用于SELECT
查询的“临时表” 。例如,我最近写了一个这样的查询,来计算两个集合之间的变化:
-- Let o be the set of old things, and n be the set of new things.
WITH o AS (SELECT * FROM things(OLD)),
n AS (SELECT * FROM things(NEW))
-- Select both the set of things whose value changed,
-- and the set of things in the old set but not in the new set.
SELECT o.key, n.value
FROM o
LEFT JOIN n ON o.key = n.key
WHERE o.value IS DISTINCT FROM n.value
UNION ALL
-- Select the set of things in the new set but not in the old set.
SELECT n.key, n.value
FROM o
RIGHT JOIN n ON o.key = n.key
WHERE o.key IS NULL;
By defining the "tables" o
and n
at the top, I was able to avoid repeating the expressions things(OLD)
and things(NEW)
.
通过定义“表格”o
和n
顶部,我能够避免重复表达式things(OLD)
和things(NEW)
。
Sure, we could probably eliminate the UNION ALL
using a FULL JOIN
, but I wasn't able to do that in my particular case.
当然,我们可能可以消除UNION ALL
使用 a FULL JOIN
,但在我的特殊情况下我无法做到这一点。
If I understand your query correctly, it does this:
如果我正确理解您的查询,它会执行以下操作:
Find the oldest row in global.prospect whose status is 'new' or 'reset'.
Mark it by adding an asterisk to its status
Return the row (including our tweak to
status
).
在 global.prospect 中查找状态为“new”或“reset”的最旧行。
通过在其状态中添加星号来标记它
返回行(包括我们对 的调整
status
)。
I don't think WITH
will simplify anything in your case. It may be slightly more elegant to use a FROM
clause, though:
我认为WITH
在你的情况下不会简化任何事情。FROM
不过,使用子句可能会稍微优雅一些:
update global.prospect psp
set status = status || '*'
from ( select psp_id
from global.prospect
where status = 'new' or status = 'reset'
order by request_ts
limit 1
) p2
where psp.psp_id = p2.psp_id
returning psp.*;
Untested. Let me know if it works.
未经测试。让我知道它是否有效。
It's pretty much exactly what you have already, except:
这几乎就是你已经拥有的,除了:
This can be easily extended to update multiple rows. In your version, which uses a subquery expression, the query would fail if the subquery were changed to yield multiple rows.
I did not alias
global.prospect
in the subquery, so it's a bit easier to read. Since this uses aFROM
clause, you'll get an error if you accidentally reference the table being updated.In your version, the subquery expression is encountered for every single item. Although PostgreSQL should optimize this and only evaluate the expression once, this optimization will go away if you accidentally reference a column in
psp
or add a volatile expression.
这可以很容易地扩展到更新多行。在您使用子查询表达式的版本中,如果将子查询更改为产生多行,则查询将失败。
我没有
global.prospect
在子查询中使用别名,所以它更容易阅读。由于这使用了一个FROM
子句,如果您不小心引用了正在更新的表,您将得到一个错误。在您的版本中,每个项目都会遇到子查询表达式。尽管 PostgreSQL 应该对此进行优化并且只对表达式求值一次,但是如果您不小心在其中引用了列
psp
或添加了 volatile 表达式,则此优化将消失。