postgresql 在同一个分区上应用多个窗口函数

Question

提问by Verhogen

Is it possible to apply multiple window functions to the same partition? (Correct me if I'm not using the right vocabulary)

是否可以将多个窗口函数应用于同一个分区？（如果我没有使用正确的词汇，请纠正我）

For example you can do

例如你可以做

SELECT name, first_value() over (partition by name order by date) from table1

But is there a way to do something like:

但是有没有办法做这样的事情：

SELECT name, (first_value() as f, last_value() as l (partition by name order by date)) from table1

Where we are applying two functions onto the same window?

我们在哪里将两个函数应用到同一个窗口？

Reference: http://postgresql.ro/docs/8.4/static/tutorial-window.html

参考：http: //postgresql.ro/docs/8.4/static/tutorial-window.html

Answer 1

回答by Adriaan Stander

Can you not just use the window per selection

你不能只使用每个选择的窗口吗

Something like

就像是

SELECT  name, 
        first_value() OVER (partition by name order by date) as f, 
        last_value() OVER (partition by name order by date) as l 
from table1

Also from your reference you can do it like this

同样从您的参考中，您可以这样做

SELECT sum(salary) OVER w, avg(salary) OVER w
FROM empsalary
WINDOW w AS (PARTITION BY depname ORDER BY salary DESC)

Answer 2

回答by Skippy le Grand Gourou

Warning?:I don't delete this answer since it seems technically correct and therefore may be helpful, but bewarethat PARTITION BY bar ORDER BY foois probably not what you want to doanyway. Indeed, aggregate functions won't compute the partition elements as a whole. That is, SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo)is not equivalentto SELECT avg(foo) OVER (PARTITION BY bar)(see proof at the end of the answer).

警告？：因为它似乎技术上是正确的，我不删除这个答案，因此可能会有所帮助，但要注意这PARTITION BY bar ORDER BY foo是可能不是你想要做什么呢。实际上，聚合函数不会整体计算分区元素。也就是说，SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo)不等同于SELECT avg(foo) OVER (PARTITION BY bar)（见证据在回答结束）。

Though it doesn't improve performance per se, if you use multiple times the same partition, you probably wantto use the second syntax proposed by astander, and not only because it's cheaper to write. Here is why.

虽然它不会提高性能本身，如果你多次使用同一个分区，您可能希望使用由astander提出的第二个语法，这不仅是因为它更便宜来写。这是为什么。

Consider the following query?:

考虑以下查询？：

SELECT 
  array_agg(foo)
    OVER (PARTITION BY bar ORDER BY foo), 
  avg(baz)
    OVER (PARTITION BY bar ORDER BY foo) 
FROM 
  foobar;

Since in principle the ordering has no effect on the computation of the average, you might be tempted to use the following query instead (no ordering on the second partition)?:

由于原则上排序对平均值的计算没有影响，您可能会想改用以下查询（第二个分区上没有排序）？：

SELECT 
  array_agg(foo) 
    OVER (PARTITION BY bar ORDER BY foo), 
  avg(baz)
    OVER (PARTITION BY bar) 
FROM 
  foobar;

This is a big mistake, as it will take much longer. Proof :

这是一个很大的错误，因为它需要更长的时间。证明：

> EXPLAIN ANALYZE SELECT array_agg(foo) OVER (PARTITION BY bar ORDER BY foo), avg(baz) OVER (PARTITION BY bar ORDER BY foo) FROM foobar;
                                                           QUERY PLAN                                                        
---------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=215781.92..254591.76 rows=1724882 width=12) (actual time=969.659..2353.865 rows=1724882 loops=1)
   ->  Sort  (cost=215781.92..220094.12 rows=1724882 width=12) (actual time=969.640..1083.039 rows=1724882 loops=1)
         Sort Key: bar, foo
         Sort Method: quicksort  Memory: 130006kB
         ->  Seq Scan on foobar  (cost=0.00..37100.82 rows=1724882 width=12) (actual time=0.027..393.815 rows=1724882 loops=1)
 Total runtime: 2458.969 ms
(6 lignes)

> EXPLAIN ANALYZE SELECT array_agg(foo) OVER (PARTITION BY bar ORDER BY foo), avg(baz) OVER (PARTITION BY bar) FROM foobar;
                                                              QUERY PLAN                                                           
---------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=215781.92..276152.79 rows=1724882 width=12) (actual time=938.733..2958.811 rows=1724882 loops=1)
   ->  WindowAgg  (cost=215781.92..250279.56 rows=1724882 width=12) (actual time=938.699..2033.172 rows=1724882 loops=1)
         ->  Sort  (cost=215781.92..220094.12 rows=1724882 width=12) (actual time=938.683..1062.568 rows=1724882 loops=1)
               Sort Key: bar, foo
               Sort Method: quicksort  Memory: 130006kB
               ->  Seq Scan on foobar  (cost=0.00..37100.82 rows=1724882 width=12) (actual time=0.028..377.299 rows=1724882 loops=1)
 Total runtime: 3060.041 ms
(7 lignes)

Now, if you are aware of this issue, of course you will use the same partition everywhere. But when you have ten times or more the same partition and you are updating it over days, it is quite easy to forget to add the ORDER BYclause on a partition which doesn't need it by itself.

现在，如果您意识到这个问题，您当然会在任何地方使用相同的分区。但是当您有十次或更多相同的分区并且您在几天内更新它时，很容易忘记ORDER BY在一个不需要它的分区上添加子句。

Here comes the WINDOWsyntax, which will prevent you from such careless mistakes (provided, of course, you're aware it's better to minimize the number of different window functions). The following is strictly equivalent (as far as I can tell from EXPLAIN ANALYZE) to the first query?:

这里是WINDOW语法，它可以防止你犯这种粗心大意的错误（当然，前提是你知道最好尽量减少不同窗口函数的数量）。以下严格等同于（据我所知EXPLAIN ANALYZE）第一个查询？：

SELECT
  array_agg(foo)
    OVER qux,
  avg(baz)
    OVER qux
FROM
  foobar
WINDOW
  qux AS (PARTITION BY bar ORDER BY bar)

Post-warning update?:

警告后更新？：

I understand the statement that "SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo)is not equivalentto SELECT avg(foo) OVER (PARTITION BY bar)" seems questionable, so here is an example?:

据我所知，“声明SELECT avg(foo) OVER (PARTITION BY bar ORDER BY foo)不等同于SELECT avg(foo) OVER (PARTITION BY bar)”似乎值得商榷，所以这里有一个例子？：

# SELECT * FROM foobar;
 foo | bar 
-----+-----
   1 |   1
   2 |   2
   3 |   1
   4 |   2
(4 lines)

# SELECT array_agg(foo) OVER qux, avg(foo) OVER qux FROM foobar WINDOW qux AS (PARTITION BY bar);
 array_agg | avg 
-----------+-----
 {1,3}     |   2
 {1,3}     |   2
 {2,4}     |   3
 {2,4}     |   3
 (4 lines)

# SELECT array_agg(foo) OVER qux, avg(foo) OVER qux FROM foobar WINDOW qux AS (PARTITION BY bar ORDER BY foo);
 array_agg | avg 
-----------+-----
 {1}       |   1
 {1,3}     |   2
 {2}       |   2
 {2,4}     |   3
(4 lines)

postgresql 在同一个分区上应用多个窗口函数

提问by Verhogen

回答by Adriaan Stander

回答by Skippy le Grand Gourou

Post-warning update?:

警告后更新？：

相关推荐

最近更新

标签

postgresql 在同一个分区上应用多个窗口函数

提问by Verhogen

回答by Adriaan Stander

回答by Skippy le Grand Gourou

Post-warning update?:

警告后更新？：

相关推荐

postgresql PHP 和 Postgres：捕捉错误？

ogr2ogr 和 Postgis/PostgreSQL 数据库的编码问题

postgreSQL 真空临时文件？

如何在 postgresql 中使用 GMT 时间？

相关推荐

最近更新

标签