postgresql 将常见查询存储为列?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/11165450/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-21 00:02:05  来源:igfitidea点击:

Store common query as column?

postgresqldatabase-designview

提问by ibrewster

Using PostgreSQL, I have a number of queries that look like this:

使用 PostgreSQL,我有许多如下所示的查询:

SELECT <col 1>, <col 2>
     , (SELECT sum(<col x>)
        FROM   <otherTable> 
        WHERE  <other table foreignkeyCol>=<this table keycol>) AS <col 3>
FROM   <tbl>

Given that the sub-select will be identical in every case, is there a way to store that sub-select as a pseudo-column in the table? Essentially, I want to be able to select a column from table A that is a sum of a specific column from table B where the records are related. Is this possible?

鉴于子选择在每种情况下都是相同的,有没有办法将该子选择存储为表中的伪列?本质上,我希望能够从表 A 中选择一列,该列是表 B 中记录相关的特定列的总和。这可能吗?

回答by Erwin Brandstetter

Is there a way to store that sub-select as a pseudo-column in the table?

有没有办法将该子选择存储为表中的伪列?

A VIEWlike has been advised is a perfectly valid solution. But there is another way that fits your question even more closely. You can write a function that takes the table type as parameter to emulatea "computed field"or "generated column".

一个VIEW喜欢已经被告知是一个完全有效的解决方案。但是还有另一种更适合您的问题的方法。可以编写一个函数,该表型为参数来模拟一个“计算字段”“生成的列”

Consider this test case, derived from your description:

考虑这个来自您的描述的测试用例:

CREATE TABLE tbl_a (a_id int, col1 int, col2 int);
INSERT INTO tbl_a VALUES (1,1,1), (2,2,2), (3,3,3), (4,4,4);

CREATE TABLE tbl_b (b_id int, a_id int, colx int);
INSERT INTO tbl_b VALUES
 (1,1,5),  (2,1,5),  (3,1,1)
,(4,2,8),  (5,2,8),  (6,2,6)
,(7,3,11), (8,3,11), (9,3,11);

Create function that emulates col3:

创建模拟的函数col3

CREATE FUNCTION col3(tbl_a)
  RETURNS int8 AS
$func$
    SELECT sum(colx)
    FROM   tbl_b b
    WHERE  b.a_id = .a_id
$func$ LANGUAGE SQL STABLE;

Now you can query:

现在您可以查询:

SELECT a_id, col1, col2, tbl_a.col3
FROM   tbl_a;

Or even:

甚至:

SELECT *, a.col3 FROM tbl_a a;

Note how I wrote tbl_a.col3/ a.col3, not just col3. This is essential.

请注意我是如何写tbl_a.col3/ 的a.col3,而不仅仅是col3. 这是必不可少的

Unlike a "virtual column" in Oracleit is notincluded automatically in a SELECT * FROM tbl_a. You could use a VIEWfor that.

Oracle 中“虚拟列”不同,不会自动包含在SELECT * FROM tbl_a. 你可以使用 a VIEW

Why does this work?

为什么这样做?

The common way to reference a table column is with attribute notation:

引用表列的常用方法是使用属性表示法

SELECT tbl_a.col1 FROM tbl_a;

The common way to call a function is with functional notation:

调用函数的常用方法是使用函数符号

SELECT col3(tbl_a);

Generally, it's best to stick to these canonical ways, which agree with the SQL standard.

一般来说,最好坚持使用这些符合 SQL 标准的规范方式

But in PostgreSQL, functional notation and attribute notation are equivalent. So these work as well:

但是在 PostgreSQL 中,函数符号和属性符号是等价的。所以这些也有效:

SELECT col1(tbl_a) FROM tbl_a;
SELECT tbl_a.col3;

More about that in the manual.
You probably see by now, where this is going. This lookslike you would add an extra column of table tbl_awhile col3()is actually a function that takes the current row of tbl_a(or its alias) as row type argument and computes a value.

更多关于手册中的内容。
你现在可能已经看到了,这是怎么回事。这看起来像你会添加一个额外的表列,tbl_acol3()实际上是一个函数,它将当前行tbl_a(或其别名)作为行类型参数并计算一个值。

SELECT *, a.col3
FROM   tbl_a AS a;

If there is an actual column col3it takes priority and the system does not look for a function of that name taking the row tbl_aas parameter.

如果有一个实际的列,col3它优先并且系统不会寻找以行tbl_a作为参数的那个名称的函数。

The beauty of it: you can add or drop columns from tbl_aand the last query will dynamically return all current columns, where a view would only return such columns that existed at creation time (early binding vs. late binding of *).
Of course, you have to drop the depending function before you can drop the table now. And you have to take care not to invalidate the function when making changes to the table.

它的美妙之处:您可以添加或删除列,tbl_a最后一个查询将动态返回所有当前列,其中视图只会返回创建时存在的列(早期绑定与后期绑定*)。
当然,您必须先删除依赖函数,然后才能删除表。并且您必须注意不要在对表进行更改时使函数无效。

回答by ibrewster

Apparently this is handled with views, as per lion's comment. So in my case, I used the command:

显然,根据狮子的评论,这是通过视图处理的。所以就我而言,我使用了以下命令:

CREATE VIEW <viewname> AS
SELECT *, (SELECT sum(<col x>)
FROM   <otherTable
WHERE  <otherTable foreignkeyCol>=<thisTable keycol>) AS <col 3>
FROM   <tablename>

which essentially gives me another table including the desired column.

这基本上给了我另一个表格,包括所需的列。

回答by Johann Oskarsson

Apart from a view, you can create a function for the sum.

除了视图之外,您还可以为总和创建一个函数。

CREATE FUNCTION sum_other_table( key type_of_key ) RETURNS bigint
AS $$ SELECT sum( col_x ) FROM table_1 where table_1.key = key $$ LANGUAGE SQL;

and then use it as your aggregator:

然后将其用作您的聚合器:

SELECT col_1, col_2, sum_other_table( key ) AS col_3
FROM table_2 WHERE table_2.key = key;

Note that the return type of sum_other_table() depends on the type of the column you're summing up.

请注意, sum_other_table() 的返回类型取决于您要汇总的列的类型。

回答by kgrittn

There are three answers so far, all of which work. Any one of them could be a "best solution" depending on circumstances. With small tables the performance should be pretty close, but none of them are likely to scale well to tables with millions of rows. The fastest way to get the desired results with a large data set would probably be (using Erwin's setup):

到目前为止,有三个答案,所有答案都有效。根据情况,其中任何一个都可能是“最佳解决方案”。对于小表,性能应该非常接近,但它们都不可能很好地扩展到具有数百万行的表。使用大型数据集获得所需结果的最快方法可能是(使用 Erwin 的设置):

SELECT a_id, col1, col2, sum(colx)
FROM tbl_a LEFT JOIN tbl_b b using(a_id)
GROUP BY a_id, col1, col2;

If a_idis declared as a primary key, and this is run under 9.1 or later, the GROUP BYclause can be simplified because col1and col2are functionally dependenton a_id.

如果a_id声明为主键,并且在 9.1 或更高版本下运行,则GROUP BY可以简化该子句,因为col1并且col2功能上依赖a_id.

SELECT a_id, col1, col2, sum(colx)
FROM tbl_a LEFT JOIN tbl_b b using(a_id)
GROUP BY a_id;

The view could be defined this way and it would scale, but I don't thinkthat all the same execution paths will be considered for the approaches using functions, so the fastest execution path might not be used.

该视图可以定义这样的,它会扩展,但我不认为所有的同一个执行路径将被视为使用功能,所以可以不使用最快的执行路径的方法。