postgresql 在 PL/pgSQL 中动态执行查询

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/29616126/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-21 01:51:35  来源:igfitidea点击:

Executing queries dynamically in PL/pgSQL

postgresqlaggregate-functionsplpgsqldynamic-sqldynamic-queries

提问by EJB

I have found solutions (I think) to the problem I'm about to ask for on Oracle and SQL Server, but can't seem to translate this into a Postgres solution. I am using Postgres 9.3.6.

我已经找到了(我认为)我将要在 Oracle 和 SQL Server 上提出的问题的解决方案,但似乎无法将其转换为 Postgres 解决方案。我正在使用 Postgres 9.3.6。

The idea is to be able to generate "metadata" about the table content for profiling purposes. This can only be done (AFAIK) by having queries run for each column so as to find out, say... min/max/count values and such. In order to automate the procedure, it is preferable to have the queries generated by the DB, then executed.

这个想法是为了能够生成关于表内容的“元数据”以进行分析。这只能通过对每一列运行查询来完成(AFAIK),以便找出,比如说……最小/最大/计数值等。为了使该过程自动化,最好让 DB 生成查询,然后执行。

With an example salesdatatable, I'm able to generate a select query for each column, returning the min() value, using the following snippet:

通过示例salesdata表,我可以使用以下代码段为每一列生成一个选择查询,返回 min() 值:

SELECT 'SELECT min('||column_name||') as minval_'||column_name||' from salesdata '  
FROM information_schema.columns 
WHERE table_name = 'salesdata'

The advantage being that the db will generate the code regardless of the number of columns. Now there's a myriad places I had in mind for storing these queries, either a variable of some sort, or a table column, the idea being to then have these queries execute. I thought of storing the generated queries in a variable then executing them using the EXECUTE(or EXECUTE IMMEDIATE) statement which is the approach employed here(see right pane), but Postgres won't let me declare a variable outside a function and I've been scratching my head with how this would fit together, whether that's even the direction to follow, perhaps there's something simpler.

优点是无论列数如何,数据库都会生成代码。现在我想到了很多地方来存储这些查询,要么是某种变量,要么是表列,然后让这些查询执行的想法。我想将生成的查询存储在一个变量中,然后使用EXECUTE(or EXECUTE IMMEDIATE) 语句执行它们,这是这里采用的方法(见右窗格),但 Postgres 不会让我在函数外声明一个变量,我一直在抓挠我想知道这将如何组合在一起,无论这是否是要遵循的方向,也许还有更简单的事情。

Would you have any pointers, I'm currently trying something like this, inspired by this other questionbut have no idea whether I'm headed in the right direction:

你有什么建议吗,我目前正在尝试这样的事情,受到另一个问题的启发,但不知道我是否朝着正确的方向前进:

CREATE OR REPLACE FUNCTION foo()
RETURNS void AS
$$
DECLARE
    dyn_sql text; 
BEGIN            
dyn_sql := SELECT 'SELECT min('||column_name||') from salesdata'    
    FROM information_schema.columns 
    WHERE table_name = 'salesdata';
execute dyn_sql
END
$$ LANGUAGE PLPGSQL;    

回答by Erwin Brandstetter

System statistics

系统统计

Before you roll your own, have a look at the system table pg_statisticor the view pg_stats:

在你自己动手之前,看看系统表pg_statistic或视图pg_stats

This view allows access only to rows of pg_statisticthat correspond to tables the user has permission to read, and therefore it is safe to allow public read access to this view.

此视图仅允许访问pg_statistic与用户有权读取的表相对应的行,因此允许对此视图进行公共读取访问是安全的。

It might already have some of the statistics you are about to compute. It's populated by ANALYZE, so you might run that for new (or any) tables before checking.

它可能已经有一些您要计算的统计信息。它由 填充ANALYZE,因此您可以在检查之前为新(或任何)表运行它。

-- ANALYZE tbl;  -- optionally, to init / refresh
SELECT * FROM pg_stats
WHERE tablename = 'tbl'
AND   schemaname = 'public';

Generic dynamic plpgsql function

通用动态plpgsql函数

You want to return the minimum value for every column in a given table. This is not a trivial task, because a function (like SQL in general) demands to know the return type at creation time - or at least at call time with the help of polymorphic data types.

您想返回给定表中每一列最小值。这不是一项微不足道的任务,因为函数(如一般的 SQL)需要在创建时知道返回类型 - 或者至少在多态数据类型的帮助下在调用时知道。

This function does everything automatically and safely. Works for anytable, as long as the aggregate function min()is allowed for every column. But you needto know your way around PL/pgSQL.

此功能会自动且安全地执行所有操作。适用于任何表,只要min()每列都允许使用聚合函数。但是您需要了解 PL/pgSQL。

CREATE OR REPLACE FUNCTION f_min_of(_tbl anyelement)
  RETURNS SETOF anyelement
  LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE (
   SELECT format('SELECT (t::%2$s).* FROM (SELECT min(%1$s) FROM %2$s) t'
                , string_agg(quote_ident(attname), '), min(' ORDER BY attnum)
                , pg_typeof(_tbl)::text)
   FROM   pg_attribute
   WHERE  attrelid = pg_typeof(_tbl)::text::regclass
   AND    NOT attisdropped  -- no dropped (dead) columns
   AND    attnum > 0        -- no system columns
   );
END
$func$;

Call (important!):

打电话(重要!):

SELECT * FROM f_min_of(NULL::tbl);  -- tbl being the table name

db<>fiddle here
Old sqlfiddle

db<>fiddle here
旧的sqlfiddle

You need to understand these concepts:

你需要了解这些概念:

  • Dynamic SQL in plpgsql with EXECUTE
  • Polymorphic types
  • Row types and table types in Postgres
  • How to defend against SQL injection
  • Aggregate functions
  • System catalogs
  • plpgsql 中的动态 SQL 与 EXECUTE
  • 多态类型
  • Postgres 中的行类型和表类型
  • 如何防御SQL注入
  • 聚合函数
  • 系统目录

Related answer with detailed explanation:

相关答案,详细解释:

Special difficulty with type mismatch

类型不匹配的特殊困难

I am taking advantage of Postgres defining a row type for every existing table. Using the concept of polymorphic types I am able to create onefunction that works for any table.

我正在利用 Postgres 为每个现有表定义行类型。使用多态类型的概念,我能够创建一个适用于任何表的函数。

However, some aggregate functions return related but different data types as compared to the underlying column. For instance, min(varchar_column)returns text, which is bit-compatible, but not exactlythe same data type. PL/pgSQL functions have a weak spot here and insist on data types exactlyas declared in the RETURNSclause. No attempt to cast, not even implicit casts, not to speak of assignment casts.

但是,与基础列相比,某些聚合函数返回相关但不同的数据类型。例如,min(varchar_column)returnstext是位兼容的,但不是完全相同的数据类型。PL/pgSQL 函数在这里有一个弱点,它坚持与子句中声明的数据类型完全一样RETURNS。没有尝试强制转换,甚至没有隐式强制转换,更不用说赋值转换了。

That should be improved. Tested with Postgres 9.3. Did not retest with 9.4, but I am pretty sure, nothing has changed in this area.

那应该改进。使用 Postgres 9.3 测试。没有用 9.4 重新测试,但我很确定,这方面没有任何变化。

That's where this construct comes in as workaround:

这就是这个构造作为解决方法出现的地方:

SELECT (t::tbl).* FROM (SELECT ... FROM tbl) t;

By casting the whole row to the row type of the underlying table explicitly we force assignment casts to get original data types for every column.

通过显式地将整行转换为基础表的行类型,我们强制分配转换以获得每一列的原始数据类型。

This might fail for some aggregate function. sum()returns numericfor a sum(bigint_column)to accommodate for a sum overflowing the base data type. Casting back to bigintmight fail ...

对于某些聚合函数,这可能会失败。sum()返回numericasum(bigint_column)以容纳溢出基本数据类型的总和。转换回bigint可能会失败...

回答by EJB

@Erwin Brandstetter, Many thanks for the extensive answer. pg_stats does indeed provide a few things, but what I really need to draw a complete profile is a variety of things, min, max values, counts, count of nulls, mean etc... so a bunch of queries have to be ran for each columns, some with GROUP BY and such.

@Erwin Brandstetter,非常感谢您的广泛回答。pg_stats 确实提供了一些东西,但我真正需要绘制一个完整的配置文件是各种各样的东西,最小值、最大值、计数、空值计数、平均值等......所以必须运行一堆查询每列,有些带有 GROUP BY 等。

Also, thanks for highlighting the importance of data types, i was sort of expecting this to throw a spanner in the works at some point, my main concern was with how to automate the query generation, and its execution, this last bit being my main concern.

另外,感谢您强调数据类型的重要性,我有点期待这会在某个时候在工作中抛出一个扳手,我主要关心的是如何自动化查询生成及其执行,最后一点是我的主要内容关心。

I have tried the function you provide (I probably will need to start learning some plpgsql) but get a error at the SELECT (t::tbl) :

我已经尝试了您提供的功能(我可能需要开始学习一些 plpgsql),但在 SELECT (t::tbl) 处出现错误:

ERROR: type "tbl" does not exist

btw, what is the (t::abc) notation referred as, in python this would be a list slice, but it's probably not the case in PLPGSQL

顺便说一句,什么是 (t::abc) 表示法,在 python 中这将是一个列表切片,但在 PLPGSQL 中可能不是这种情况