postgresql 在 PL/pgSQL 中动态执行查询
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/29616126/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Executing queries dynamically in PL/pgSQL
提问by EJB
I have found solutions (I think) to the problem I'm about to ask for on Oracle and SQL Server, but can't seem to translate this into a Postgres solution. I am using Postgres 9.3.6.
我已经找到了(我认为)我将要在 Oracle 和 SQL Server 上提出的问题的解决方案,但似乎无法将其转换为 Postgres 解决方案。我正在使用 Postgres 9.3.6。
The idea is to be able to generate "metadata" about the table content for profiling purposes. This can only be done (AFAIK) by having queries run for each column so as to find out, say... min/max/count values and such. In order to automate the procedure, it is preferable to have the queries generated by the DB, then executed.
这个想法是为了能够生成关于表内容的“元数据”以进行分析。这只能通过对每一列运行查询来完成(AFAIK),以便找出,比如说……最小/最大/计数值等。为了使该过程自动化,最好让 DB 生成查询,然后执行。
With an example salesdata
table, I'm able to generate a select query for each column, returning the min() value, using the following snippet:
通过示例salesdata
表,我可以使用以下代码段为每一列生成一个选择查询,返回 min() 值:
SELECT 'SELECT min('||column_name||') as minval_'||column_name||' from salesdata '
FROM information_schema.columns
WHERE table_name = 'salesdata'
The advantage being that the db will generate the code regardless of the number of columns.
Now there's a myriad places I had in mind for storing these queries, either a variable of some sort, or a table column, the idea being to then have these queries execute.
I thought of storing the generated queries in a variable then executing them using the EXECUTE
(or EXECUTE IMMEDIATE
) statement which is the approach employed here(see right pane), but Postgres won't let me declare a variable outside a function and I've been scratching my head with how this would fit together, whether that's even the direction to follow, perhaps there's something simpler.
优点是无论列数如何,数据库都会生成代码。现在我想到了很多地方来存储这些查询,要么是某种变量,要么是表列,然后让这些查询执行的想法。我想将生成的查询存储在一个变量中,然后使用EXECUTE
(or EXECUTE IMMEDIATE
) 语句执行它们,这是这里采用的方法(见右窗格),但 Postgres 不会让我在函数外声明一个变量,我一直在抓挠我想知道这将如何组合在一起,无论这是否是要遵循的方向,也许还有更简单的事情。
Would you have any pointers, I'm currently trying something like this, inspired by this other questionbut have no idea whether I'm headed in the right direction:
你有什么建议吗,我目前正在尝试这样的事情,受到另一个问题的启发,但不知道我是否朝着正确的方向前进:
CREATE OR REPLACE FUNCTION foo()
RETURNS void AS
$$
DECLARE
dyn_sql text;
BEGIN
dyn_sql := SELECT 'SELECT min('||column_name||') from salesdata'
FROM information_schema.columns
WHERE table_name = 'salesdata';
execute dyn_sql
END
$$ LANGUAGE PLPGSQL;
回答by Erwin Brandstetter
System statistics
系统统计
Before you roll your own, have a look at the system table pg_statistic
or the view pg_stats
:
在你自己动手之前,看看系统表pg_statistic
或视图pg_stats
:
This view allows access only to rows of
pg_statistic
that correspond to tables the user has permission to read, and therefore it is safe to allow public read access to this view.
此视图仅允许访问
pg_statistic
与用户有权读取的表相对应的行,因此允许对此视图进行公共读取访问是安全的。
It might already have some of the statistics you are about to compute. It's populated by ANALYZE
, so you might run that for new (or any) tables before checking.
它可能已经有一些您要计算的统计信息。它由 填充ANALYZE
,因此您可以在检查之前为新(或任何)表运行它。
-- ANALYZE tbl; -- optionally, to init / refresh
SELECT * FROM pg_stats
WHERE tablename = 'tbl'
AND schemaname = 'public';
Generic dynamic plpgsql function
通用动态plpgsql函数
You want to return the minimum value for every column in a given table. This is not a trivial task, because a function (like SQL in general) demands to know the return type at creation time - or at least at call time with the help of polymorphic data types.
您想返回给定表中每一列的最小值。这不是一项微不足道的任务,因为函数(如一般的 SQL)需要在创建时知道返回类型 - 或者至少在多态数据类型的帮助下在调用时知道。
This function does everything automatically and safely. Works for anytable, as long as the aggregate function min()
is allowed for every column. But you needto know your way around PL/pgSQL.
此功能会自动且安全地执行所有操作。适用于任何表,只要min()
每列都允许使用聚合函数。但是您需要了解 PL/pgSQL。
CREATE OR REPLACE FUNCTION f_min_of(_tbl anyelement)
RETURNS SETOF anyelement
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT format('SELECT (t::%2$s).* FROM (SELECT min(%1$s) FROM %2$s) t'
, string_agg(quote_ident(attname), '), min(' ORDER BY attnum)
, pg_typeof(_tbl)::text)
FROM pg_attribute
WHERE attrelid = pg_typeof(_tbl)::text::regclass
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0 -- no system columns
);
END
$func$;
Call (important!):
打电话(重要!):
SELECT * FROM f_min_of(NULL::tbl); -- tbl being the table name
You need to understand these concepts:
你需要了解这些概念:
- Dynamic SQL in plpgsql with
EXECUTE
- Polymorphic types
- Row types and table types in Postgres
- How to defend against SQL injection
- Aggregate functions
- System catalogs
- plpgsql 中的动态 SQL 与
EXECUTE
- 多态类型
- Postgres 中的行类型和表类型
- 如何防御SQL注入
- 聚合函数
- 系统目录
Related answer with detailed explanation:
相关答案,详细解释:
- Table name as a PostgreSQL function parameter
- Refactor a PL/pgSQL function to return the output of various SELECT queries
- Postgres data type cast
- How to set value of composite variable field using dynamic SQL
- How to check if a table exists in a given schema
- Select columns with particular column names in PostgreSQL
- Generate series of dates - using date type as input
- 表名作为 PostgreSQL 函数参数
- 重构一个 PL/pgSQL 函数以返回各种 SELECT 查询的输出
- Postgres 数据类型转换
- 如何使用动态SQL设置复合变量字段的值
- 如何检查给定模式中是否存在表
- 在 PostgreSQL 中选择具有特定列名的列
- 生成一系列日期 - 使用日期类型作为输入
Special difficulty with type mismatch
类型不匹配的特殊困难
I am taking advantage of Postgres defining a row type for every existing table. Using the concept of polymorphic types I am able to create onefunction that works for any table.
我正在利用 Postgres 为每个现有表定义行类型。使用多态类型的概念,我能够创建一个适用于任何表的函数。
However, some aggregate functions return related but different data types as compared to the underlying column. For instance, min(varchar_column)
returns text
, which is bit-compatible, but not exactlythe same data type. PL/pgSQL functions have a weak spot here and insist on data types exactlyas declared in the RETURNS
clause. No attempt to cast, not even implicit casts, not to speak of assignment casts.
但是,与基础列相比,某些聚合函数返回相关但不同的数据类型。例如,min(varchar_column)
returnstext
是位兼容的,但不是完全相同的数据类型。PL/pgSQL 函数在这里有一个弱点,它坚持与子句中声明的数据类型完全一样RETURNS
。没有尝试强制转换,甚至没有隐式强制转换,更不用说赋值转换了。
That should be improved. Tested with Postgres 9.3. Did not retest with 9.4, but I am pretty sure, nothing has changed in this area.
那应该改进。使用 Postgres 9.3 测试。没有用 9.4 重新测试,但我很确定,这方面没有任何变化。
That's where this construct comes in as workaround:
这就是这个构造作为解决方法出现的地方:
SELECT (t::tbl).* FROM (SELECT ... FROM tbl) t;
By casting the whole row to the row type of the underlying table explicitly we force assignment casts to get original data types for every column.
通过显式地将整行转换为基础表的行类型,我们强制分配转换以获得每一列的原始数据类型。
This might fail for some aggregate function. sum()
returns numeric
for a sum(bigint_column)
to accommodate for a sum overflowing the base data type. Casting back to bigint
might fail ...
对于某些聚合函数,这可能会失败。sum()
返回numeric
asum(bigint_column)
以容纳溢出基本数据类型的总和。转换回bigint
可能会失败...
回答by EJB
@Erwin Brandstetter, Many thanks for the extensive answer. pg_stats does indeed provide a few things, but what I really need to draw a complete profile is a variety of things, min, max values, counts, count of nulls, mean etc... so a bunch of queries have to be ran for each columns, some with GROUP BY and such.
@Erwin Brandstetter,非常感谢您的广泛回答。pg_stats 确实提供了一些东西,但我真正需要绘制一个完整的配置文件是各种各样的东西,最小值、最大值、计数、空值计数、平均值等......所以必须运行一堆查询每列,有些带有 GROUP BY 等。
Also, thanks for highlighting the importance of data types, i was sort of expecting this to throw a spanner in the works at some point, my main concern was with how to automate the query generation, and its execution, this last bit being my main concern.
另外,感谢您强调数据类型的重要性,我有点期待这会在某个时候在工作中抛出一个扳手,我主要关心的是如何自动化查询生成及其执行,最后一点是我的主要内容关心。
I have tried the function you provide (I probably will need to start learning some plpgsql) but get a error at the SELECT (t::tbl) :
我已经尝试了您提供的功能(我可能需要开始学习一些 plpgsql),但在 SELECT (t::tbl) 处出现错误:
ERROR: type "tbl" does not exist
btw, what is the (t::abc) notation referred as, in python this would be a list slice, but it's probably not the case in PLPGSQL
顺便说一句,什么是 (t::abc) 表示法,在 python 中这将是一个列表切片,但在 PLPGSQL 中可能不是这种情况