Dynamically generate columns for crosstab in PostgreSQL

Question

Asked by invinc4u

I am trying to create crosstab queries in PostgreSQL that generate the crosstab columns automatically instead of hardcoding them. I have written a function that dynamically generates the column list I need for my crosstab query. The idea is to substitute the result of this function into the crosstab query using dynamic SQL.

I know how to do this easily in SQL Server, but my limited knowledge of PostgreSQL is hindering my progress here. I was thinking of storing the result of the function that generates the dynamic column list in a variable and using that to build the SQL query dynamically. It would be great if someone could guide me on this.


-- Table to be pivoted
CREATE TABLE test_db
(
    kernel_id int,
    key int,
    value int
);

INSERT INTO test_db VALUES
(1,1,99),
(1,2,78),
(2,1,66),
(3,1,44),
(3,2,55),
(3,3,89);


-- This function dynamically returns the list of columns for crosstab
CREATE FUNCTION test() RETURNS TEXT AS '
DECLARE
    key_id int;
    text_op TEXT = '' kernel_id int, '';
BEGIN
    FOR key_id IN SELECT DISTINCT key FROM test_db ORDER BY key LOOP
    text_op := text_op || key_id || '' int , '' ;
    END LOOP;
    text_op := text_op || '' DUMMY text'';
    RETURN text_op;
END;
' LANGUAGE 'plpgsql';

-- This query works. I just need to convert the static list
-- of crosstab columns to be generated dynamically.
SELECT * FROM
crosstab
(
    'SELECT kernel_id, key, value FROM test_db ORDER BY 1,2',
    'SELECT DISTINCT key FROM test_db ORDER BY 1'
)
AS x (kernel_id int, key1 int, key2 int, key3 int); -- How can I replace ..
-- .. this static list with a dynamically generated list of columns ?

Answer 1

Accepted answer by Erwin Brandstetter

You can use the provided C function crosstab_hash for this.

The manual is not very clear in this respect. It is mentioned at the end of the chapter on crosstab() with two parameters:

You can create predefined functions to avoid having to write out the result column names and types in each query. See the examples in the previous section. The underlying C function for this form of crosstab is named crosstab_hash.

For your example:

CREATE OR REPLACE FUNCTION f_cross_test_db(text, text)
  RETURNS TABLE (kernel_id int, key1 int, key2 int, key3 int)
  AS '$libdir/tablefunc','crosstab_hash' LANGUAGE C STABLE STRICT;

Call:

SELECT * FROM f_cross_test_db(
      'SELECT kernel_id, key, value FROM test_db ORDER BY 1,2'
     ,'SELECT DISTINCT key FROM test_db ORDER BY 1');

Note that you need to create a distinct crosstab_hash function for every crosstab function with a different return type.
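To illustrate the point about return types, here is a minimal sketch of a second wrapper for a result with four value columns. The function name f_cross_test_db_4 and the column key4 are hypothetical and not part of the original answer. It assumes the tablefunc extension is installed and that you may create LANGUAGE C functions, which normally requires superuser privileges.

```sql
-- Hypothetical second wrapper: same C function, different return type.
CREATE OR REPLACE FUNCTION f_cross_test_db_4(text, text)
  RETURNS TABLE (kernel_id int, key1 int, key2 int, key3 int, key4 int)
  AS '$libdir/tablefunc', 'crosstab_hash' LANGUAGE C STABLE STRICT;

-- The category query must return exactly as many rows as there are
-- value columns in the return type (four here), or the call fails.
SELECT * FROM f_cross_test_db_4(
      'SELECT kernel_id, key, value FROM test_db ORDER BY 1, 2'
     ,'VALUES (1), (2), (3), (4)');
```

With the sample data from the question, no row has key 4, so key4 should simply come back as NULL for every kernel_id.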

Related: PostgreSQL row to columns

Your function to generate the column list is rather convoluted and its result is incorrect (int is missing after kernel_id). It can be replaced with this SQL query:

SELECT 'kernel_id int, '
       || string_agg(DISTINCT key::text, ' int, '  ORDER BY key::text)
       || ' int, DUMMY text'
FROM   test_db;

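And it cannot be used dynamically anyway: the column definition list has to be spelled out in the query text before the query is parsed, so it cannot be supplied by a function call inside the same statement.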
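One detail worth noting about the generated list: bare numbers such as 1 or 2 are not valid unquoted column names. A minimal sketch of a variant that builds safe identifiers follows. The key prefix is an assumption chosen to match the key1, key2, key3 names in the question, and format() with %I needs PostgreSQL 9.1 or later.

```sql
-- Build "kernel_id int, key1 int, key2 int, ..." from the distinct keys.
-- %I quotes the identifier only if quoting is actually required.
SELECT 'kernel_id int, '
       || string_agg(format('%I int', 'key' || key::text), ', ' ORDER BY key)
FROM  (SELECT DISTINCT key FROM test_db) sub;
```

The result is still only a string. It has to be concatenated into a complete query that is then executed as a separate statement.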

Answer 2

Answer by Caullyn

@erwin-brandstetter: The return type of the function isn't an issue if you're always returning a JSON type with the converted results.

Here is the function I came up with:

CREATE OR REPLACE FUNCTION report.test(
    i_start_date TIMESTAMPTZ,
    i_end_date TIMESTAMPTZ,
    i_interval INT
    ) RETURNS TABLE (
    tab JSON
    ) AS $ab$
DECLARE
    _key_id TEXT;
    _text_op TEXT = '';
    _ret JSON;
BEGIN
    -- SELECT DISTINCT for query results
    FOR _key_id IN
    SELECT DISTINCT at_name
      FROM report.company_data_date cd 
      JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id 
      JOIN report.amount_types at ON cda.amount_type_id  = at.id 
     WHERE date_start BETWEEN i_start_date AND i_end_date
       AND interval_type_id = i_interval
    LOOP
    -- build function_call with datatype of column
        IF char_length(_text_op) > 1 THEN
            _text_op := _text_op || ', ' || _key_id || ' NUMERIC(20,2)';
        ELSE
            _text_op := _text_op || _key_id || ' NUMERIC(20,2)';
        END IF;
    END LOOP;
    -- build query with parameter filters
    RETURN QUERY
    EXECUTE '
        SELECT array_to_json(array_agg(row_to_json(t)))
          FROM (
        SELECT * FROM crosstab(''SELECT date_start, at.at_name,  cda.amount ct 
          FROM report.company_data_date cd 
          JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id 
          JOIN report.amount_types at ON cda.amount_type_id  = at.id 
         WHERE date_start between $$' || i_start_date::TEXT || '$$ AND $$' || i_end_date::TEXT || '$$ 
           AND interval_type_id = ' || i_interval::TEXT || ' ORDER BY date_start'') 
            AS ct (date_start timestamptz, ' || _text_op || ')
             ) t;';
END;
$ab$ LANGUAGE 'plpgsql';

So, when you run it, you get the dynamic results in JSON, and you don't need to know how many values were pivoted:

select * from report.test(now()- '1 week'::interval, now(), 1);
                                                                                                                     tab                                                                                                                      
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 [{"date_start":"2015-07-27T08:40:01.277556-04:00","burn_rate":0.00,"monthly_revenue":5800.00,"cash_balance":0.00},{"date_start":"2015-07-27T08:50:02.458868-04:00","burn_rate":34000.00,"monthly_revenue":15800.00,"cash_balance":24000.00}]
(1 row)
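Not part of the original answer, but in the same spirit: if JSON output is acceptable, a plain aggregate can build one object per row without crosstab(), so no column list is needed at all. A minimal sketch against the question's test_db table, assuming PostgreSQL 9.5 or later for jsonb_object_agg (the key prefix and the pivoted column name are illustrative):

```sql
-- One JSON object per kernel_id; the set of keys can differ per row.
SELECT kernel_id
     , jsonb_object_agg('key' || key::text, value) AS pivoted
FROM   test_db
GROUP  BY kernel_id
ORDER  BY kernel_id;
```

The trade-off is that keys missing for a given kernel_id are simply absent from its object instead of appearing as NULL columns.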

Edit: If you have mixed datatypes in your crosstab, you can add logic to look it up for each column with something like this:

  -- Join on pg_class.oid (not relfilenode) and skip system and dropped columns
  SELECT a.attname AS column_name, format_type(a.atttypid, a.atttypmod) AS data_type
    FROM pg_attribute a
    JOIN pg_class b ON a.attrelid = b.oid
    JOIN pg_catalog.pg_namespace n ON n.oid = b.relnamespace
   WHERE n.nspname = $$schema_name$$ AND b.relname = $$table_name$$
     AND a.attnum > 0 AND NOT a.attisdropped;

Answer 3

Answer by Ben

The approach described here worked well for me. Instead of retrieving the pivot table directly, the easier approach is to let the function generate a SQL query string and then execute that string dynamically on demand.
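A minimal sketch of that two-step idea, applied to the test_db table from the question. The function name f_crosstab_sql_test_db and the key prefix for column names are hypothetical; the sketch assumes the tablefunc extension is installed and that test_db holds at least one row (with an empty table the generated column list would be incomplete).

```sql
-- Step 1: a function that only returns the query text.
CREATE OR REPLACE FUNCTION f_crosstab_sql_test_db()
  RETURNS text
  LANGUAGE sql STABLE AS
$func$
SELECT format(
         'SELECT * FROM crosstab(%L, %L) AS ct (kernel_id int, %s)'
       , 'SELECT kernel_id, key, value FROM test_db ORDER BY 1, 2'
       , 'SELECT DISTINCT key FROM test_db ORDER BY 1'
       , string_agg(format('%I int', 'key' || key::text), ', ' ORDER BY key)
       )
FROM  (SELECT DISTINCT key FROM test_db) sub;
$func$;

-- Step 2: fetch the generated statement, then run it as its own query.
SELECT f_crosstab_sql_test_db();
```

In psql 9.6 or later, ending the second statement with \gexec instead of a semicolon sends the returned string straight back to the server. From application code, run the returned string as a second query.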

Answer 4

Answer by Travisty

I realise this is an older post, but I struggled for a little while with the same issue.

My problem statement: I had a table with multiple values in a field and wanted to create a crosstab query with 40+ column headings per row.

My solution was to create a function that loops through the table column to collect the values I wanted to use as column headings in the crosstab query.

Within this function I could then create the crosstab query. In my use case I stored the crosstab result in a separate table.

For example:

CREATE OR REPLACE FUNCTION field_values_ct ()
 RETURNS VOID AS $$
DECLARE rec RECORD;
DECLARE str text;
BEGIN
str := '"Issue ID" text,';
   -- looping to get column heading string
   FOR rec IN SELECT DISTINCT field_name
        FROM issue_fields
        ORDER BY field_name
    LOOP
    str :=  str || '"' || rec.field_name || '" text' ||',';
    END LOOP;
    str:= substring(str, 0, length(str));

    EXECUTE 'CREATE EXTENSION IF NOT EXISTS tablefunc;
    DROP TABLE IF EXISTS temp_issue_fields;
    CREATE TABLE temp_issue_fields AS
    SELECT *
    FROM crosstab(''select issue_id, field_name, field_value from issue_fields order by 1'',
                 ''SELECT DISTINCT field_name FROM issue_fields ORDER BY 1'')
         AS final_result ('|| str ||')';
END;
$$ LANGUAGE plpgsql;
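A short usage sketch for the function above. The issue_fields definition and the sample rows are hypothetical, since the original post does not show the table. Running the function also assumes you are allowed to execute CREATE EXTENSION and to create tables in the current schema.

```sql
-- Hypothetical source table matching the columns the function reads.
CREATE TABLE issue_fields (
    issue_id    text,
    field_name  text,
    field_value text
);

INSERT INTO issue_fields VALUES
  ('ISSUE-1', 'priority', 'high')
, ('ISSUE-1', 'status',   'open')
, ('ISSUE-2', 'status',   'closed');

-- Rebuild the pivoted table, then read it like any other table.
SELECT field_values_ct();
SELECT * FROM temp_issue_fields ORDER BY "Issue ID";
```

Because the function wraps each field_name in double quotes by plain concatenation, a name that itself contains a double quote would break the generated statement. Building the list with format('%I text', rec.field_name) is one way to guard against that.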
