使用 GROUP BY 汇总统计信息的 Oracle SQL 查询

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/1029032/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-18 18:26:37  来源:igfitidea点击:

Oracle SQL Query to Summarize Statistics, using GROUP BY

sqloraclegroup-bypivot

提问by sernaferna

I have an Oracle table with data that looks like this:

我有一个包含如下数据的 Oracle 表:

ID   BATCH   STATUS
1    1       0
2    1       0
3    1       1
4    2       0

That is, IDis the primary key, there will be multiple rows for each "batch," and each row will have a status code in the STATUScolumn. There are a bunch of other columns, but these are the important ones.

也就是说,ID是主键,每个“批次”会有多行,每行都会在STATUS列中有一个状态代码。还有很多其他的列,但这些是重要的。

I need to write a query which will summarize the status codesfor each batch; there are three possible values that can go in the STATUS column, 0, 1, and 2, and I would like output that looks something like this:

我需要编写一个查询来汇总每个批次状态代码;STATUS 列中可以包含三个可能的值,0、1 和 2,我希望输出看起来像这样:

BATCH  STATUS0  STATUS1  STATUS2
1      2        1        0
2      1        0        0

Those numbers would be counts; for batch 1, there are

这些数字将是计数;对于第 1 批,有

  • 2 records where STATUSis set to 0
  • 1 record where STATUSis set to 1, and
  • no records where STATUSis set to 0.
  • STATUS设置为 0 的2 条记录
  • STATUS设置为 1 的1 条记录,以及
  • 没有STATUS设置为 0 的记录。

For batch 2, there is

对于第 2 批,有

  • 1 record where STATUSis set to 0, and
  • no records where STATUSis set to 1 or 2.
  • STATUS设置为 0 的1 条记录,以及
  • 没有STATUS设置为 1 或 2 的记录。

Is there a way that I can do this in one query, without having to rewrite the query for each status code? i.e. I can easily write a query like this, and run it three times:

有没有一种方法可以在一个查询中执行此操作,而不必为每个状态代码重写查询?即我可以轻松地编写这样的查询,并运行它三遍:

SELECT batch, COUNT(status)
FROM table
WHERE status = 0
GROUP BY batch

I could run that, then run it again where status = 1, and again where status = 2, but I'm hoping to do it in one query.

我可以运行它,然后在 status = 1 和 status = 2 的地方再次运行它,但我希望在一个查询中完成它。

If it makes a difference, aside from the STATUScolumn there is anothercolumn that I might want to summarize the same way--another reason that I don't want to have to execute SELECT statement after SELECT statement and amalgamate all of the results.

如果它有所不同,除了STATUS列之外,还有另一列我可能想以相同的方式总结——另一个原因是我不想在 SELECT 语句之后执行 SELECT 语句并合并所有结果。

回答by Tony Andrews

select batch 
,      count(case when status=1 then 1 end) status1
,      count(case when status=2 then 1 end) status2
,      count(case when status=3 then 1 end) status3
from   table
group by batch;

This is often called a "pivot" query, and I have written an article about how to generate these queries dynamically on my blog.

这通常被称为“枢轴”查询,我在我的博客上写了一篇关于如何动态生成这些查询的文章

Version using DECODE (Oracle-specific but less verbose):

使用 DECODE 的版本(特定于 Oracle 但不那么冗长):

select batch 
,      count(decode(status,1,1)) status1
,      count(decode(status,2,1)) status2
,      count(decode(status,3,1)) status3
from   table
group by batch;

回答by tekBlues

select batch,
sum(select case when status = 0 then 1 else 0 end) status0,
sum(select case when status = 1 then 1 else 0 end) status1,
sum(select case when status = 2 then 1 else 0 end) status2
from table
group by batch

回答by diederikh

select batch,
sum((decode(status,0,1,0)) status0,
sum((decode(status,1,1,0)) status1,
sum((decode(status,2,1,0)) status2,
from table
group by batch

回答by DCookie

OP asks if there's any performance benefit of one approach (SUM) over the other (COUNT). Running a simpleminded test on a table with 26K rows shows that the COUNT approach is significantly faster. YMMV.

OP 询问一种方法 (SUM) 是否比另一种方法 (COUNT) 有任何性能优势。在具有 26K 行的表上运行简单的测试表明 COUNT 方法要快得多。天啊。

DECLARE
  CURSOR B IS
     select batch_id
       FROM batch
      WHERE ROWNUM < 2000;

  v_t1  NUMBER;
  v_t2  NUMBER;
  v_c1  NUMBER;
  v_c2  NUMBER;
  v_opn INTEGER;
  v_cls INTEGER;
  v_btc VARCHAR2(100);
BEGIN
-- Loop using SUM
  v_t1 := dbms_utility.get_time;
  v_c1 := dbms_utility.get_cpu_time;
  FOR R IN B LOOP
     FOR R2 IN (SELECT batch_type_code
                     , SUM(decode(batch_status_code, 'CLOSED', 1, 0)) closed
                     , SUM(decode(batch_status_code, 'OPEN', 1, 0)) OPEN
                     , SUM(decode(batch_status_code, 'REWORK', 1, 0)) rework
                  FROM batch
                 GROUP BY batch_type_code) LOOP 
        v_opn := R2.open;
        v_cls := R2.closed;
     END LOOP;
  END LOOP;
  v_t2 := dbms_utility.get_time;
  v_c2 := dbms_utility.get_cpu_time;
  dbms_output.put_line('For loop using SUM:');
  dbms_output.put_line('CPU seconds used: '||(v_c2 - v_c1)/100);
  dbms_output.put_line('Elapsed time: '||(v_t2 - v_t1)/100);

-- Loop using COUNT
  v_t1 := dbms_utility.get_time;
  v_c1 := dbms_utility.get_cpu_time;
  FOR R IN B LOOP
     FOR R2 IN (SELECT batch_type_code
                     , COUNT(CASE WHEN batch_status_code = 'CLOSED' THEN 1 END) closed
                     , COUNT(CASE WHEN batch_status_code = 'OPEN' THEN 1 END) OPEN
                     , COUNT(CASE WHEN batch_status_code = 'REWORK' THEN 1 END) rework
                  FROM batch
                 GROUP BY batch_type_code) LOOP 
        v_opn := R2.open;
        v_cls := R2.closed;
     END LOOP;
  END LOOP;
  v_t2 := dbms_utility.get_time;
  v_c2 := dbms_utility.get_cpu_time;
  dbms_output.put_line('For loop using COUNT:');
  dbms_output.put_line('CPU seconds used: '||(v_c2 - v_c1)/100);
  dbms_output.put_line('Elapsed time: '||(v_t2 - v_t1)/100);
END;
/

This yielded the following output:

这产生了以下输出:

For loop using SUM:
CPU seconds used: 40
Elapsed time: 40.09
For loop using COUNT:
CPU seconds used: 33.26
Elapsed time: 33.34

I repeated the test a couple of times to eliminate any effects of caching. I also swapped the select statements. Results were similar across the board.

我重复了几次测试以消除缓存的任何影响。我还交换了选择语句。结果是相似的。

EDIT: this is the same test harness I used to answer a similar questionwith.

编辑:这是我用来回答类似问题的同一个测试工具。