SQL 没有聚合函数的 GROUP BY

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/20074562/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 23:07:14  来源:igfitidea点击:

GROUP BY without aggregate function

sqloraclegroup-byaggregate-functions

提问by XForCE07

I am trying to understand GROUP BY (new to oracle dbms)without aggregate function.
How does it operate?
Here is what i have tried.

我试图在没有聚合函数的情况下理解 GROUP BY (oracle dbms 的新手)
它是如何运作的?
这是我尝试过的。

EMP table on which i will run my SQL.
EMP TABLE

我将在其上运行 SQL 的 EMP 表。
电磁脉冲表

SELECT ename , sal
FROM emp
GROUP BY ename , sal

Result

结果

SELECT ename , sal  
FROM emp  
GROUP BY ename;  

Result

ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action:
Error at Line: 397 Column: 16

结果

ORA-00979: 不是 GROUP BY 表达式
00979。00000 - “不是 GROUP BY 表达式”
*原因:
*操作:行
错误:397 列:16

SELECT ename , sal  
FROM emp  
GROUP BY sal;  

Result

ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action: Error at Line: 411 Column: 8

结果

ORA-00979: 不是 GROUP BY 表达式
00979。00000 - “不是 GROUP BY 表达式”
*原因:
*操作:行错误:411 列:8

SELECT empno , ename , sal  
FROM emp  
GROUP BY sal , ename;  

Result

ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action: Error at Line: 425 Column: 8

结果

ORA-00979: 不是 GROUP BY 表达式
00979。00000 - “不是 GROUP BY 表达式”
*原因:
*操作:行错误:425 列:8

SELECT empno , ename , sal  
FROM emp  
GROUP BY empno , ename , sal;  

Result

结果

So, basically the number of columns have to be equal to the number of columns in the GROUP BY clause, but i still do not understand why or what is going on.

所以,基本上列数必须等于 GROUP BY 子句中的列数,但我仍然不明白为什么或发生了什么。

回答by Tobberoth

That's how GROUP BY works. It takes several rows and turns them into one row. Because of this, it has to know what to do with all the combined rows where there have different values for some columns (fields). This is why you have two options for every field you want to SELECT : Either include it in the GROUP BY clause, or use it in an aggregate function so the system knows how you want to combine the field.

这就是 GROUP BY 的工作原理。它需要几行并将它们变成一行。因此,它必须知道如何处理某些列(字段)具有不同值的所有组合行。这就是为什么您对要 SELECT 的每个字段都有两个选项:要么将其包含在 GROUP BY 子句中,要么在聚合函数中使用它,以便系统知道您希望如何组合该字段。

For example, let's say you have this table:

例如,假设您有这张表:

Name | OrderNumber
------------------
John | 1
John | 2

If you say GROUP BY Name, how will it know which OrderNumber to show in the result? So you either include OrderNumber in group by, which will result in these two rows. Or, you use an aggregate function to show how to handle the OrderNumbers. For example, MAX(OrderNumber), which means the result is John | 2or SUM(OrderNumber)which means the result is John | 3.

如果你说 GROUP BY Name,它怎么知道在结果中显示哪个 OrderNumber?因此,您可以在 group by 中包含 OrderNumber,这将导致这两行。或者,您使用聚合函数来展示如何处理 OrderNumber。例如,MAX(OrderNumber),表示结果为John | 2SUM(OrderNumber)表示结果为John | 3

回答by oerkelens

Given this data:

鉴于此数据:

Col1  Col2  Col3
 A     X     1
 A     Y     2
 A     Y     3
 B     X     0
 B     Y     3
 B     Z     1

This query

这个查询

SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2, Col3

Would result in exactly the same table.

将导致完全相同的表。

However, this query:

但是,这个查询:

SELECT Col1, Col2 FROM data GROUP BY Col1, Col2

Would result in

会导致

Col1  Col2
 A     X  
 A     Y  
 B     X  
 B     Y  
 B     Z  

Now, a query:

现在,一个查询:

SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2

Would create a problem: the line with A, Y is the result of grouping the two lines

会产生一个问题:带有 A, Y 的行是将这两行分组的结果

 A     Y     2
 A     Y     3

So, which value should be in Col3, '2' or '3'?

那么,哪个值应该在 Col3、'2' 或 '3' 中?

Normally you would use a group by to calculate e.g. a sum:

通常,您会使用 group by 来计算总和:

SELECT Col1, Col2, SUM(Col3) FROM data GROUP BY Col1, Col2

So in the line we had a problem with we now get (2+3) = 5.

所以在这一行中,我们遇到了问题,现在得到 (2+3) = 5。

Grouping by all your columns in your select is effectively the same as using DISTINCT, and it is preferable to use the DISTINCT keyword word readability in this case.

按选择中的所有列进行分组实际上与使用 DISTINCT 相同,在这种情况下最好使用 DISTINCT 关键字词可读性。

So instead of

所以代替

SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2, Col3

use

SELECT DINSTINCT Col1, Col2, Col3 FROM data

回答by Mister_Tom

You're experiencing a strictrequirement of the GROUP BY clause. Every column not in the group-by clause must have a function applied to reduce all records for the matching "group" to a single record (sum, max, min, etc).

您遇到了GROUP BY 子句的严格要求。不在 group-by 子句中的每一列都必须应用一个函数来将匹配“组”的所有记录减少到单个记录(总和、最大值、最小值等)。

If you list all queried (selected) columns in the GROUP BY clause, you are essentially requesting that duplicate records be excluded from the result set. That gives the same effect as SELECT DISTINCT which also eliminates duplicate rows from the result set.

如果您在 GROUP BY 子句中列出所有查询(选择)的列,您实际上是在请求从结果集中排除重复记录。这提供了与 SELECT DISTINCT 相同的效果,它也从结果集中消除了重复的行。

回答by David Aldridge

The only real use case for GROUP BY without aggregation is when you GROUP BY more columns than are selected, in which case the selected columns might be repeated. Otherwise you might as well use a DISTINCT.

没有聚合的 GROUP BY 唯一真正的用例是当您 GROUP BY 的列多于选定的列时,在这种情况下,选定的列可能会重复。否则,您不妨使用 DISTINCT。

It's worth noting that other RDBMS's do not require that all non-aggregated columns be included in the GROUP BY. For example in PostgreSQL if the primary key columns of a table are included in the GROUP BY then other columns of that table need not be as they are guaranteed to be distinct for every distinct primary key column. I've wished in the past that Oracle did the same as it would have made for more compact SQL in many cases.

值得注意的是,其他 RDBMS 不要求所有非聚合列都包含在 GROUP BY 中。例如,在 PostgreSQL 中,如果表的主键列包含在 GROUP BY 中,则该表的其他列不需要,因为它们保证对于每个不同的主键列都是不同的。过去我曾希望 Oracle 在许多情况下能像它为更紧凑的 SQL 所做的那样做。

回答by SriniV

Let me give some examples.

让我举一些例子。

Consider this data.

考虑这个数据。

CREATE TABLE DATASET ( VAL1 CHAR ( 1 CHAR ),
                   VAL2 VARCHAR2 ( 10 CHAR ),
                   VAL3 NUMBER );

INSERT INTO
      DATASET ( VAL1, VAL2, VAL3 )
VALUES
      ( 'b', 'b-details', 2 );

INSERT INTO
      DATASET ( VAL1, VAL2, VAL3 )
VALUES
      ( 'a', 'a-details', 1 );

INSERT INTO
      DATASET ( VAL1, VAL2, VAL3 )
VALUES
      ( 'c', 'c-details', 3 );

INSERT INTO
      DATASET ( VAL1, VAL2, VAL3 )
VALUES
      ( 'a', 'dup', 4 );

INSERT INTO
      DATASET ( VAL1, VAL2, VAL3 )
VALUES
      ( 'c', 'c-details', 5 );

COMMIT;

Whats there in table now

现在表里有什么

SELECT * FROM DATASET;

VAL1 VAL2             VAL3
---- ---------- ----------
b    b-details           2
a    a-details           1
c    c-details           3
a    dup                 4
c    c-details           5

5 rows selected.

--aggregate with group by

--与分组聚合

SELECT
      VAL1,
      COUNT ( * )
FROM
      DATASET A
GROUP BY
      VAL1;

VAL1   COUNT(*)
---- ----------
b             1
a             2
c             2

3 rows selected.

--aggregate with group by multiple columns but select partial column

--aggregate with group by multiple columns 但选择部分列

SELECT
      VAL1,
      COUNT ( * )
FROM
      DATASET A
GROUP BY
      VAL1,
      VAL2;

VAL1  
---- 
b             
c             
a             
a             

4 rows selected.

--No aggregate with group by multiple columns

-- 没有按多列分组的聚合

SELECT
      VAL1,
      VAL2
FROM
      DATASET A
GROUP BY
      VAL1,
      VAL2;

    VAL1  
    ---- 
    b    b-details
    c    c-details
    a    dup
    a    a-details

    4 rows selected.

--No aggregate with group by multiple columns

-- 没有按多列分组的聚合

SELECT
      VAL1
FROM
      DATASET A
GROUP BY
      VAL1,
      VAL2;

    VAL1  
    ---- 
    b
    c
    a
    a

    4 rows selected.

You have N columns in select (excluding aggregations), then you should have N or N+x columns

您在选择中有 N 列(不包括聚合),那么您应该有 N 或 N+x 列

回答by ogres

If you have some column in SELECT clause , how will it select it if there is several rows ? so yes , every column in SELECT clause should be in GROUP BY clause also , you can use aggregate functions in SELECT ...

如果 SELECT 子句中有一些列,如果有几行,它将如何选择它?所以是的,SELECT 子句中的每一列也应该在 GROUP BY 子句中,你可以在 SELECT 中使用聚合函数......

you can have column in GROUP BY clause which is not in SELECT clause , but not otherwise

您可以在 GROUP BY 子句中包含不在 SELECT 子句中的列,否则不能

回答by Munawar Shah Afridi

Use sub query e.g:

使用子查询,例如:

SELECT field1,field2,(SELECT distinct field3 FROM tbl2 WHERE criteria) AS field3
FROM tbl1 GROUP BY field1,field2

OR

或者

SELECT DISTINCT field1,field2,(SELECT distinct field3 FROM tbl2 WHERE criteria) AS field3
FROM tbl1

回答by sancz

As an addition

作为补充

basically the number of columns have to be equal to the number of columns in the GROUP BY clause

基本上列数必须等于 GROUP BY 子句中的列数

is not a correct statement.

不是正确的说法。

  • Any attribute which is not a part of GROUP BY clause can not be used for selection
  • Any attribute which is a part of GROUP BY clause can be used for selection but not mandatory.
  • 任何不属于 GROUP BY 子句的属性都不能用于选择
  • 作为 GROUP BY 子句一部分的任何属性都可以用于选择,但不是强制性的。

回答by user2839702

I know you said you want to understand group by if you have data like this:

我知道你说你想了解 group by 如果你有这样的数据:

COL-A  COL-B  COL-C  COL-D
  1      Ac      C1     D1
  2      Bd      C2     D2
  3      Ba      C1     D3
  4      Ab      C1     D4
  5      C       C2     D5

And you want to make the data appear like:

你想让数据看起来像:

COL-A  COL-B  COL-C  COL-D
  4      Ab      C1     D4
  1      Ac      C1     D1
  3      Ba      C1     D3
  2      Bd      C2     D2
  5      C       C2     D5

You use:

你用:

select * from table_name
order by col-c,colb

Because I think this is what you intend to do.

因为我认为这就是你打算做的。