SQL 如何在 Oracle 的 CLOB 列上使用 GROUP BY?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/20678881/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 00:32:39  来源:igfitidea点击:

How to use GROUP BY on a CLOB column with Oracle?

sqloraclegroup-bymax

提问by Xavier

I'm trying to combine the GROUP BYfunction with a MAX in oracle. I read a lot of docs around, try to figure out how to format my request by Oracle always returns:

我正在尝试将该GROUP BY函数与 oracle 中的 MAX结合起来。我阅读了很多文档,试图弄清楚如何格式化我的 Oracle 请求总是返回:

ORA-00979: "not a group by expression"

ORA-00979:“不是按表达式分组”

Here is my request:

这是我的要求:

SELECT A.T_ID, B.T, MAX(A.V) 
FROM bdd.LOG A, bdd.T_B B
WHERE B.T_ID = A.T_ID
GROUP BY A.T_ID
HAVING MAX(A.V) < '1.00';

Any tips ?

有小费吗 ?

EDITIt seems to got some tricky part with the datatype of my fields.

编辑我的字段的数据类型似乎有一些棘手的部分。

  • T_IDis VARCHAR2
  • A.Vis VARCHAR2
  • B.Tis CLOB
  • T_IDVARCHAR2
  • A.VVARCHAR2
  • B.TCLOB

采纳答案by Xavier

After some fixes it seems that the major issue was in the group by

经过一些修复后,似乎主要问题出在 group by

YOu have to use the same tables in the SELECTand in the GROUP BY

您必须在SELECT和 中使用相同的表GROUP BY

I also take only a substring of the CLOB to get it works. THe working request is :

我也只使用 CLOB 的一个子串来让它工作。工作要求是:

    SELECT TABLE_A.ID,
       TABLE_A.VA,
       B.TRACE
FROM
(SELECT A.T_ID ID,
          MAX(A.V) VA
   FROM BDD.LOG A
   GROUP BY A.T_ID HAVING MAX(A.V) <= '1.00') TABLE_A,
                                                                BDD.T B
WHERE TABLE_A.ID = B.T_id;

回答by

I'm very familiar with the phenomenon of writing queries for a table designed by someone else to do something almost completely different from what you want. When I've had this same problem, I've used.

我非常熟悉为其他人设计的表编写查询以执行与您想要的几乎完全不同的事情的现象。当我遇到同样的问题时,我已经使用了。

GROUP BY TO_CHAR(theclob)

and then of course you have to TO_CHARthe clob in your outputs too.

然后当然你也必须TO_CHAR在你的输出中加入clob。

Note that there are 2 levels of this problem... the first is that you have a clob column that didn't need to be a clob; it only holds some smallish strings that would fit in a VARCHAR2. My workaround applies to this.

请注意,此问题有 2 个级别……第一个是您有一个不需要是 clob 的 clob 列;它只包含一些适合VARCHAR2. 我的解决方法适用于此。

The second level is you actually wantto group by a column that contains large strings. In that case the TO_CHARprobably won't help.

第二级是您实际上想要按包含大字符串的列进行分组。在这种情况下,TO_CHAR可能无济于事。

回答by Saharsh Shah

Try this:

尝试这个:

SELECT A.T_ID, B.T, MAX(A.V) 
FROM bdd.LOG A, bdd.T_B B
WHERE B.T_ID = A.T_ID
GROUP BY A.T_ID, B.T
HAVING MAX(A.V) < 1;

回答by ZeroK

This response is a little late, but for those who need values besides the grouping value and the Maximum criteria column, you can use ROW_NUMBER() over a partition to get what you want:

这个响应有点晚了,但对于那些需要除分组值和最大条件列之外的值的人,您可以在分区上使用 ROW_NUMBER() 以获得您想要的:

SELECT T_ID, T, V
FROM 
(
 SELECT A.T_ID, B.T, A.V, ROW_NUMBER() OVER (PARTITION BY A.T_ID ORDER BY to_number(A.V) DESC) rownumber
 FROM bdd.LOG A, bdd.T_B B
 WHERE B.T_ID = A.T_ID
)
WHERE rownumber = 1

Don't forget the DESC modifier on the ORDER BY to get the Maximum value; without it you get the Minimum value. If A.V is nullable you will also need to wrap it in NVL() or you'll just get NULLs; NULL values always sort first (at least in Oracle SQL) regardless of if you choose ascending or descending order.

不要忘记 ORDER BY 上的 DESC 修饰符以获得最大值;没有它,您将获得最小值。如果 AV 可以为空,您还需要将其包装在 NVL() 中,否则您只会得到 NULL;无论您选择升序还是降序,NULL 值总是首先排序(至少在 Oracle SQL 中)。

回答by SkateScout

you can if possible transform the clob into the PK if possible and than to an select on the PK. This is even faster according to execution plan then rowid. I this case i need the first not empty clob. So i say if clob is not empty use the pk else null. The result is not an clob and i can fetch the clob in an outer query.

如果可能的话,您可以将clob转换为PK,而不是PK上的选择。根据执行计划,这比 rowid 更快。在这种情况下,我需要第一个非空的clob。所以我说如果 clob 不为空,则使用 pk else null。结果不是clob,我可以在外部查询中获取clob。

select a.* ,r.DESCRIPTION 
from (
select distinct
FIRST_VALUE(
  case 
     when a.DESCRIPTION is null then null 
     else PK_COL 
  end 
  IGNORE NULLS) 
  OVER (ORDER BY a.sort_col desc ROWS between UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) DESCRIPTION_PK, 
group_column
from reporting a where group_column='xyz) a
join reporting r on (r.PK_COL=a.DESCRIPTION_PK);

回答by Nikita Bosik

This doesn't solve OP's problem, but may help in some cases.

这不能解决 OP 的问题,但在某些情况下可能会有所帮助。

Let we have

让我们有

CREATE TABLE "TEST" (   
  ID NUMBER, 
  DATA CLOB
);

If actual CLOB data doesn't exceed 4000 chars, we can convert it to VARCHAR32using TO_CHARand then perform grouping we need:

如果实际 CLOB 数据不超过 4000 个字符,我们可以将其转换为VARCHAR32usingTO_CHAR然后执行我们需要的分组:

SELECT 
  DATA_AS_CHAR,
  COUNT(*)
FROM
(
  SELECT
    ID,
    TO_CHAR(DATA) DATA_AS_CHAR
  FROM TEST
)
GROUP BY
  DATA_AS_CHAR;

If it exceeds, we have to apply some heuristics, e.g. substring first 4000 chars:

如果超过,我们必须应用一些启发式方法,例如子字符串前 4000 个字符:

SELECT 
  DATA_AS_CHAR,
  COUNT(*)
FROM
(
  SELECT
    ID,
    TO_CHAR(DBMS_LOB.SUBSTR(DATA, 4000)) DATA_AS_CHAR
  FROM TEST
)
GROUP BY
  DATA_AS_CHAR;

Like any other hashing technique, this may lead to collisions (having less number of groups than expected).

与任何其他散列技术一样,这可能会导致冲突(组数少于预期)。

回答by Thomas Tschernich

WITH foo as (
  SELECT A.T_ID, B.T, MAX(A.V) maxav
  FROM bdd.LOG A, bdd.T_B B
  WHERE B.T_ID = A.T_ID
  GROUP BY A.T_ID, B.T
)
SELECT * FROM foo WHERE maxav < 1