Oracle 10g：MIN/MAX 列值估计

Question

提问by Kage

is it possible to retrieve statistics about the minimal or maximal value of a numeric column in Oracle 10g? I have found the table USER_TAB_COL_STATISTICS having a LOW_VALUE and HIGH_VALUE column, but I am not sure whether those are the values I am looking for.

是否可以检索有关 Oracle 10g 中数字列的最小值或最大值的统计信息？我发现表 USER_TAB_COL_STATISTICS 有一个 LOW_VALUE 和 HIGH_VALUE 列，但我不确定这些是否是我正在寻找的值。

I need to find an efficient way to ask the DBS for those statistics. Using a regular MIN(a) and MAX(a) query would be too slow on large tables.

我需要找到一种有效的方法来向 DBS 询问这些统计数据。在大型表上使用常规 MIN(a) 和 MAX(a) 查询会太慢。

Thanks in advance.

提前致谢。

Answer 1

回答by Tony Andrews

Yes, LOW_VALUE and HIGH_VALUE will tell you the minimum and maximum values in the column but:

是的，LOW_VALUE 和 HIGH_VALUE 会告诉你列中的最小值和最大值，但是：

they are stored as RAW(32) columns, so the meaning will not be immediately apparent
they will be as of the last time statistics were gathered for the table, so may not be accurate (unless you explicitly gather stats before using them)

它们存储为 RAW(32) 列，因此含义不会立即显现
它们将是上次为表格收集统计信息时的数据，因此可能不准确（除非您在使用它们之前明确收集统计信息）

If you index the column then MIN(a) and MAX(a) should be very fast as in this example where T1 has 50000 rows and is indexed on OBJECT_ID:

如果对列进行索引，则 MIN(a) 和 MAX(a) 应该非常快，如本示例中 T1 有 50000 行并在 OBJECT_ID 上建立索引：

SQL> select min(object_id) from t1;

MIN(OBJECT_ID)
--------------
           100

------------------------------------------------------------------------------------
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |       |     1 |     5 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |       |     1 |     5 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_ID | 53191 |   259K|     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        419  bytes sent via SQL*Net to client
        380  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

The result is the same if you select the MAX instead of the MIN. However, if you select the MIN and MAX in a single select statement the result is different:

如果您选择 MAX 而不是 MIN，结果是相同的。但是，如果在单个 select 语句中选择 MIN 和 MAX，结果会有所不同：

SQL> select min(object_id), max(object_id) from t1;

MIN(OBJECT_ID) MAX(OBJECT_ID)
-------------- --------------
           100          72809


-------------------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |     5 |    34   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |       |     1 |     5 |            |          |
|   2 |   INDEX FAST FULL SCAN| T1_ID | 53191 |   259K|    34   (0)| 00:00:01 |
-------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        125  consistent gets
          0  physical reads
          0  redo size
        486  bytes sent via SQL*Net to client
        380  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

This suggests that it may be better to get them separately, though I haven't conclusively proved that.

这表明分开获取它们可能更好，尽管我还没有最终证明这一点。

Answer 2

回答by Rob van Wijk

An example with a table containing numbers from 1 up to 1234:

包含从 1 到 1234 的数字的表格的示例：

SQL> create table t (nr) as select level from dual connect by level <= 1234
  2  /

Tabel is aangemaakt.

SQL> select min(nr)
  2       , max(nr)
  3    from t
  4  /

   MIN(NR)    MAX(NR)
---------- ----------
         1       1234

1 rij is geselecteerd.

If you analyze the table, the low_value and high_value columns contain the right numbers.

如果您分析该表，low_value 和 high_value 列包含正确的数字。

SQL> exec dbms_stats.gather_table_stats(user,'t')

PL/SQL-procedure is geslaagd.

SQL> select low_value
  2       , high_value
  3    from user_tab_columns
  4   where table_name = 'T'
  5     and column_name = 'NR'
  6  /

LOW_VALUE                                                        HIGH_VALUE
---------------------------------------------------------------- ----------------
C102                                                             C20D23

1 rij is geselecteerd.

They are raw, so they cannot be read easily. Using the utl_raw.cast_to_number function makes them readable:

它们是原始的，因此不容易阅读。使用 utl_raw.cast_to_number 函数使它们可读：

SQL> select utl_raw.cast_to_number(low_value)
  2       , utl_raw.cast_to_number(high_value)
  3    from user_tab_columns
  4   where table_name = 'T'
  5     and column_name = 'NR'
  6  /

UTL_RAW.CAST_TO_NUMBER(LOW_VALUE) UTL_RAW.CAST_TO_NUMBER(HIGH_VALUE)
--------------------------------- ----------------------------------
                                1                               1234

1 rij is geselecteerd.

However, be careful: the numbers may be inaccurate when updates have taken place between the time the statistics were gathered and the time the query ran.

但是，请注意：在收集统计信息和运行查询之间发生更新时，数字可能不准确。

Regards, Rob.

问候，罗伯。

Answer 3

回答by Jeffrey Kemp

The other answers here (using an index fast full scan; or examining the user_tab_columnsstatistics) are excellent.

这里的其他答案（使用索引快速全扫描；或检查user_tab_columns统计数据）非常好。

Here's another method that might be suitable - if you're only interested in a rough estimate, you can use the SAMPLEclause (and adjust the sample size up or down depending on how accurate you need it):

这是另一种可能合适的方法 - 如果您只对粗略估计感兴趣，您可以使用该SAMPLE条款（并根据您需要的准确程度向上或向下调整样本量）：

SELECT max(value), min(value) FROM t SAMPLE(1);

This takes a 1% sample from the table. It will generally sample different rows each time it is run, so don't expect the results to be identical run-to-run. If you want it to run quicker, you can have lower sample sizes, e.g. SAMPLE(0.01), or if you want to sample half the table, SAMPLE(50).

这需要从表中抽取 1% 的样本。它通常会在每次运行时对不同的行进行采样，因此不要期望每次运行的结果都相同。如果您希望它运行得更快，您可以使用较小的样本量，例如SAMPLE(0.01)，或者如果您想对一半的表进行抽样，则SAMPLE(50)。

The advantage of this approach over the "analyze, then-query-user-tab-cols" approach is that the analyze runs queries like this anyway in order to generate the statistics - so doing it this way may mean less work overall.

这种方法相对于“分析，然后查询用户选项卡列”方法的优势在于，分析无论如何都会运行这样的查询以生成统计信息 - 所以这样做可能意味着整体工作量减少。

Answer 4

回答by Vadzim

In my case the column of interest has TIMESTAMPtype and it seems there's no UTL_RAW.CAST_TO_TIMESTAMPfunction.

在我的情况下，感兴趣的列有TIMESTAMP类型，似乎没有UTL_RAW.CAST_TO_TIMESTAMP功能。

It helped to use the trick from http://www.oaktable.net/content/convert-rawhex-timestamp-0to convert Oracle RAWtype to TIMESTAMP:

它有助于使用http://www.oaktable.net/content/convert-rawhex-timestamp-0 中的技巧将 OracleRAW类型转换为TIMESTAMP：

select to_timestamp(
        to_char( to_number( substr( p_str, 1, 2 ), 'xx' ) - 100, 'fm00' ) ||
        to_char( to_number( substr( p_str, 3, 2 ), 'xx' ) - 100, 'fm00' ) ||
        to_char( to_number( substr( p_str, 5, 2 ), 'xx' ), 'fm00' ) ||
        to_char( to_number( substr( p_str, 7, 2 ), 'xx' ), 'fm00' ) ||
        to_char( to_number( substr( p_str,9, 2 ), 'xx' )-1, 'fm00' ) ||
        to_char( to_number( substr( p_str,11, 2 ), 'xx' )-1, 'fm00' ) ||
        to_char( to_number( substr( p_str,13, 2 ), 'xx' )-1, 'fm00' ), 'yyyymmddhh24miss' )
from (
select low_value p_str from user_tab_columns
   where table_name = 'MESSAGE' and column_name = 'TS'
)

Oracle 10g：MIN/MAX 列值估计

提问by Kage

回答by Tony Andrews

回答by Rob van Wijk

回答by Jeffrey Kemp

回答by Vadzim

相关推荐

最近更新

标签

Oracle 10g：MIN/MAX 列值估计

提问by Kage

回答by Tony Andrews

回答by Rob van Wijk

回答by Jeffrey Kemp

回答by Vadzim

相关推荐

oracle 内联视图上的谓词推送如何减慢查询速度？

Oracle 通用数据库链接不起作用

oracle 在只读同义词上禁用刷新时休眠自动更新

相当于 Oracle 的虚拟专用数据库的 SQL Server 是什么？

相关推荐

最近更新

标签