SQL 表的 Oracle 行计数(*)与来自 DBA_TABLES 的 NUM_ROWS

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/14125731/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 12:55:07  来源:igfitidea点击:

Oracle row count of table by count(*) vs NUM_ROWS from DBA_TABLES

sqloraclecount

提问by Srujan Kumar Gulla

Looks like count(*) is slower than NUM_ROWS. Can experts in this area throw some light on this.

看起来 count(*) 比 NUM_ROWS 慢。这方面的专家能否对此有所了解。

回答by Ben

According to the documentationNUM_ROWS is the "Number of rows in the table", so I can see how this might be confusing. There, however, is a major difference between these two methods.

根据文档NUM_ROWS 是“表中的行数”,所以我可以看到这可能会令人困惑。但是,这两种方法之间存在重大差异。

This query selects the number of rows in MY_TABLE from a system view. This is data that Oracle has previously collected and stored.

此查询从系统视图中选择 MY_TABLE 中的行数。这是 Oracle 之前收集和存储的数据。

select num_rows from all_tables where table_name = 'MY_TABLE'

This query counts the current number of rows in MY_TABLE

此查询计算 MY_TABLE 中的当前行数

select count(*) from my_table

By definition they are difference pieces of data. There are two additional pieces of information you need about NUM_ROWS.

根据定义,它们是不同的数据。关于 NUM_ROWS,您还需要两条额外的信息。

  1. In the documentation there's an asterisk by the column name, which leads to this note:

    Columns marked with an asterisk (*) are populated only if you collect statistics on the table with the ANALYZE statement or the DBMS_STATS package.

    This means that unless you have gathered statistics on the table then this column will not have any data.

  2. Statistics gathered in 11g+ with the default estimate_percent, or with a 100% estimate, will return an accurate number for that point in time. But statistics gathered before 11g, or with a custom estimate_percentless than 100%, uses dynamic sampling and may be incorrect. If you gather 99.999% a single row may be missed, which in turn means that the answer you get is incorrect.

  1. 在文档中,列名旁边有一个星号,这导致了这个注释:

    仅当您使用 ANALYZE 语句或 DBMS_STATS 包收集有关表的统计信息时,才会填充标有星号 (*) 的列。

    这意味着除非您收集了表的统计信息,否则此列将没有任何数据。

  2. 使用默认值estimate_percent或 100% 估计在 11g+ 中收集的统计数据将返回该时间点的准确数字。但是在 11g 之前收集的统计数据,或自定义estimate_percent小于 100% 的统计数据,使用动态采样并且可能不正确。如果您收集了 99.999%,则可能会遗漏一行,这反过来意味着您得到的答案是不正确的。

If your table is neverupdated then it is certainly possible to use ALL_TABLES.NUM_ROWS to find out the number of rows in a table. However, and it's a big however, if any process inserts or deletes rows from your table it will be at best a good approximation and depending on whether your database gathers statistics automatically could be horribly wrong.

如果您的表从未更新,那么当然可以使用 ALL_TABLES.NUM_ROWS 来找出表中的行数。然而,这是一个很大的问题,如果任何进程从您的表中插入或删除行,它充其量只是一个很好的近似值,并且取决于您的数据库是否自动收集统计信息可能是非常错误的。

Generally speaking, it is always better to actually count the number of rows in the table rather then relying on the system tables.

一般来说,实际计算表中的行数总是比依赖系统表更好。