SQL 表的 Oracle 行计数(*)与来自 DBA_TABLES 的 NUM_ROWS
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/14125731/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Oracle row count of table by count(*) vs NUM_ROWS from DBA_TABLES
提问by Srujan Kumar Gulla
Looks like count(*) is slower than NUM_ROWS. Can experts in this area throw some light on this.
看起来 count(*) 比 NUM_ROWS 慢。这方面的专家能否对此有所了解。
回答by Ben
According to the documentationNUM_ROWS is the "Number of rows in the table", so I can see how this might be confusing. There, however, is a major difference between these two methods.
根据文档NUM_ROWS 是“表中的行数”,所以我可以看到这可能会令人困惑。但是,这两种方法之间存在重大差异。
This query selects the number of rows in MY_TABLE from a system view. This is data that Oracle has previously collected and stored.
此查询从系统视图中选择 MY_TABLE 中的行数。这是 Oracle 之前收集和存储的数据。
select num_rows from all_tables where table_name = 'MY_TABLE'
This query counts the current number of rows in MY_TABLE
此查询计算 MY_TABLE 中的当前行数
select count(*) from my_table
By definition they are difference pieces of data. There are two additional pieces of information you need about NUM_ROWS.
根据定义,它们是不同的数据。关于 NUM_ROWS,您还需要两条额外的信息。
In the documentation there's an asterisk by the column name, which leads to this note:
Columns marked with an asterisk (*) are populated only if you collect statistics on the table with the ANALYZE statement or the DBMS_STATS package.
This means that unless you have gathered statistics on the table then this column will not have any data.
Statistics gathered in 11g+ with the default
estimate_percent
, or with a 100% estimate, will return an accurate number for that point in time. But statistics gathered before 11g, or with a customestimate_percent
less than 100%, uses dynamic sampling and may be incorrect. If you gather 99.999% a single row may be missed, which in turn means that the answer you get is incorrect.
在文档中,列名旁边有一个星号,这导致了这个注释:
仅当您使用 ANALYZE 语句或 DBMS_STATS 包收集有关表的统计信息时,才会填充标有星号 (*) 的列。
这意味着除非您收集了表的统计信息,否则此列将没有任何数据。
使用默认值
estimate_percent
或 100% 估计在 11g+ 中收集的统计数据将返回该时间点的准确数字。但是在 11g 之前收集的统计数据,或自定义estimate_percent
小于 100% 的统计数据,使用动态采样并且可能不正确。如果您收集了 99.999%,则可能会遗漏一行,这反过来意味着您得到的答案是不正确的。
If your table is neverupdated then it is certainly possible to use ALL_TABLES.NUM_ROWS to find out the number of rows in a table. However, and it's a big however, if any process inserts or deletes rows from your table it will be at best a good approximation and depending on whether your database gathers statistics automatically could be horribly wrong.
如果您的表从未更新,那么当然可以使用 ALL_TABLES.NUM_ROWS 来找出表中的行数。然而,这是一个很大的问题,如果任何进程从您的表中插入或删除行,它充其量只是一个很好的近似值,并且取决于您的数据库是否自动收集统计信息可能是非常错误的。
Generally speaking, it is always better to actually count the number of rows in the table rather then relying on the system tables.
一般来说,实际计算表中的行数总是比依赖系统表更好。