oracle last_analyzed 和 stale_stats 为空是什么意思

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/33400267/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-09 00:58:31  来源:igfitidea点击:

What does it mean when last_analyzed and stale_stats is null

oracleoptimizationoracle11gdatabase-performance

提问by John

We are currently running Oracle 11g and I am looking into if we need to run statistics after a large import. We have statistics_level set to 'TYPICAL'. Based on this I'm thinking that we do NOT need to update statistics:

我们目前正在运行 Oracle 11g,我正在研究是否需要在大量导入后运行统计信息。我们将 statistics_level 设置为“典型”。基于此,我认为我们不需要更新统计信息:

Starting with Oracle Database 11g, the MONITORING and NOMONITORING keywords have been deprecated and statistics are collected automatically.

从 Oracle 数据库 11g 开始,不推荐使用 MONITORING 和 NOMONITORING 关键字并自动收集统计信息。

https://docs.oracle.com/cd/B28359_01/server.111/b28310/tables005.htm

https://docs.oracle.com/cd/B28359_01/server.111/b28310/tables005.htm

However, after creating my database and running my modest import (100's of thousands to millions of records in a handful of tables and the creation of a number of indexes) all of the tables affected by the import show null for last_analyzed and stale_stats using the query below.

但是,在创建我的数据库并运行我的适度导入(少数表中的 100 到数百万条记录以及创建多个索引)之后,所有受导入影响的表都使用查询为 last_analyzed 和 stale_stats 显示 null以下。

select 
    table_name, 
    stale_stats, 
    last_analyzed
from 
    dba_tab_statistics
 where 
    owner = 'MY_SCHEMA'
order by 
    last_analyzed desc, table_name asc
;

Should I expect certain queries to have poor performance in this state?

我应该期望某些查询在这种状态下性能不佳吗?

Should I expect the statistics to eventually run and last_analyzed and stale_stats to eventually be populated (the documentation suggests that these values are updated about every three hours by default)?

我是否应该期望统计信息最终运行并最终填充 last_analyzed 和 stale_stats(文档表明这些值默认每三个小时更新一次)?

It has been my experience that for moderately sized databases (tables with millions of records and less than 10's of millions of records) that mucking around with stats is not necessary and generally causes more problems than it solves. Is this generally the case?

我的经验是,对于中等大小的数据库(包含数百万条记录的表和少于 10 条记录的数百万条记录),没有必要处理统计数据,而且通常会导致比解决的问题更多的问题。一般都是这样吗?

* * * NOTES ON OUR RESOLUTION * * *

* * * 关于我们决议的说明 * * *

We were using this:

我们正在使用这个:

analyze table my_table compute statistics

We switched to this:

我们切换到这个:

dbms_stats.gather_table_stats('MY_SCHEMA', 'MY_TABLE');

The analyze table statement took about 1:30 minutes in one environment and about 15:00 - 20:00 minutes in the second environment.

分析表语句在一种环境中大约需要 1:30 分钟,在第二种环境中大约需要 15:00 - 20:00 分钟。

The gather_table_stats statement took about 0:30 to 1:00 minutes in both of the two instances we were able to examine.

在我们能够检查的两个实例中,gather_table_stats 语句花费了大约 0:30 到 1:00 分钟。

Our plan moving forward is to switch our analyze table statements to gather_table_stats calls.

我们的计划是将我们的分析表语句切换到gather_table_stats 调用。

采纳答案by Husqvik

STATISTICS_LEVEL and gathering table/index statistics are entirely different things. STATISTICS_LEVEL affect if row source statistics are gathered during command execution. So then you're able to compare the optimizer estimates and actual values for each step in display cursor.

STATISTICS_LEVEL 和收集表/索引统计信息是完全不同的事情。STATISTICS_LEVEL 影响在命令执行期间是否收集行源统计信息。这样您就可以比较显示光标中每个步骤的优化器估计值和实际值。

So table/index statistics are used for execution plan optimization and STATISTICS_LEVEL for gathering execution statistics when execution plan is being executed and it's mostly for diagnostic purposes.

因此,表/索引统计信息用于执行计划优化,STATISTICS_LEVEL 用于在执行计划执行时收集执行统计信息,主要用于诊断目的。

When last_analyzedis null it means that table statistics hasn't been gathered yet.

last_analyzed为 null 时表示尚未收集表统计信息。

stale_statssays whether the stats are considered fresh or stale, or if the stats will be gathered automatically next time or not. The default settings is 10 percent. If you gather table statistics and then insert/update/delete less than 10 percent of rows the statistics is considered fresh. When you reach 10 percent of modified rows they become stale.

stale_stats表示统计数据是新鲜的还是陈旧的,或者下次是否会自动收集统计数据。默认设置为 10%。如果您收集表统计信息,然后插入/更新/删除少于 10% 的行,则统计信息被认为是新鲜的。当您达到 10% 的修改行时,它们就会变得陈旧。

Oracle by default gathers table/index statistics automatically during maintenance window which is automatically configured when a database is created. It's usually reconfigured by DBAs if there are specific requirements.

默认情况下,Oracle 在维护窗口期间自动收集表/索引统计信息,这是在创建数据库时自动配置的。如果有特定要求,它通常由 DBA 重新配置。

Regarding the STATISTICS_LEVEL, with default value TYPICALit looks like this:

关于STATISTICS_LEVEL,默认值TYPICAL如下所示:

HUSQVIK@hq_pdb_tcp> select * from dual;

D
-
X
HUSQVIK@hq_pdb_tcp> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
SQL_ID  a5ks9fhw2v9s1, child number 0
-------------------------------------
select * from dual

Plan hash value: 272002086

-------------------------------------------
| Id  | Operation         | Name | E-Rows |
-------------------------------------------
|   0 | SELECT STATEMENT  |      |        |
|   1 |  TABLE ACCESS FULL| DUAL |      1 |
-------------------------------------------

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

We don't see anything more than estimated number of rows. If you set ALTER SESSION SET statistics_level = ALLthen

除了估计的行数之外,我们看不到任何东西。如果你设置ALTER SESSION SET statistics_level = ALL那么

HUSQVIK@hq_pdb_tcp> ALTER SESSION SET statistics_level = ALL;
HUSQVIK@hq_pdb_tcp> select * from dual;

D
-
X
HUSQVIK@hq_pdb_tcp> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------
SQL_ID  a5ks9fhw2v9s1, child number 1
-------------------------------------
select * from dual

Plan hash value: 272002086

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS FULL| DUAL |      1 |      1 |      1 |00:00:00.01 |       3 |
------------------------------------------------------------------------------------

Now we see also the actual number of rows and time taken to execute each step as well as number of consistent reads (buffers column).

现在我们还看到了执行每个步骤所花费的实际行数和时间以及一致读取的数量(缓冲区列)。

With more complex queries you will get much more information than this. You should check the documentation at https://docs.oracle.com/database/121/ARPLS/d_xplan.htm

通过更复杂的查询,您将获得比这更多的信息。您应该查看https://docs.oracle.com/database/121/ARPLS/d_xplan.htm 上的文档

Also be aware that the statistics sampling is not done with every row but by default every 128 rows (can be changed using undocumented _rowsource_statistics_sampfreqparameter)

另请注意,统计采样不是对每一行进行的,而是默认每 128 行进行一次(可以使用未记录的_rowsource_statistics_sampfreq参数进行更改)

回答by Jon Heller

(Husqvik thoroughly explained the meaning of the columns and parameters, this answer only addresses how to gather statistics.)

(Husqvik 彻底解释了列和参数的含义,此答案仅涉及如何收集统计信息。)

Statistics should be manually gathered after any significant*change to a table. Oracle has a great default, automatic statistics gathering processes since 11g. But even with that new system there are still at least two good reasons to manually gather statistics. The default statistics gathering auto-task is normally meant for slowly-changing OLTP tables, not fast-changing data warehouse tables.

在对表进行任何重大*更改后,应手动收集统计信息。自 11g 以来,Oracle 有一个很好的默认自动统计收集过程。但即使有了这个新系统,至少仍然有两个很好的理由来手动收集统计数据。默认的统计信息收集自动任务通常用于缓慢变化的 OLTP 表,而不是快速变化的数据仓库表。

  1. Significant data changes can easily lead to significant performance problems.If the tables are going to be used right after they are loaded then they need good statistics immediately.

    A common problem in ETL processes is when tables go from 1 row to a million rows. The optimizer thinks there is still only one row in large tables and uses lots of nested loops joins instead of hash joins. Those algorithms work well in different contexts; without good statistics Oracle does not know the correct context.

    It's important to note that a NULL LAST_ANALYZEDis not the worst case scenario. When there are no statistics at all, Oracle will use dynamic sampling to generate quick statistics estimates. The worst case is when the statistics job ran last night when the table is empty; Oracle thinks it has good statistics when it really doesn't.

  2. The statistics auto-task may not be able to keep up with large changes.The statistics auto-task is a low-priority, single-threaded process. If there are too many large tables left to the automatic process it may not be able to process them during the maintenance window.

  1. 显着的数据更改很容易导致严重的性能问题。如果要在加载后立即使用这些表,则它们需要立即进行良好的统计。

    ETL 过程中的一个常见问题是当表从 1 行变为一百万行时。优化器认为大表中仍然只有一行,并使用大量嵌套循环连接而不是散列连接。这些算法在不同的环境中都运行良好;没有好的统计数据 Oracle 不知道正确的上下文。

    重要的是要注意 NULLLAST_ANALYZED不是最坏的情况。当根本没有统计数据时,Oracle 将使用动态抽样来生成快速统计估计值。最坏的情况是昨晚当表为空时统计作业运行;Oracle 认为它有很好的统计数据,但实际上并没有。

  2. 统计自动任务可能跟不上大的变化。统计自动任务是一个低优先级的单线程进程。如果有太多大表留给自动处理,它可能无法在维护窗口期间处理它们。



The bad news is that developers can't ignore optimizer statistics. The DBAs can't just handle it later. It might help to read some of the chapters from the manuals, such as Managing Optimizer Statistics.

坏消息是开发人员不能忽略优化器统计信息。DBA 不能只是在以后处理它。阅读手册中的一些章节可能会有所帮助,例如管理优化器统计信息

The goods news is that Oracle 11g finally has nice default settings. You usually don't need to muck around with the parameters. In most cases there's a simple rule to follow: if the table changed significantly, run this:

好消息是 Oracle 11g 终于有了不错的默认设置。您通常不需要处理参数。在大多数情况下,有一个简单的规则要遵循:如果表发生了显着变化,请运行以下命令:

dbms_stats.gather_table_stats('SCHEMA_NAME', 'TABLE_NAME');


*: "Significant" is a subjective word. A change is normally significant in terms of relative size, not absolute. Adding one million rows to a table is significant if the table currently has one row, but not if the table has a billion rows.

*:“显着”是一个主观词。就相对大小而言,变化通常是显着的,而不是绝对的。如果表当前只有一行,则向表中添加一百万行很重要,但如果表有十亿行则不然。