Original question: http://stackoverflow.com/questions/16948164/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Why does the same query take different amounts of time to run?
Asked by lalachka
I have a problem that has been going on for months. I automate reports at my job, and we use Oracle. I write a procedure, time it, and it runs in a few minutes. I then set it up for monthly runs.
And then every month, some report runs for hours. It's all the same queries that ran in a few minutes for months before and all of a sudden they're taking hours to run.
I end up rewriting my procedures every now and then and to me this defeats the purpose of automating. No one here can help me.
What am I doing wrong? How can I ensure that my queries will always take the same amount of time to run?
I did some research, and it says that in a correctly set up database with correct statistics you don't even have to use hints; everything should consistently run in about the same time.
Is this true? Or does everyone have this problem and everyone just rewrites their procedures whenever they run?
Sorry for 100 questions, I'm really frustrated about this.
My main question is: why does the same query take such different amounts of time (a drastic difference, from minutes to hours) to run on different days?
Accepted answer by Justin Cave
There are three broad reasons that queries take longer at different times. Either you are getting different performance because the system is under a different sort of load, you are getting different performance because of data volume changes, or you are getting different performance because you are getting different query plans.
Different Data Volume
When you generate your initial timings, are you using data volumes that are similar to the volumes that your query will encounter when it is actually run? If you test a query on the first of the month and that query is getting all the data for the current month and performing a bunch of aggregations, you would expect the query to get slower and slower over the course of the month because it has to process more and more data. Or you may have a query that runs quickly outside of month-end processing because various staging tables that it depends on only get populated at month end. If you are generating your initial timings in a test database, you'll very likely get different performance because test databases frequently have a small subset of the actual production data.
Different System Load
If I take a query and run it during the middle of the day against my data warehouse, there is a good chance that the data warehouse is mostly idle and therefore has lots of resources to give me to process the query. If I'm the only user, my query may run very quickly. If I try to run exactly the same query during the middle of the nightly load process, on the other hand, my query will be competing for resources with a number of other processes. Even if my query has to do exactly the same amount of work, it can easily take many times more clock time to run. If you are writing reports that will run at month end and they're all getting kicked off at roughly the same time, it's entirely possible that they're all competing with each other for the limited system resources available and that your system simply isn't sized for the load it needs to process.
Different system load can also encompass things like differences in what data is cached at any point in time. If I'm testing a particular query in prod and I run it a few times in a row, it is very likely that most of the data I'm interested in will be cached by Oracle, by the operating system, by the SAN, etc. That can make a dramatic difference in performance if every read is coming from one of the caches rather than requiring a disk read. If you run the same query later, after other work has flushed out most of the blocks your query is interested in, you may end up doing a ton of physical reads rather than being able to use the nicely warmed up cache. There's not generally much you can do about this sort of thing. You may be able to cache more data or arrange for processes that need similar data to be run at similar times so that the cache is more efficient, but that is generally expensive and hard to do.
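To make the warm-versus-flushed cache effect concrete, here is a minimal sketch in Python. It assumes a toy LRU cache of block ids, which is only a stand-in and not how Oracle's buffer cache is actually implemented; `BufferCache`, `run_query`, and the block ranges are hypothetical names and numbers chosen for illustration.

```python
from collections import OrderedDict


class BufferCache:
    """Toy LRU cache of block ids (an illustration, not Oracle's real algorithm)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.physical_reads = 0

    def read(self, block_id):
        if block_id in self.blocks:
            # Cache hit: mark the block as most recently used.
            self.blocks.move_to_end(block_id)
            return
        # Cache miss: count a physical read and cache the block.
        self.physical_reads += 1
        self.blocks[block_id] = True
        if len(self.blocks) > self.capacity:
            # Evict the least recently used block.
            self.blocks.popitem(last=False)


def run_query(cache, block_ids):
    """Read every block the query needs; return how many physical reads it cost."""
    before = cache.physical_reads
    for block_id in block_ids:
        cache.read(block_id)
    return cache.physical_reads - before


cache = BufferCache(capacity=1000)
report_blocks = range(0, 800)          # blocks the report touches (made up)
other_work = range(10_000, 11_000)     # unrelated work that fills the cache

cold = run_query(cache, report_blocks)     # first run: every block is a miss
warm = run_query(cache, report_blocks)     # immediate rerun: all hits
run_query(cache, other_work)               # other sessions push the blocks out
flushed = run_query(cache, report_blocks)  # same query, cache no longer warm

print(cold, warm, flushed)
```

The query and the data are identical in all three runs; only the cache contents differ, which is why back-to-back test runs can look much faster than a scheduled run days later.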
Different Query Plans
Over time, your query plan may also change because statistics have changed (or not changed depending on the statistic in question). Normally, that indicates that Oracle has found a more efficient plan or that your data volumes have changed and Oracle expects a different plan would be more efficient with the new data volume. If, however, you are giving Oracle bad statistics (if, for example, you have tables that get much larger during month-end processing but you gather statistics when the tables are almost empty), you may induce Oracle to choose a very bad query plan. Depending on the version of Oracle, there are various ways to force Oracle to use the same query plan. If you can drill down and figure out what the problem with statistics is, Oracle probably provides a way to give the optimizer better statistics.
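As a rough illustration of how bad statistics can lead to a bad plan, the sketch below uses a deliberately simplified cost model in Python. The cost formulas and constants are invented for the example and are not Oracle's optimizer; the only point is that a plan chosen for an estimated 100 rows can be the wrong choice when the table actually holds millions.

```python
def nested_loop_cost(outer_rows, lookup_cost=3):
    # Toy assumption: one index lookup per outer row.
    return outer_rows * lookup_cost


def hash_join_cost(outer_rows, inner_scan_cost=50_000):
    # Toy assumption: one full scan of the inner table plus one pass over the outer rows.
    return inner_scan_cost + outer_rows


def choose_plan(estimated_rows):
    """Pick the cheaper plan for the row count the statistics claim."""
    costs = {
        "NESTED LOOPS": nested_loop_cost(estimated_rows),
        "HASH JOIN": hash_join_cost(estimated_rows),
    }
    return min(costs, key=costs.get)


stale_estimate = 100       # statistics gathered while the table was almost empty
actual_rows = 5_000_000    # what the table holds during month-end processing

plan_stale = choose_plan(stale_estimate)  # plan picked from stale statistics
plan_fresh = choose_plan(actual_rows)     # plan picked from accurate statistics

# How much more the stale plan costs on the real data, in this toy model.
slowdown = nested_loop_cost(actual_rows) / hash_join_cost(actual_rows)

print(plan_stale, plan_fresh, round(slowdown, 2))
```

In this model the stale estimate favors nested loops while the real row count favors a hash join, so the same SQL text ends up doing several times more work than it needs to.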
If you take a look at AWR/ASH data (if you have the appropriate licenses) or Statspack data (if your DBA has installed that), you should be able to figure out which camp your problems originate in. Are you getting different query plans for different executions? (You may need to capture a query plan from your initial benchmarks and compare it to the current plan, or you may need to increase your AWR retention to retain query plans for a few months in order to see this.) Are you doing the same number of buffer gets over time but getting vastly different amounts of I/O waits? Do you see a lot of contention for resources from other sessions? If so, that probably indicates that the issue is different load at different times.
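The triage described above can be sketched as a small decision function. This is a minimal Python sketch under stated assumptions: `RunStats` and its fields are hypothetical containers for numbers you would copy out of your own AWR or Statspack reports, the 25% tolerance is arbitrary, and treating "same plan but many more buffer gets" as a data volume problem is an inference added for the example rather than something the answer states.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Hypothetical per-execution figures copied from a performance report."""
    plan_hash_value: int
    buffer_gets: int
    elapsed_seconds: float


def likely_cause(baseline, slow, tolerance=0.25):
    """Guess which of the three camps a slow run falls into."""
    if slow.plan_hash_value != baseline.plan_hash_value:
        # The optimizer picked a different plan for this execution.
        return "different query plan"
    growth = (slow.buffer_gets - baseline.buffer_gets) / baseline.buffer_gets
    if growth > tolerance:
        # Same plan, but much more logical work: assume more data (an inference).
        return "different data volume"
    # Same plan, same amount of work, more wall-clock time: waits or contention.
    return "different system load"


baseline = RunStats(plan_hash_value=111, buffer_gets=1_000_000, elapsed_seconds=120)

print(likely_cause(baseline, RunStats(222, 1_000_000, 7200)))
print(likely_cause(baseline, RunStats(111, 9_000_000, 7200)))
print(likely_cause(baseline, RunStats(111, 1_050_000, 7200)))
```

A real investigation would also look at wait events and what other sessions were doing at the time; this only shows the order in which the questions narrow things down.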
Answer by lalachka
This is not an answer; it is a reply to Justin Cave. I couldn't format it in any readable way in the comments.
Different Data Volume When ….. data.
Yes, I'm using the same archive tables that I then use for months to come. Of course, the data changes, but it's a pretty consistent rise. For example, if a table has 10M rows this month, it might gain 100K rows the next, 200K the next, 100K the next and so on. There are no drastic jumps as far as I know. And I'd understand if today the query took 2 minutes and next month it took 5. But not 3 hours. However, thank you for the idea; I will start counting rows in tables from month to month as well. A question, though: how do people code to account for this? Let's say someone works with tables that will get large amounts of data at random times. Is there a way to write the query to ensure the run times are at least in the ballpark? Or do people just put up with the fact that any month their reports might run 10-20 hours?
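Tracking row counts from month to month could look something like the Python sketch below. It assumes you already log one row count per table per month; `unusual_growth`, the factor of 3, and the sample counts are hypothetical and only show the idea of flagging a month whose growth is far above what has been typical so far.

```python
from statistics import median


def unusual_growth(row_counts, factor=3.0):
    """Return (month_index, rows_added) for months that grew far more than usual.

    month_index is the position in row_counts of the month that jumped.
    """
    deltas = [later - earlier for earlier, later in zip(row_counts, row_counts[1:])]
    flagged = []
    for i in range(1, len(deltas)):
        typical = median(deltas[:i])  # typical growth seen before this month
        if typical > 0 and deltas[i] > factor * typical:
            flagged.append((i + 1, deltas[i]))
    return flagged


# Monthly row counts for one table (made-up numbers in the spirit of the post).
steady = [10_000_000, 10_100_000, 10_300_000, 10_400_000]
with_jump = steady + [12_400_000]

print(unusual_growth(steady))
print(unusual_growth(with_jump))
```

If the slow months never line up with a flagged jump, data volume is probably not the explanation and the plan or the system load is the next place to look.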
Different System Load If I take a …. to process.
**No, I run my queries on different days and times but I have logs of the days and the times so I will see if I can find a pattern.
Different system load …hard to do.
So are you saying that the fast times I may be getting at the time of the report design might be fast because of the things I ran on my computer previously? Also, does the cache get stored on my computer or on the database under my login or where?**
Different Query Plans Over time, your query plan … different load at different times.
Thank you for your explanations, you've given me enough to start digging.
Answer by Jim
One possibility is that your execution plan is cached so it takes a short amount of time to rerun the query, but when the plan is no longer cached (like after the DB is restarted) it might take significantly longer.
I had a similar issue with Oracle a long while ago where a very complex query for a report ran against a very large amount of data, and it would take hours to complete the first time it was run after the DB was restarted, but after that it finished in a few minutes.