Java: Is stopwatch benchmarking acceptable?
Original question: http://stackoverflow.com/questions/410437/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Is stopwatch benchmarking acceptable?
Asked by Bill the Lizard
Does anyone ever use stopwatch benchmarking, or should a performance tool always be used? Are there any good free tools available for Java? What tools do you use?
To clarify my concerns, stopwatch benchmarking is subject to error due to operating system scheduling. On a given run of your program the OS might schedule another process (or several) in the middle of the function you're timing. In Java, things are even a little bit worse if you're trying to time a threaded application, as the JVM scheduler throws even a little bit more randomness into the mix.
How do you address operating system scheduling when benchmarking?
Accepted answer by Lawrence Dol
Stopwatch benchmarking is fine, provided you measure enough iterations to be meaningful. Typically, I require a total elapsed time of a single-digit number of seconds. Otherwise, your results are easily and significantly skewed by scheduling and other O/S interruptions to your process.
For this I use a little set of static methods I built a long time ago, which are based on System.currentTimeMillis().
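The answer does not show its static methods, so the following is only a minimal sketch of what such a helper might look like. The class name `StopwatchTimer`, its method names, the sample workload, and the iteration count are all assumptions made for illustration.

```java
// Hypothetical helper class; the answer's own static methods are not shown.
final class StopwatchTimer {
    private StopwatchTimer() {}

    /** Runs the task `iterations` times and returns the total elapsed wall-clock milliseconds. */
    static long elapsedMillis(Runnable task, long iterations) {
        long start = System.currentTimeMillis();
        for (long i = 0; i < iterations; i++) {
            task.run();
        }
        return System.currentTimeMillis() - start;
    }

    /** Converts a total elapsed time into average milliseconds per iteration. */
    static double millisPerIteration(long elapsedMillis, long iterations) {
        return (double) elapsedMillis / iterations;
    }

    public static void main(String[] args) {
        // Accumulate results so the measured work cannot be discarded as dead code.
        long[] sink = new long[1];
        Runnable task = () -> sink[0] += Long.toString(sink[0]).length();

        // Assumed count; raise it until the total reaches a few seconds.
        long iterations = 50_000_000L;
        long total = elapsedMillis(task, iterations);
        System.out.println("total ms: " + total);
        System.out.println("ms per iteration: " + millisPerIteration(total, iterations));
        if (total < 1_000) {
            System.out.println("Run was under a second; increase iterations before trusting the result.");
        }
        System.out.println("(sink = " + sink[0] + ")");
    }
}
```

The important number is the total elapsed time, not the iteration count: the count is just whatever it takes to push the total into the range of a few seconds.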
For profiling work I have used JProfiler for a number of years and have found it very good. I have recently looked over YourKit, which seems great from the website, but I have not used it personally.
To answer the question on scheduling interruptions, I find that doing repeated runs until consistency is achieved/observed works in practice to weed out anomalous results from process scheduling. I also find that thread scheduling has no practical impact for runs of between 5 and 30 seconds. Lastly, once you pass the threshold of a few seconds, scheduling has, in my experience, negligible impact on the results: I find that a 5 second run consistently averages out the same as a 5 minute run for time/iteration.
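One way to make "repeat until consistent" concrete is sketched below. It is an illustration rather than the answerer's code: the class `ConsistentRuns`, the 5% tolerance, the run limit, and the use of `System.nanoTime()` instead of `System.currentTimeMillis()` are all assumptions.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch: repeat a timed run until two consecutive results agree.
final class ConsistentRuns {
    // Volatile sink so the JIT cannot discard the measured work as dead code.
    static volatile int sink;

    /** Times `iterations` calls of the task and returns nanoseconds per iteration. */
    static double timeRun(IntSupplier task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = task.getAsInt();
        }
        return (double) (System.nanoTime() - start) / iterations;
    }

    /**
     * Repeats runs until two consecutive results differ by at most `tolerance`
     * (a fraction, e.g. 0.05 for 5%) or until `maxRuns` runs have been made.
     * Returns the most recent measurement either way.
     */
    static double measureUntilStable(IntSupplier task, int iterations,
                                     double tolerance, int maxRuns) {
        double previous = timeRun(task, iterations);
        for (int run = 1; run < maxRuns; run++) {
            double current = timeRun(task, iterations);
            double relativeDiff = Math.abs(current - previous) / Math.max(current, previous);
            if (relativeDiff <= tolerance) {
                return current;
            }
            previous = current;
        }
        return previous;
    }

    public static void main(String[] args) {
        // Assumed workload and sizes; each run should last on the order of seconds.
        double ns = measureUntilStable(() -> Integer.toString(123456).hashCode(),
                                       50_000_000, 0.05, 10);
        System.out.printf("%.2f ns per iteration%n", ns);
    }
}
```

If the runs never settle within the limit, that is itself useful information: something on the machine is probably interfering with the measurement.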
You may also want to consider prerunning the tested code about 10,000 times to "warm up" the JIT, depending on the number of times you expect the tested code to run over time in real life.
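A warm-up phase can be sketched as an untimed loop that runs before the timed one. The class name `WarmedUpTiming` and the workload are assumptions; the 10,000 figure comes from the answer, but whether it is enough to trigger compilation depends on the JVM and its settings.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch: run the code untimed first, then time it.
final class WarmedUpTiming {
    // Volatile sink so the JIT cannot discard the measured work as dead code.
    static volatile int sink;

    /**
     * Runs the task `warmUpIterations` times without timing it, then times
     * `measuredIterations` further calls and returns nanoseconds per iteration.
     */
    static double nanosPerIterationAfterWarmUp(IntSupplier task,
                                               int warmUpIterations,
                                               int measuredIterations) {
        // Warm-up: gives the JIT a chance to compile the hot path. Not timed.
        for (int i = 0; i < warmUpIterations; i++) {
            sink = task.getAsInt();
        }
        // Measured phase.
        long start = System.nanoTime();
        for (int i = 0; i < measuredIterations; i++) {
            sink = task.getAsInt();
        }
        return (double) (System.nanoTime() - start) / measuredIterations;
    }

    public static void main(String[] args) {
        double ns = nanosPerIterationAfterWarmUp(
                () -> Integer.toString(123456).hashCode(), 10_000, 50_000_000);
        System.out.printf("%.2f ns per iteration after warm-up%n", ns);
    }
}
```

If the code under test runs only a handful of times in production, skipping the warm-up may give the more honest number.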
Answer by cliff.meyers
It's totally valid as long as you measure large enough intervals of time. I would execute 20-30 runs of what you intend to test so that the total elapsed time is over 1 second. I've noticed that time calculations based off System.currentTimeMillis() tend to be either 0ms or ~30ms; I don't think you can get anything more precise than that. You may want to try out System.nanoTime() if you really need to measure a small time interval:
- documentation: http://java.sun.com/javase/6/docs/api/java/lang/System.html#nanoTime()
- SO question about measuring small time spans, since System.nanoTime() has some issues, too: How can I measure time with microsecond precision in Java?
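To see how coarse `System.currentTimeMillis()` is on a particular machine, one option is to spin until the returned value changes and record the smallest step observed. The sketch below does that and then times a single small operation with `System.nanoTime()`. The class name `ClockGranularity` and the sample count are assumptions, and the step you observe depends on the OS and JVM.

```java
// Hypothetical sketch: observe the tick size of currentTimeMillis().
final class ClockGranularity {
    private ClockGranularity() {}

    /**
     * Busy-waits for the millisecond clock to advance `samples` times and
     * returns the smallest step (in ms) seen between successive values.
     */
    static long smallestMillisStep(int samples) {
        long smallest = Long.MAX_VALUE;
        long last = System.currentTimeMillis();
        for (int i = 0; i < samples; i++) {
            long now;
            do {
                now = System.currentTimeMillis();
            } while (now <= last); // spin until the clock moves forward
            smallest = Math.min(smallest, now - last);
            last = now;
        }
        return smallest;
    }

    public static void main(String[] args) {
        System.out.println("smallest currentTimeMillis() step seen: "
                + smallestMillisStep(20) + " ms");

        // For a short interval, nanoTime() gives a finer-grained reading.
        long start = System.nanoTime();
        String s = Integer.toString(123456);
        long elapsedNanos = System.nanoTime() - start;
        System.out.println("one Integer.toString call took about "
                + elapsedNanos + " ns (" + s + ")");
    }
}
```

A single `nanoTime()` reading of one call is still noisy; it is shown only to contrast the two clocks, not as a way to benchmark.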
Answer by James Anderson
Stopwatch is actually the best benchmark!
The real end to end user response time is the time that actually matters.
It is not always possible to obtain this time using the available tools. For instance, most testing tools do not include the time it takes for a browser to render a page, so an overcomplex page with badly written CSS will show sub-second response times to the testing tools but a response time of 5 seconds or more to the user.
The tools are great for automated testing and for problem determination, but don't lose sight of what you really want to measure.
Answer by Scott Wisniewski
A profiler gives you more detailed information, which can help to diagnose and fix performance problems.
In terms of actual measurement, stopwatch time is what users notice, so if you want to validate that things are within acceptable limits, stopwatch time is fine.
When you want to actually fix problems, however, a profiler can be really helpful.
Answer by Peter Lawrey
You need to test a realistic number of iterations as you will get different answers depending on how you test the timing. If you only perform an operation once, it could be misleading to take the average of many iterations. If you want to know the time it takes after the JVM has warmed up you might run many (e.g. 10,000) iterations which are not included in the timings.
I also suggest you use System.nanoTime() as it's much more accurate. If your test time is around 10 microseconds or less, you don't want to call this too often or it can change your result. (For example, if I am testing for, say, 5 seconds and I want to know when the time is up, I only read nanoTime every 1000 iterations, provided I know an iteration is very quick.)
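The "only read the clock every 1000 iterations" idea can be sketched as a time-bounded loop that runs a batch of calls between clock reads. The class `TimedRun` and the workload are assumptions; the 5 second duration and batch size of 1000 come from the answer's example.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch: time-bounded run with infrequent clock reads.
final class TimedRun {
    // Volatile sink so the JIT cannot discard the measured work as dead code.
    static volatile int sink;

    /**
     * Runs the task for roughly `durationNanos`, reading System.nanoTime()
     * only once per `checkEvery` iterations, and returns the average
     * nanoseconds per iteration.
     */
    static double nanosPerIteration(IntSupplier task, long durationNanos, int checkEvery) {
        long iterations = 0;
        long start = System.nanoTime();
        long elapsed;
        do {
            // One batch between clock reads keeps timer overhead out of the result.
            for (int i = 0; i < checkEvery; i++) {
                sink = task.getAsInt();
            }
            iterations += checkEvery;
            elapsed = System.nanoTime() - start;
        } while (elapsed < durationNanos);
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        long fiveSeconds = 5_000_000_000L;
        double ns = nanosPerIteration(() -> Integer.toString(123456).hashCode(),
                                      fiveSeconds, 1_000);
        System.out.printf("%.2f ns per iteration%n", ns);
    }
}
```

The run can overshoot the requested duration by up to one batch, which is harmless because the average is computed from the elapsed time actually measured.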
Answer by Peter Lawrey
How do you address operating system scheduling when benchmarking?
Benchmark for long enough on a system which is representative of the machine you will be using. If your OS slows down your application, then that should be part of the result.
There is no point in saying, "My program would be faster if only I didn't have an OS."
If you are using Linux, you can use tools such as numactl, chrt and taskset to control how CPUs are used and how processes are scheduled.
Answer by Daniel Paull
Profilers can get in the way of timings, so I would use a combination: stopwatch timing to identify overall performance problems, then a profiler to work out where the time is being spent. Repeat the process as required.
Answer by jakobengblom2
I think a key question is the complexity and length of time of the operation.
I sometimes even use physical stopwatch measurements to see if something takes minutes, hours, days, or even weeks to compute (I am working with an application where run times on the order of several days are not unheard of, even if seconds and minutes are the most common time spans).
However, the automation afforded by calls to any kind of clock system on the computer, like the java millis call referred to in the linked article, is clearly superior to manually seeing how long something runs.
Profilers are nice when they work, but I have had problems applying them to our application, which usually involves dynamic code generation, dynamic loading of DLLs, and work performed in the two built-in just-in-time-compiled scripting languages of my application. Profilers quite often assume a single source language and have other unrealistic expectations for complex software.
Answer by Robert Gamble
I ran a program today that searched through and collected information from a bunch of dBase files; it took just over an hour to run. I took a look at the code, made an educated guess at what the bottleneck was, made a minor improvement to the algorithm, and re-ran the program. This time it completed in 2.5 minutes.
I didn't need any fancy profiling tools or benchmark suites to tell me the new version was a significant improvement. If I needed to further optimize the running time I probably would have done some more sophisticated analysis but this wasn't necessary. I find that this sort of "stopwatch benchmarking" is an acceptable solution in quite a number of cases and resorting to more advanced tools would actually be more time-consuming in these cases.
Answer by dkretz
After all, it's probably the second most popular form of benchmarking, right after "no-watch benchmarking" - where we say "this activity seems slow, that one seems fast."
Usually what's most important to optimize is whatever interferes with the user experience - which is most often a function of how frequently you perform the action, and whatever else is going on at the same time. Other forms of benchmarking often just help zero in on these.

