How do I get a CPU cycle count in Win32?
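In Win32, is there any way to get a unique CPU cycle count or something similar that is uniform across multiple processes, languages, systems, and so on?
I'm creating some log files, but because we host the .NET runtime I have to produce multiple log files, and I'd like to avoid calling from one side into the other just to log. So I was thinking I'd produce two files, combine them, and then sort them to get a coherent timeline that includes the cross-world calls.
However, GetTickCount does not increase on every call, so it is not reliable. Is there a better number I can use, so that the calls end up in the right order when I sort?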
Edit: Thanks to @Greg, who put me on the track of QueryPerformanceCounter, which did the trick.
Solution
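You can use the RDTSC CPU instruction (assuming x86). This instruction returns the CPU cycle counter, but be aware that it increases very quickly toward its maximum value and then resets to 0 (the full counter is 64 bits wide, so in practice this matters mainly when only the low 32 bits are kept). As the Wikipedia article mentions, you might be better off using the QueryPerformanceCounter function.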
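As a rough illustration of the QueryPerformanceCounter route, here is a minimal Python sketch that reads the counter through ctypes and turns it into a sortable log prefix. The helper names `qpc_ticks_and_frequency` and `format_log_line` are hypothetical, and the non-Windows fallback to `time.perf_counter_ns` is an assumption added only so the sketch runs elsewhere; it is not part of the original answer.

```python
import ctypes
import sys
import time


def qpc_ticks_and_frequency():
    """Return (ticks, ticks_per_second) from a high-resolution counter."""
    if sys.platform == "win32":
        kernel32 = ctypes.windll.kernel32
        ticks = ctypes.c_int64()
        freq = ctypes.c_int64()
        # Both functions take a pointer to a 64-bit LARGE_INTEGER.
        kernel32.QueryPerformanceCounter(ctypes.byref(ticks))
        kernel32.QueryPerformanceFrequency(ctypes.byref(freq))
        return ticks.value, freq.value
    # Assumption: outside Windows, fall back to Python's nanosecond counter.
    return time.perf_counter_ns(), 1_000_000_000


def format_log_line(message):
    """Prefix a message with a zero-padded microsecond timestamp."""
    ticks, freq = qpc_ticks_and_frequency()
    micros = ticks * 1_000_000 // freq
    # Zero padding makes a plain text sort agree with numeric order.
    return f"{micros:020d} {message}"
```

Because the counter is converted to microseconds with the reported frequency, any component on the same machine that reads the same counter (native or managed) could emit comparable prefixes.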
Here's an interesting article! It says not to use RDTSC, but to use QueryPerformanceCounter instead.
Conclusion:
Using regular old timeGetTime() to do timing is not reliable on many Windows-based operating systems because the granularity of the system timer can be as high as 10-15 milliseconds, meaning that timeGetTime() is only accurate to 10-15 milliseconds. [Note that the high granularities occur on NT-based operating systems like Windows NT, 2000, and XP. Windows 95 and 98 tend to have much better granularity, around 1-5 ms.] However, if you call timeBeginPeriod(1) at the beginning of your program (and timeEndPeriod(1) at the end), timeGetTime() will usually become accurate to 1-2 milliseconds, and will provide you with extremely accurate timing information. Sleep() behaves similarly; the length of time that Sleep() actually sleeps for goes hand-in-hand with the granularity of timeGetTime(), so after calling timeBeginPeriod(1) once, Sleep(1) will actually sleep for 1-2 milliseconds, Sleep(2) for 2-3, and so on (instead of sleeping in increments as high as 10-15 ms).
For higher precision timing (sub-millisecond accuracy), you'll probably want to avoid using the assembly mnemonic RDTSC because it is hard to calibrate; instead, use QueryPerformanceFrequency and QueryPerformanceCounter, which are accurate to less than 10 microseconds (0.00001 seconds). For simple timing, both timeGetTime and QueryPerformanceCounter work well, and QueryPerformanceCounter is obviously more accurate.
However, if you need to do any kind of "timed pauses" (such as those necessary for framerate limiting), you need to be careful of sitting in a loop calling QueryPerformanceCounter, waiting for it to reach a certain value; this will eat up 100% of your processor. Instead, consider a hybrid scheme, where you call Sleep(1) (don't forget timeBeginPeriod(1) first!) whenever you need to pass more than 1 ms of time, and then only enter the QueryPerformanceCounter 100%-busy loop to finish off the last < 1/1000th of a second of the delay you need. This will give you ultra-accurate delays (accurate to 10 microseconds), with very minimal CPU usage. See the code above [in the article].
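The article's own code is not reproduced here. As a stand-in, the hybrid scheme it describes might look roughly like the following Python sketch: sleep in about 1 ms steps while plenty of time remains, then busy-wait on a high-resolution counter for the final stretch. The function name `hybrid_delay` and the `spin_window` parameter are hypothetical, and the timeBeginPeriod(1) call the article recommends on Windows is not shown.

```python
import time


def hybrid_delay(seconds, spin_window=0.002):
    """Wait roughly `seconds`: sleep first, spin only for the last stretch."""
    deadline = time.perf_counter() + seconds
    while True:
        remaining = deadline - time.perf_counter()
        if remaining <= spin_window:
            break
        # Coarse phase: give up the CPU in roughly 1 ms steps.
        time.sleep(0.001)
    # Fine phase: busy-wait on the high-resolution counter.
    while time.perf_counter() < deadline:
        pass
```

If the sleep granularity on a given system is coarser than `spin_window`, a single sleep can overshoot the deadline, so the window would need to be at least as large as the worst-case oversleep.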
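You can use GetTickCount and add another counter when you merge the log files. It won't give you a perfect ordering between the different log files, but it will at least keep all the entries from each file in the correct order.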
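One possible reading of "add another counter" is sketched below in Python: each entry gets its position within its own file as a tie-breaker, so entries that share a tick value keep their per-file order after sorting. The function name `merge_logs` and the `(tick, message)` entry shape are assumptions made for illustration, and the sketch assumes ticks never decrease within a file.

```python
def merge_logs(*logs):
    """Merge per-file lists of (tick, message) into one ordered list."""
    keyed = []
    for file_index, entries in enumerate(logs):
        for sequence, (tick, message) in enumerate(entries):
            # `sequence` is the extra counter: it preserves the original
            # order of entries that share a tick within one file.
            keyed.append((tick, file_index, sequence, message))
    # Sort by tick first, then by file, then by position in that file.
    keyed.sort(key=lambda item: item[:3])
    return [(tick, message) for tick, _, _, message in keyed]


native_log = [(100, "native: call managed"), (100, "native: waiting"),
              (116, "native: returned")]
managed_log = [(100, "managed: enter"), (116, "managed: exit")]
merged = merge_logs(native_log, managed_log)
```

With this sample data, "native: returned" sorts ahead of "managed: exit" even though the managed call must have exited first, which is exactly the imperfect cross-file ordering the answer warns about.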
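System.Diagnostics.Stopwatch.GetTimestamp() returns the current tick count of the underlying timer (the high-resolution performance counter when one is available) since some time origin (maybe when the computer started, but I'm not sure), and I've never seen it fail to increase between two calls.
The counter values are specific to each computer, so you can't use them to merge log files from two computers.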
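RDTSC output may depend on the current core's clock frequency, which on modern CPUs is neither constant nor, in a multicore machine, consistent across cores.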
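Use the system time, and if you are dealing with feeds from multiple systems, use an NTP time source. You can get reliable, consistent time readings that way. If the overhead is too much for your purposes, using the HPET to work out the time elapsed since the last known reliable time reading is better than using the HPET alone.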
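The "last reliable reading plus elapsed time" idea can be sketched as follows in Python. This is only an illustration under stated assumptions: the class name `AnchoredClock` is hypothetical, `time.time` stands in for the NTP-disciplined system clock, and `time.perf_counter` stands in for the HPET, which Python does not expose directly (whether the OS counter is HPET-backed depends on the system).

```python
import time


class AnchoredClock:
    """Wall-clock anchor plus elapsed time from a monotonic counter."""

    def __init__(self, wall_time=time.time, counter=time.perf_counter):
        self._wall_time = wall_time
        self._counter = counter
        self.resync()

    def resync(self):
        # Take a fresh "known reliable" reading, e.g. after a time sync.
        self._anchor_wall = self._wall_time()
        self._anchor_counter = self._counter()

    def now(self):
        # Anchor plus elapsed counter time; no system-time call per entry.
        return self._anchor_wall + (self._counter() - self._anchor_counter)
```

Calling `resync()` periodically bounds the drift between the counter and the system clock, at the cost of a possible small jump in the reported time at each resync.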