Original question: http://stackoverflow.com/questions/15235218/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
C++ , Timer, Milliseconds
Asked by Przmak
#include <iostream>
#include <conio.h>
#include <ctime>

using namespace std;

double diffclock(clock_t clock1, clock_t clock2)
{
    double diffticks = clock1 - clock2;
    double diffms = diffticks / (CLOCKS_PER_SEC / 1000);
    return diffms;
}

int main()
{
    clock_t start = clock();
    for (int i = 0;; i++)
    {
        if (i == 10000) break;
    }
    clock_t end = clock();
    cout << diffclock(start, end) << endl;
    getch();
    return 0;
}
So my problem is that it returns 0. To be straight, I want to check how much time my program takes to run... I found tons of advice on the internet, but mostly it comes down to the same point of getting 0 because the start and the end are the same.

This question is about C++, remember :<
Answered by
There are a few problems here. The first is that you obviously switched the start and stop times when passing them to the diffclock() function. The second problem is optimization. Any reasonably smart compiler with optimizations enabled would simply throw the entire loop away, since it does not have any side effects. But even if you fix the above problems, the program would most likely still print 0. If you imagine doing billions of operations per second, and throw in sophisticated out-of-order execution, branch prediction, and tons of other technologies employed by modern CPUs, even the CPU itself may effectively optimize your loop away. But even if it doesn't, you'd need a lot more than 10K iterations to make it run longer. You'd probably need your program to run for a second or two in order for clock() to reflect anything.
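As a minimal sketch of those two fixes, keeping the original diffclock() conversion as-is (the sink variable name is just an illustrative choice, and volatile is used only so the compiler cannot discard the loop):

#include <iostream>
#include <ctime>

// Same conversion as in the question; only the call order below changes.
double diffclock(clock_t clock1, clock_t clock2)
{
    double diffticks = clock1 - clock2;
    double diffms = diffticks / (CLOCKS_PER_SEC / 1000);
    return diffms;
}

int main()
{
    clock_t start = clock();
    volatile long sink = 0;          // volatile side effect keeps the loop alive under -O2
    for (int i = 0; i < 10000; i++)
        sink = sink + i;
    clock_t end = clock();
    std::cout << diffclock(end, start) << " ms" << std::endl;   // later reading first, so the result is non-negative
    return 0;
}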
But the most important problem is clock() itself. That function is not suitable for any kind of performance measurement whatsoever. What it does is give you an approximation of the processor time used by the program. Aside from the vague nature of the approximation method that might be used by any given implementation (the standard doesn't require anything specific), the POSIX standard also requires CLOCKS_PER_SEC to be equal to 1000000, independent of the actual resolution. In other words, it doesn't matter how precise the clock is, and it doesn't matter at what frequency your CPU is running. To put it simply, it is a totally useless number and therefore a totally useless function. The only reason it still exists is probably historical. So, please do not use it.
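As a quick way to see this on your own system, here is a small sketch (assuming clock() is actually available) that prints CLOCKS_PER_SEC and the smallest step that clock() actually reports, which is usually far coarser than the nominal 1 microsecond:

#include <ctime>
#include <iostream>

int main()
{
    std::cout << "CLOCKS_PER_SEC = " << CLOCKS_PER_SEC << '\n';

    // Busy-wait until clock() reports a new value to observe its real granularity.
    clock_t first = clock();
    clock_t next = first;
    while (next == first)
        next = clock();

    std::cout << "Smallest observed step: " << (next - first) << " ticks\n";
    return 0;
}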
To achieve what you are looking for, people used to read the CPU Time Stamp Counter, also known as "RDTSC" after the name of the corresponding CPU instruction used to read it (a minimal sketch of reading it appears after the list below). These days, however, this is also mostly useless because:

- Modern operating systems can easily migrate the program from one CPU to another. You can imagine that reading the time stamp on one CPU after running for a second on another doesn't make a lot of sense. It is only in the latest Intel CPUs that the counter is synchronized across CPU cores. All in all, it is still possible to do this, but a lot of extra care must be taken (i.e. one can set up the affinity for the process, etc.).
- Measuring CPU instructions of the program oftentimes does not give an accurate picture of how much time it is actually using. This is because in real programs there could be some system calls where the work is performed by the OS kernel on behalf of the process. In that case, that time is not included.
- It could also happen that the OS suspends the execution of the process for a long time. And even though it took only a few instructions to execute, to the user it seemed like a second. So such a performance measurement may be useless.
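Here is the sketch mentioned above, using the __rdtsc() compiler intrinsic (an assumption: x86/x86-64 with GCC or Clang and <x86intrin.h>; MSVC exposes the same intrinsic via <intrin.h>). The caveats from the list still apply, and the raw tick count is only meaningful relative to the CPU's TSC frequency:

#include <x86intrin.h>   // __rdtsc() on GCC/Clang; use <intrin.h> on MSVC
#include <iostream>

int main()
{
    unsigned long long before = __rdtsc();

    volatile int i = 0;              // volatile so the loop is not removed
    while (i < 10000)
        i = i + 1;

    unsigned long long after = __rdtsc();
    std::cout << "Loop took roughly " << (after - before) << " TSC ticks\n";
    return 0;
}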
So what to do?

When it comes to profiling, a tool like perf must be used. It can track CPU clocks, cache misses, branches taken, branches missed, the number of times the process was moved from one CPU to another, and so on. It can be used as a standalone tool, or it can be embedded into your application (something like PAPI).
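For example, a typical standalone invocation looks like the following (assuming perf is installed and your program is built as ./test, as in the compile line further below); the exact set of counters reported depends on your kernel and hardware:

$ perf stat ./test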
And if the question is about actual time spent, people use a wall clock. Preferably a high-precision one that is also not subject to NTP adjustments (i.e. monotonic). That shows exactly how much time elapsed, no matter what was going on. For that purpose clock_gettime() can be used. It is part of SUSv2 and the POSIX.1-2001 standard. Given that you use getch() to keep the terminal open, I'd assume you are using Windows. There, unfortunately, you don't have clock_gettime(), and the closest thing would be the performance counters API:
BOOL QueryPerformanceFrequency(LARGE_INTEGER *lpFrequency);
BOOL QueryPerformanceCounter(LARGE_INTEGER *lpPerformanceCount);
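As a minimal, Windows-only sketch of how these two calls are typically combined to measure elapsed wall-clock time in milliseconds (assuming <windows.h> is available):

#include <windows.h>
#include <iostream>

int main()
{
    LARGE_INTEGER frequency, begin, finish;
    QueryPerformanceFrequency(&frequency);   // counter ticks per second

    QueryPerformanceCounter(&begin);
    volatile int i = 0;                      // volatile so the loop is not removed
    while (i < 10000)
        i = i + 1;
    QueryPerformanceCounter(&finish);

    double elapsedMs = (finish.QuadPart - begin.QuadPart) * 1000.0 / frequency.QuadPart;
    std::cout << "Elapsed: " << elapsedMs << " ms\n";
    return 0;
}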
For a portable solution, the best bet is std::chrono::high_resolution_clock. It was introduced in C++11, but is supported by most industrial-grade compilers (GCC, Clang, MSVC).
Below is an example of how to use it. Please note that since I know my CPU will do 10000 increments of an integer much faster than a millisecond, I have changed the output to microseconds. I've also declared the counter as volatile in the hope that the compiler won't optimize it away.
#include <ctime>
#include <chrono>
#include <iostream>

int main()
{
    volatile int i = 0; // "volatile" is to ask the compiler not to optimize the loop away.
    auto start = std::chrono::steady_clock::now();
    while (i < 10000) {
        ++i;
    }
    auto end = std::chrono::steady_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
    std::cout << "It took me " << elapsed.count() << " microseconds." << std::endl;
}
When I compile and run it, it prints:
$ g++ -std=c++11 -Wall -o test ./test.cpp && ./test
It took me 23 microseconds.
Hope it helps. Good Luck!
Answered by
At a glance, it seems like you are subtracting the larger value from the smaller value. You call:
diffclock( start, end );
But then diffclock is defined as:
double diffclock( clock_t clock1, clock_t clock2 ) {
    double diffticks = clock1 - clock2;
    double diffms = diffticks / ( CLOCKS_PER_SEC / 1000 );
    return diffms;
}
Apart from that, it may have something to do with the way you are converting units. The use of 1000 to convert to milliseconds is different on this page:
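As a side note, here is a sketch of a conversion that avoids the integer division in CLOCKS_PER_SEC / 1000 (which loses precision and breaks outright if CLOCKS_PER_SEC is smaller than 1000 or not a multiple of it): multiply first and divide in floating point, and pass the later reading as the first argument:

double diffclock( clock_t clock1, clock_t clock2 ) {
    // clock1 is the later reading, clock2 the earlier one.
    double diffticks = clock1 - clock2;
    double diffms = diffticks * 1000.0 / CLOCKS_PER_SEC;
    return diffms;
}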
Answered by The_Sympathizer
The problem appears to be that the loop is just too short. I tried it on my system and it gave 0 ticks. I checked what diffticks was and it was 0. Increasing the loop size to 100000000 produced a noticeable time lag, and I got -290 as output (a bug -- I think diffticks should be clock2 - clock1, so we should get 290 and not -290). I also tried changing "1000" to "1000.0" in the division, and that didn't work.
Compiling with optimization does remove the loop, so you either have to avoid optimization or make the loop "do something", e.g. increment a counter other than the loop counter in the loop body. At least that's what GCC does.
Answered by Slava
First of all, you should subtract end - start, not vice versa.

The documentation says that clock() returns -1 if the value is not available; did you check that?

What optimization level do you use when compiling your program? If optimization is enabled, the compiler can effectively eliminate your loop entirely.
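As a small sketch of that error check (the C standard uses (clock_t)-1 as the "not available" value):

#include <ctime>
#include <iostream>

int main()
{
    clock_t start = clock();
    if (start == (clock_t)-1) {
        std::cerr << "clock() is not available on this system\n";
        return 1;
    }
    std::cout << "clock() is available, current reading: " << start << " ticks\n";
    return 0;
}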