Write a C program to measure time spent in context switch in Linux OS

Disclaimer: this page is a translation of a popular StackOverflow Q&A, provided under the CC BY-SA 4.0 license. If you use or share it, you must keep the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/2368384/



Tags: c, linux, context-switch

Asked by Gautham

Can we write a C program to find out the time spent in a context switch in Linux? Could you please share the code if you have it? Thanks


Accepted answer by Tronic

Profiling the switching time is very difficult, but the in-kernel latency profiling tools, as well as oprofile (which can profile the kernel itself) will help you there.


For benchmarking the interactive application performance, I have written a small tool called latencybench that measures unexpected latency spikes:


// Compile with g++ latencybench.cc -o latencybench -lboost_thread-mt
// Should also work on MSVC and other platforms supported by Boost.

#include <boost/format.hpp>
#include <boost/lexical_cast.hpp>
#include <boost/thread/thread.hpp>
#include <boost/date_time.hpp>
#include <algorithm>
#include <csignal>
#include <cstdlib>
#include <iostream>
#include <map>
#include <string>

volatile bool m_quit = false;

extern "C" void sighandler(int) {
    m_quit = true;
}

std::string num(unsigned val) {
    if (val == 1) return "one occurrence";
    return boost::lexical_cast<std::string>(val) + " occurrences";
}

int main(int argc, char** argv) {
    using namespace boost::posix_time;
    std::signal(SIGINT, sighandler);
    std::signal(SIGTERM, sighandler);
    time_duration duration = milliseconds(10);
    if (argc > 1) {
        try {
            if (argc != 2) throw 1;
            unsigned ms = boost::lexical_cast<unsigned>(argv[1]);
            if (ms > 1000) throw 2;
            duration = milliseconds(ms);
        } catch (...) {
            std::cerr << "Usage: " << argv[0] << " milliseconds" << std::endl;
            return EXIT_FAILURE;
        }
    }
    typedef std::map<long, unsigned> Durations;
    Durations durations;
    unsigned samples = 0, wrongsamples = 0;
    unsigned max = 0;
    long last = -1;
    std::cout << "Measuring actual sleep delays when requesting " << duration.total_milliseconds() << " ms: (Ctrl+C when done)" << std::endl;
    ptime begin = boost::get_system_time();
    while (!m_quit) {
        ptime start = boost::get_system_time();
        boost::this_thread::sleep(start + duration);
        long actual = (boost::get_system_time() - start).total_milliseconds();
        ++samples;
        unsigned num = ++durations[actual];
        if (actual != last) {
            std::cout << "\r  " << actual << " ms " << std::flush;
            last = actual;
        }
        if (actual != duration.total_milliseconds()) {
            ++wrongsamples;
            if (num > max) max = num;
            std::cout << "spike at " << start - begin << std::endl;
            last = -1;
        }
    }
    if (samples == 0) return 0;
    std::cout << "\rTotal measurement duration:  " << boost::get_system_time() - begin << "\n";
    std::cout << "Number of samples collected: " << samples << "\n";
    std::cout << "Incorrect delay count:       " << wrongsamples << boost::format(" (%.2f %%)") % (100.0 * wrongsamples / samples) << "\n\n";
    std::cout << "Histogram of actual delays:\n\n";
    unsigned correctsamples = samples - wrongsamples;
    const unsigned line = 60;
    double scale = 1.0;
    char ch = '+';
    if (max > line) {
        scale = double(line) / max;
        ch = '*';
    }
    double correctscale = 1.0;
    if (correctsamples > line) correctscale = double(line) / correctsamples;
    for (Durations::const_iterator it = durations.begin(); it != durations.end(); ++it) {
        std::string bar;
        if (it->first == duration.total_milliseconds()) bar = std::string(correctscale * it->second, '>');
        else bar = std::string(scale * it->second, ch);
        std::cout << boost::format("%5d ms | %s %d") % it->first % bar % it->second << std::endl;
    }
    std::cout << "\n";
    std::string indent(30, ' ');
    std::cout << indent << "+-- Legend ----------------------------------\n";
    std::cout << indent << "|  >  " << num(1.0 / correctscale) << " (of " << duration.total_milliseconds() << " ms delay)\n";
    if (wrongsamples > 0) std::cout << indent << "|  " << ch << "  " << num(1.0 / scale) << " (of any other delay)\n";
}

Results on Ubuntu with the 2.6.32-14-generic kernel. While measuring, I was compiling C++ code on four cores and playing a game with OpenGL graphics at the same time (to make it more interesting):


Total measurement duration:  00:01:45.191465
Number of samples collected: 10383
Incorrect delay count:       196 (1.89 %)

Histogram of actual delays:

   10 ms | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 10187
   11 ms | *************************************************** 70
   12 ms | ************************************************************ 82
   13 ms | ********* 13
   14 ms | ********* 13
   15 ms | ** 4
   17 ms | *** 5
   18 ms | * 2
   19 ms | **** 6
   20 ms |  1

                              +-- Legend ----------------------------------
                              |  >  169 occurrences (of 10 ms delay)
                              |  *  one occurrence (of any other delay)

With rt-patched kernels I get much better results, pretty much 10-12 ms only.


The legend in the printout appears to be suffering from a rounding error or something (and the source code pasted is not exactly the same version). I never really polished this application for a release...


Answer by Nathan Kitchen

If you have superuser privileges, you can run a SystemTap program with probe points for context switches and print the current time at each one:


probe scheduler.ctxswitch {
    printf("Switch from %d to %d at %d\n", prev_pid, next_pid, gettimeofday_us())
}
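
To run it (assuming the probe above is saved as ctxswitch.stp), invoke the SystemTap front end as root, e.g. "sudo stap ctxswitch.stp": stap compiles the script into a kernel module, loads it, and streams the printf output until you stop it with Ctrl+C.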

I'm not sure how reliable the output data is, but it's a quick and easy way to get some numbers.


Answer by Amir Naghizadeh

Do you really think you can measure context switching in seconds, milliseconds, or even microseconds? It all happens in less than a nanosecond. If you want context switches that take enough time to be measured, then... try some real-mode, kernel-type code written in assembly; you might see something.


Answer by Nikolai Fetissov

Short answer: no. Long answer below.


Roughly speaking, a context switch happens when one of the following occurs:


  1. A user process enters the kernel via a system call or a trap (e.g. a page fault) and the requested data (e.g. file contents) is not yet available, so the kernel puts said user process into a sleep state and switches to another runnable process.
  2. The kernel detects that the given user process consumed its full time quantum (this happens in code invoked from the timer interrupt).
  3. Data becomes available for a sleeping process whose priority is higher than the currently running one (this happens in code invoked from/around IO interrupts).

The switch itself is one-way, so the best we can do in userland (I assume that's what you are asking) is to measure a sort of RTT, from our process to another and back. The other process also takes time to do its work. We can of course make two or more processes cooperate on this, but the thing is that the kernel doesn't guarantee that one of our processes is going to be picked next. It's probably possible to predictably switch to a given process with the RT scheduler, but I have no advice here; suggestions welcome.

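To make the RTT idea concrete, here is a minimal sketch (my own illustration, not from the original answer; the pipe names and iteration count are arbitrary): two processes bounce a single byte through a pair of pipes, and half the average round trip gives a rough upper bound on one context switch plus pipe overhead. As the next answer explains, both processes should also be pinned to the same CPU for the numbers to mean much.

#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/wait.h>

#define ITERATIONS 100000

int main(void) {
    int ping[2], pong[2];               /* parent->child and child->parent pipes */
    char buf = 'x';
    struct timespec start, end;

    if (pipe(ping) == -1 || pipe(pong) == -1) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == -1) { perror("fork"); return 1; }

    if (pid == 0) {                     /* child: echo every byte straight back */
        for (int i = 0; i < ITERATIONS; i++) {
            if (read(ping[0], &buf, 1) != 1) break;
            write(pong[1], &buf, 1);
        }
        _exit(0);
    }

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < ITERATIONS; i++) {  /* parent: send a byte, wait for the echo */
        write(ping[1], &buf, 1);
        read(pong[0], &buf, 1);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    wait(NULL);

    double ns = (end.tv_sec - start.tv_sec) * 1e9 + (end.tv_nsec - start.tv_nsec);
    /* each round trip contains at least two switches: parent->child and back */
    printf("rough upper bound per switch: %.0f ns\n", ns / (2.0 * ITERATIONS));
    return 0;
}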

Answer by Sumit Gemini

Measuring the cost of a context switch is a little trickier. We can compute the time spent in a context switch by running two processes on a single CPU and setting up three Linux pipes between them:


  • two pipes for passing strings back and forth between the processes, and
  • a third pipe used to send the time spent in the child process back to the parent.

The first process then issues a write to the first pipe, and waits for a read on the second; upon seeing the first process waiting for something to read from the second pipe, the OS puts the first process in the blocked state, and switches to the other process, which reads from the first pipe and then writes to the second. When the second process tries to read from the first pipe again, it blocks, and thus the back-and-forth cycle of communication continues. By measuring the cost of communicating like this repeatedly, you can make a good estimate of the cost of a context switch.


One difficulty in measuring context-switch cost arises in systems with more than one CPU; what you need to do on such a system is ensure that your context-switching processes are located on the same processor. Fortunately, most operating systems have calls to bind a process to a particular processor; on Linux, for example, the sched_setaffinity() call is what you're looking for. By ensuring both processes are on the same processor, you are making sure to measure the cost of the OS stopping one process and restoring another on the same CPU.


Here I'm posting my solution for computing the context-switch time between processes.


#define _GNU_SOURCE
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>
#include <linux/unistd.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <sys/syscall.h>
#include <errno.h>

/* Invoke the getpid syscall directly, bypassing any PID caching in libc. */
pid_t getpid( void )
{
    return syscall( __NR_getpid );
}

int main()
{
    /*********************************************************************************************
        To make sure the context-switching processes stay on the same processor:
        1. Bind the process to a particular processor using sched_setaffinity().
        2. Get the maximum priority value (sched_get_priority_max) that can be used with
           the SCHED_FIFO scheduling policy, and make this process run with it.
     *********************************************************************************************/

    cpu_set_t set;
    struct sched_param prio_param;
    int prio_max;

    CPU_ZERO( &set );
    CPU_SET( 0, &set );
    memset( &prio_param, 0, sizeof(struct sched_param) );

    if (sched_setaffinity( getpid(), sizeof( cpu_set_t ), &set ))
    {
        perror( "sched_setaffinity" );
        exit(EXIT_FAILURE);
    }

    if( (prio_max = sched_get_priority_max(SCHED_FIFO)) < 0 )
    {
        perror("sched_get_priority_max");
    }

    prio_param.sched_priority = prio_max;
    if( sched_setscheduler(getpid(),SCHED_FIFO,&prio_param) < 0 )
    {
        perror("sched_setscheduler");
        exit(EXIT_FAILURE);
    }

    /*****************************************************************************************************
        1. Fork a child; parent and child use two pipes to read and write strings,
           forcing a context switch on every read/write pair.
        2. The parent takes the start timestamp (gettimeofday) and writes to the first pipe;
           the child reads it and replies on the second pipe, which the parent reads.
           After its last iteration the child takes the end timestamp and sends it to the
           parent through the third pipe. The difference between the two timestamps is
           roughly the time of n * 2 context switches.
    *******************************************************************************************************/

    int     ret = -1;
    int     firstpipe[2];
    int     secondpipe[2];
    int     timepipe[2];
    int     nbytes;
    char    string[] = "Hello, world!\n";
    char    temp[] = "Sumit Gemini!\n";
    char    readbuffer[80];
    char    tempbuffer[80];
    struct  timeval start, end;

    // Create the first unnamed pipe
    if (pipe(firstpipe) == -1)
    {
        fprintf(stderr, "parent: Failed to create first pipe\n");
        return -1;
    }

    // Create the second unnamed pipe
    if (pipe(secondpipe) == -1)
    {
        fprintf(stderr, "parent: Failed to create second pipe\n");
        return -1;
    }

    // Create the time pipe, used by the child to report its end timestamp
    if (pipe(timepipe) == -1)
    {
        fprintf(stderr, "parent: Failed to create time pipe\n");
        return -1;
    }


    if ((ret = fork()) == -1)
        perror("fork");
    else if (ret == 0)
    {
        int n = -1;
        printf("Child  ----> %d\n", getpid());

        for (n = 0; n < 5; n++)
        {
            nbytes = read(firstpipe[0], readbuffer, sizeof(readbuffer));
            printf("Received string: %s", readbuffer);
            write(secondpipe[1], temp, strlen(temp)+1);
        }

        // Take the end timestamp and hand it to the parent via the time pipe
        gettimeofday(&end, 0);
        n = sizeof(struct timeval);

        if (write(timepipe[1], &end, sizeof(struct timeval)) != n)
        {
            fprintf(stderr, "child: Failed to write in time pipe\n");
            exit(EXIT_FAILURE);
        }
    }
    else
    {
        double switch_time;
        int n = -1;
        printf("Parent  ----> %d\n", getpid());
        gettimeofday(&start, 0);

        /* Exchange strings with the child through the pipes */
        for (n = 0; n < 5; n++)
        {
            write(firstpipe[1], string, strlen(string)+1);
            read(secondpipe[0], tempbuffer, sizeof(tempbuffer));
            printf("Received temp: %s", tempbuffer);
        }

        n = sizeof(struct timeval);
        if (read(timepipe[0], &end, sizeof(struct timeval)) != n)
        {
            fprintf(stderr, "Parent: Failed to read from time pipe\n");
            exit(EXIT_FAILURE);
        }

        wait(NULL);

        // 5 round trips = roughly 5 * 2 context switches
        switch_time = ((end.tv_sec-start.tv_sec)*1000000+(end.tv_usec-start.tv_usec))/1000.0;
        printf("context switch between two processes: %0.6lfms\n", switch_time/(5*2));
    }

    return 0;
}
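
A usage note (not part of the original answer): since the program calls sched_setscheduler() with SCHED_FIFO, it needs root privileges (or the CAP_SYS_NICE capability) to succeed; otherwise the call fails with EPERM and the program exits. Compiling with something like gcc ctxswitch.c -o ctxswitch (the file name is arbitrary) and running the binary as root should work.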

Answer by asanoki

Why not just use this one as a rough estimation?


#include <ctime>
#include <cstdio>
#include <sys/time.h>
#include <unistd.h>

int main(int argc, char **argv) {
        struct timeval tv, tvt;
        int diff;
        gettimeofday(&tv, 0);
        if (fork() != 0) {
                gettimeofday(&tvt, 0);
                diff = tvt.tv_usec - tv.tv_usec;
                printf("%d\n", diff);
        }
        return 0;
}

Note: actually we shouldn't pass NULL as the second argument; check man gettimeofday. Also, we should check whether tvt.tv_usec > tv.tv_usec! This is just a draft.
