Original question: http://stackoverflow.com/questions/7439608/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Steps in Context Switching
Asked by raphnguyen
I am asked to describe the steps involved in a context switch (1) between two different processes and (2) between two different threads in the same process.
- During a context switch, the kernel will save the context of the old process in its PCB and then load the saved context of the new process scheduled to run.
- Context switching between two different threads in the same process can be scheduled by the operating system so that they appear to execute in parallel, and is thus usually faster than context switches between two different processes.
Is this too general? What would you add to explain the process more clearly?
Answered by Martin James
It's much easier to explain those in reverse order because a process-switch always involves a thread-switch.
A typical thread context switch on a single-core CPU happens like this:
All context switches are initiated by an 'interrupt'. This could be an actual hardware interrupt that runs a driver (e.g. from a network card, keyboard, memory-management or timer hardware), or a software call (system call) that performs a hardware-interrupt-like call sequence to enter the OS. In the case of a driver interrupt, the OS provides an entry point that the driver can call instead of performing the 'normal' direct interrupt-return, which allows a driver to exit via the OS scheduler if it needs the OS to set a thread ready (e.g. it has signaled a semaphore).
Non-trivial systems will have to initiate a hardware-protection-level change to enter a kernel-state so that the kernel code/data etc. can be accessed.
Core state for the interrupted thread has to be saved. On a simple embedded system, this might just be pushing all registers onto the thread stack and saving the stack pointer in its Thread Control Block (TCB).
Many systems switch to an OS-dedicated stack at this stage so that the bulk of OS-internal stack requirements are not inflicted on the stack of every thread.
It may be necessary to mark the thread stack position where the change to interrupt-state occurred to allow for nested interrupts.
The driver/system call runs and may change the set of ready threads by adding/removing TCBs from internal queues for the different thread priorities. For example, a network card driver may have set an event or signaled a semaphore that another thread was waiting on, so that thread will be added to the ready set, or a running thread may have called sleep() and so elected to remove itself from the ready set.
The OS scheduler algorithm is run to decide which thread to run next, typically the highest-priority ready thread that is at the front of the queue for that priority. If the next-to-run thread belongs to a different process from the previously-run thread, some extra work is needed here (see the process context switch below).
The saved stack pointer from the TCB for that thread is retrieved and loaded into the hardware stack pointer.
The core state for the selected thread is restored. On my simple system, the registers would be popped from the stack of the selected thread. More complex systems will have to handle a return to user-level protection.
An interrupt-return is performed, so transferring execution to the selected thread.
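The steps above can be sketched as a toy simulation. This is only an illustration of the 'simple embedded system' case, not real kernel code: the `TCB` and `CPU` classes, the three-register set and all function names are assumptions made up for the example.

```python
from dataclasses import dataclass, field

REGISTERS = ("r0", "r1", "pc")  # hypothetical, tiny register set


@dataclass
class TCB:
    """Hypothetical thread control block: keeps only the saved stack pointer."""
    name: str
    priority: int
    stack: list = field(default_factory=list)
    saved_sp: int = 0


class CPU:
    def __init__(self):
        self.regs = {name: 0 for name in REGISTERS}
        self.current = None


def make_thread(name, priority, pc):
    # A new thread starts with an initial register frame already on its stack.
    tcb = TCB(name, priority, stack=[0, 0, pc])
    tcb.saved_sp = len(tcb.stack)
    return tcb


def save_state(cpu, tcb):
    # Push all registers onto the thread stack, then save the stack pointer in the TCB.
    for name in REGISTERS:
        tcb.stack.append(cpu.regs[name])
    tcb.saved_sp = len(tcb.stack)


def restore_state(cpu, tcb):
    # Load the stack pointer from the TCB and pop the registers in reverse order.
    del tcb.stack[tcb.saved_sp:]
    for name in reversed(REGISTERS):
        cpu.regs[name] = tcb.stack.pop()
    cpu.current = tcb


def pick_next(ready_queues):
    # Highest-priority non-empty queue wins; take the thread at its front.
    highest = max(p for p, queue in ready_queues.items() if queue)
    return ready_queues[highest].pop(0)


def context_switch(cpu, ready_queues, still_ready=True):
    old = cpu.current
    save_state(cpu, old)
    if still_ready:  # False models a thread that called sleep()
        ready_queues.setdefault(old.priority, []).append(old)
    restore_state(cpu, pick_next(ready_queues))


a = make_thread("A", priority=1, pc=100)
b = make_thread("B", priority=2, pc=200)
cpu = CPU()
restore_state(cpu, a)                 # A is running
cpu.regs.update(r0=42, pc=104)        # A does some work
ready = {2: [b]}                      # B has just been made ready by a driver
context_switch(cpu, ready)            # B has higher priority, so it runs
running_after_first = cpu.current.name
context_switch(cpu, ready, still_ready=False)  # B sleeps, A resumes
```

After the second switch, A continues with exactly the register values it had when it was interrupted, which is the whole point of saving and restoring the core state.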
In the case of a multicore CPU, things are more complex. The scheduler may decide that a thread that is currently running on another core needs to be stopped and replaced by a thread that has just become ready. It can do this by using its interprocessor driver to hardware-interrupt the core running the thread that has to be stopped. The complexities of this operation, on top of all the other stuff, are a good reason to avoid writing OS kernels :)
A typical process context switch happens like this:
Process context switches are initiated by a thread context switch, so all of the thread-switch steps above need to happen.
At the scheduling step above, the scheduler decides to run a thread belonging to a different process from the one that owned the previously-running thread.
The memory-management hardware has to be loaded with the address space for the new process, i.e. whatever selectors, segments or flags allow the threads of the new process to access its memory.
The context of any FPU hardware needs to be saved to and restored from the PCB.
There may be other process-dedicated hardware that needs to be saved/restored.
On any real system, the mechanisms are architecture-dependent and the above is a rough and incomplete guide to the implications of either context switch. There are other overheads generated by a process switch that are not strictly part of the switch: there may be extra cache flushes and page faults after a process switch, since some of the new process's memory may have been paged out in favour of pages belonging to the process that owned the previously running thread.
Answered by ZarathustrA
I hope that I can provide a more detailed/clear picture.
First of all, the OS schedules threads, not processes, because threads are the only executable units in the system. Process switch is just a thread switch where the threads belong to different processes, and therefore the procedure is basically the same.
The scheduler is invoked. There are three basic scenarios in which this may happen:
- Involuntary switch. Some external event affecting scheduling has occurred outside the currently running thread. For example, an expired timer has woken up a thread with a high priority; or the disk controller has reported that the requested part of a file has been read into the memory and the thread waiting for it can continue its execution; or the system timer has told the kernel that your thread ran out of its time quantum; and so on.
- Voluntary switch. The thread explicitly requests rescheduling through a system call. For example, it may have requested to yield the CPU to some other thread, to be put to sleep, or to wait until a mutex is released.
- Semi-voluntary switch. The thread implicitly triggered rescheduling by performing some unrelated system call. For example, it asked to read a file. The OS has forwarded this request to the disk controller, and not to waste time by having the calling thread busy waiting, it decided to switch to another thread.
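On Unix-like systems the kernel's own count of voluntary and involuntary switches for a process can be observed from user space. The sketch below uses Python's standard `resource.getrusage`, whose result exposes `ru_nvcsw` (voluntary) and `ru_nivcsw` (involuntary) counters. It assumes a Unix platform that actually maintains these counters; the helper name `switch_counts` is made up for the example.

```python
import resource
import time


def switch_counts():
    """Return (voluntary, involuntary) context switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


before_vol, before_invol = switch_counts()
for _ in range(5):
    time.sleep(0.01)  # blocking call: the thread gives up the CPU itself
after_vol, after_invol = switch_counts()

print("voluntary switches while sleeping:  ", after_vol - before_vol)
print("involuntary switches while sleeping:", after_invol - before_invol)
```

The exact numbers depend on the operating system and on what else is running, so no particular output should be expected; typically the sleeps show up in the voluntary counter.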
In all cases, to be able to perform a context switch, control should be passed to the kernel. In the case of involuntary switches, this is performed by an interrupt. In the case of voluntary (and semi-voluntary) context switches, control is passed to the kernel via a system call.
In both cases, kernel entry is CPU-assisted. The processor performs a permissions check, saves the instruction pointer (so that execution can be continued from the right instruction later), switches from user mode to kernel mode, activates the kernel stack (specific to the current thread) and jumps to a predefined and well-known point in the kernel code.
The first action performed by the kernel is saving the content of CPU registers, which it needs to use for its own purposes. Usually the kernel uses only general purpose CPU registers and saves them by pushing them onto the stack.
The kernel then handles a primary request if needed. It may handle an interrupt, prepare a file read request, reload a timer etc.
At some point during request handling, the kernel performs an action that affects the state of either the current thread (decided that there is currently nothing to be done in this thread as it is waiting for something) or that of another thread (or threads) (a thread became ready to run because an event it was waiting for occurred - a mutex was released, for example).
The kernel invokes the scheduler. The scheduler has to make two decisions.
- What to do with the current thread? Should it be blocked? If so, which wait queue should it be placed in? If the switch is involuntary, it is placed at the end of the ready queue. Otherwise, the thread is placed in one of the wait queues.
- Which thread should be run next?
Once both decisions have been made, the scheduler performs the context switch using the TCB of the current thread as well as that of the thread that is to be run next.
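The two scheduler decisions can be illustrated with a small queue model. This is a sketch under simplifying assumptions: threads are plain strings, there is a single FIFO ready queue, and the `Scheduler` class with its `reschedule` and `wake` methods is hypothetical.

```python
from collections import deque


class Scheduler:
    def __init__(self):
        self.ready = deque()
        self.wait_queues = {}  # hypothetical: wait reason -> deque of threads

    def reschedule(self, current, wait_reason=None):
        """Decide what happens to `current`, then return the thread to run next.

        wait_reason is None for an involuntary switch (the thread is still
        runnable); otherwise it names what the thread is blocked on.
        """
        # Decision 1: what to do with the current thread.
        if wait_reason is None:
            self.ready.append(current)  # end of the ready queue
        else:
            self.wait_queues.setdefault(wait_reason, deque()).append(current)
        # Decision 2: which thread runs next.
        return self.ready.popleft()

    def wake(self, wait_reason):
        # The awaited event occurred: move one waiter back to the ready queue.
        queue = self.wait_queues.get(wait_reason)
        if queue:
            self.ready.append(queue.popleft())


sched = Scheduler()
sched.ready.extend(["B", "C"])
running = sched.reschedule("A")               # A is preempted, B runs
running = sched.reschedule(running, "mutex")  # B blocks on a mutex, C runs
sched.wake("mutex")                           # mutex released, B is ready again
```

At the end C is running and the ready queue holds A followed by B.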
A context switch itself consists of three main steps.
- The kernel figures out what CPU registers the thread actually uses and saves their content either on the stack or in the TCB of the unscheduled thread. In the case of the IA-32 CPU platform, if the thread does not use FPU and SSE registers, their content will not be saved.
- The kernel pushes the instruction pointer onto the stack and saves the value of the stack pointer in the TCB of the unscheduled thread. It then loads the stack pointer from the TCB of the scheduled thread and pops the instruction pointer from the top of its stack.
- The kernel figures out which registers are actually used by the scheduled thread and loads them with their previously stored contents (see step 1 above).
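A rough model of these three steps, with the emphasis on saving only the register sets a thread actually uses. The `Thread` class, the `uses_fpu` flag and the dictionary standing in for the CPU are assumptions for illustration; the stack pointer and instruction pointer of the second step are simply folded into the "general" bank here.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    name: str
    uses_fpu: bool = False
    saved: dict = field(default_factory=dict)  # stands in for the TCB save area


def switch_registers(cpu, old, new):
    # Save only the register banks the outgoing thread actually uses.
    old.saved = {"general": dict(cpu["general"])}
    if old.uses_fpu:
        old.saved["fpu"] = dict(cpu["fpu"])
    # Reload only what the incoming thread had stored earlier.
    for bank, values in new.saved.items():
        cpu[bank] = dict(values)


cpu = {"general": {"sp": 0x1000, "ip": 0x40}, "fpu": {"st0": 1.5}}
worker = Thread("integer-only")
maths = Thread(
    "float-heavy",
    uses_fpu=True,
    saved={"general": {"sp": 0x2000, "ip": 0x80}, "fpu": {"st0": 2.5}},
)
switch_registers(cpu, worker, maths)
```

Because `worker` never touched the FPU, nothing is stored for it beyond the general-purpose bank, which is the saving the answer describes for IA-32.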
At this point the kernel checks if the scheduled and unscheduled threads belong to the same process. If not ("process" rather than "thread" switch), the kernel resets the current address space by pointing the MMU (Memory Management Unit) to the page table of the scheduled process. The TLB (Translation Lookaside Buffer), which is a cache containing recent virtual to physical address translations, is also flushed to prevent erroneous address translation. Note that this is the only step in the entire set of context switch actions that cares about processes!
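The process check can be modelled as follows. The `MMU` class, its dictionary-based page tables and the `(pid, tid)` pairs are hypothetical simplifications; the sketch only shows that the address space is replaced, and the TLB flushed, when the two threads belong to different processes.

```python
class MMU:
    def __init__(self):
        self.page_table = None
        self.tlb = {}      # cache of recent virtual -> physical translations
        self.flushes = 0

    def translate(self, vpage):
        if vpage not in self.tlb:  # TLB miss: consult the page table
            self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage]

    def switch_address_space(self, page_table):
        self.page_table = page_table
        self.tlb.clear()           # stale entries would map to the old process
        self.flushes += 1


def switch_thread(mmu, old, new, page_tables):
    """old and new are (pid, tid) pairs; page_tables maps pid -> page table."""
    if old[0] != new[0]:
        mmu.switch_address_space(page_tables[new[0]])


page_tables = {1: {0: 7}, 2: {0: 9}}
mmu = MMU()
mmu.switch_address_space(page_tables[1])
first = mmu.translate(0)                            # process 1: page 0 -> frame 7
switch_thread(mmu, (1, 10), (1, 11), page_tables)   # same process: nothing to do
same_process_flushes = mmu.flushes
switch_thread(mmu, (1, 11), (2, 20), page_tables)   # different process: flush
second = mmu.translate(0)                           # process 2: page 0 -> frame 9
```

Without the flush, the second lookup of page 0 would have returned frame 7 from the cache, which is the erroneous translation the answer warns about.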
The kernel prepares Thread Local Storage (TLS) for the scheduled thread. For example, it maps the respective memory pages to the specified addresses. As another example, on the IA-32 platform a common approach is to load a new segment which points to the TLS data of the incoming thread.
The kernel loads the current thread's kernel stack address into the CPU. After this, every kernel invocation will use this kernel stack instead of the kernel stack of the unscheduled thread.
Another step which may be performed by the kernel is reprogramming the system timer. When the timer fires, control is returned to the kernel. The time period between the context switch and the timer firing is called a time quantum and indicates how much execution time the current thread is given at that time. This is known as pre-emptive scheduling.
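The effect of a time quantum can be seen in a tiny round-robin simulation. It assumes a single ready queue, equal priorities and time measured in abstract ticks; `run_round_robin` is a made-up name for this sketch.

```python
from collections import deque


def run_round_robin(work, quantum):
    """work maps thread name -> ticks of CPU time it still needs.

    Returns the order in which threads were given the CPU.
    """
    ready = deque(work)
    remaining = dict(work)
    timeline = []
    while ready:
        thread = ready.popleft()
        timeline.append(thread)
        # The timer is programmed to fire after `quantum` ticks at the latest.
        ran = min(quantum, remaining[thread])
        remaining[thread] -= ran
        if remaining[thread] > 0:
            ready.append(thread)  # quantum expired: preempted, back of the queue
    return timeline


timeline = run_round_robin({"A": 5, "B": 2, "C": 3}, quantum=2)
print(timeline)  # ['A', 'B', 'C', 'A', 'C', 'A']
```

Thread A needs five ticks but never holds the CPU for more than two at a time, so B and C make progress in between.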
Kernels usually collect statistics during context switches to improve scheduling as well as to show system administrators and users what is going on in the system. These statistics may include such information as how much CPU time the thread has consumed, how many times it has been scheduled, how many times its time quantum has expired, how frequently context switches are occurring in the system etc.
The context switch can be considered complete at this point, and the kernel continues the previously interrupted system actions. For example, if the thread had tried to acquire a mutex during a system call, and the mutex is now free, the kernel may finish the interrupted operation.
At some point the thread finishes its system activities and wants to return to user mode to execute non-system code. The kernel pops the general-purpose register contents that were saved on the kernel stack at kernel entry and makes the CPU execute a special instruction to return to user mode.
The CPU retrieves the values of the instruction pointer and stack pointer, which were saved when kernel mode was entered, and restores them. At this point the thread's user-mode stack is also activated and kernel mode is exited (this prohibits the use of special system instructions).
Finally, the CPU continues execution from the point where the thread was when it was unscheduled. If that happened during a system call, the thread proceeds from the point where the system call was invoked, capturing and handling its result. In the case of pre-emption by an interrupt, the thread continues its execution as if nothing had happened.
Some summary notes:
The kernel only schedules and executes threads, not processes - context switches take place between threads.
The procedure for switching to the context of a thread from another process is essentially the same as a context switch between threads belonging to the same process. Only one additional step is required: changing page tables (and flushing the TLB).
Thread context is stored either on the kernel stack or in the TCB (not the PCB!).
Context switching is an expensive operation - it has a significant direct cost in performance, and the indirect cost caused by cache pollution (and TLB flush if the switch occurred between processes) is even greater.
Answered by Abhijeet Ashok Muneshwar
- In a switch, the state of the currently executing process must be saved somehow, so that when it is rescheduled, this state can be restored.
- The process state includes all the registers that the process may be using, especially the program counter, plus any other operating-system-specific data that may be necessary. This is usually stored in a data structure called a process control block (PCB) or switchframe.
- The PCB might be stored on a per-process stack in kernel memory (as opposed to the user-mode call stack), or there may be some specific operating system defined data structure for this information. A handle to the PCB is added to a queue of processes that are ready to run, often called the ready queue.
- Since the operating system has effectively suspended the execution of one process, it can then switch context by choosing a process from the ready queue and restoring its PCB. In doing so, the program counter from the PCB is loaded, and thus execution can continue in the chosen process. Process and thread priority can influence which process is chosen from the ready queue (i.e., it may be a priority queue).
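The points above can be sketched with a priority-ordered ready queue of PCBs. Everything here is illustrative: the `PCB` fields, the `ReadyQueue` class and the convention that a lower priority number is more urgent are assumptions, not something the answer specifies.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass
class PCB:
    pid: int
    priority: int  # assumption: a lower number means more urgent
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


class ReadyQueue:
    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: FIFO within a priority

    def add(self, pcb):
        heapq.heappush(self._heap, (pcb.priority, next(self._order), pcb))

    def choose(self):
        return heapq.heappop(self._heap)[2]


def switch(cpu, running, ready):
    # Save the state of the running process in its PCB and queue a handle to it.
    running.program_counter = cpu["pc"]
    running.registers = dict(cpu["regs"])
    ready.add(running)
    # Choose a process from the ready queue and restore its PCB.
    chosen = ready.choose()
    cpu["pc"] = chosen.program_counter
    cpu["regs"] = dict(chosen.registers)
    return chosen


ready = ReadyQueue()
ready.add(PCB(pid=2, priority=1, program_counter=500))
ready.add(PCB(pid=3, priority=0, program_counter=900))
cpu = {"pc": 120, "regs": {"r0": 7}}
editor = PCB(pid=1, priority=1)
now_running = switch(cpu, editor, ready)
```

Process 3 is chosen because it has the most urgent priority, and execution continues from the program counter stored in its PCB.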
(Source: Context switch)
Answered by red_herring
1. Save the context of the process that is currently running on the CPU. Update the process control block and other important fields.
2. Move the process control block of the above process into the relevant queue, such as the ready queue, I/O queue etc.
3. Select a new process for execution.
4. Update the process control block of the selected process. This includes updating the process state to running.
5. Update the memory management data structures as required.
6. Restore the context of the previously running process when it is next loaded onto the processor. This is done by reloading the saved values from its process control block into the registers.