multithreading: What is the difference between a thread and a fiber?
Original URL: http://stackoverflow.com/questions/796217/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
What is the difference between a thread and a fiber?
Asked by tatsuhirosatou
What is the difference between a thread and a fiber? I've heard of fibers from Ruby, and I've read that they're available in other languages. Could somebody explain to me in simple terms what the difference is between a thread and a fiber?
Accepted answer by Jason Coco
In the most simple terms, threads are generally considered to be preemptive (although this may not always be true, depending on the operating system) while fibers are considered to be light-weight, cooperative threads. Both are separate execution paths for your application.
With threads: the current execution path may be interrupted or preempted at any time (note: this statement is a generalization and may not always hold true depending on OS/threading package/etc.). This means that for threads, data integrity is a big issue because one thread may be stopped in the middle of updating a chunk of data, leaving the integrity of the data in a bad or incomplete state. This also means that the operating system can take advantage of multiple CPUs and CPU cores by running more than one thread at the same time and leaving it up to the developer to guard data access.
With fibers: the current execution path is only interrupted when the fiber yields execution (same note as above). This means that fibers always start and stop in well-defined places, so data integrity is much less of an issue. Also, because fibers are often managed in the user space, expensive context switches and CPU state changes need not be made, making changing from one fiber to the next extremely efficient. On the other hand, since no two fibers can run at exactly the same time, just using fibers alone will not take advantage of multiple CPUs or multiple CPU cores.
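To make the "only interrupted when the fiber yields" point concrete, here is a minimal sketch that uses Python generators as stand-ins for fibers. The names fiber and run_round_robin are made up for this illustration and are not part of any fiber library; real fiber implementations (Ruby's Fiber, Win32 fibers) have their own APIs.

```python
from collections import deque


def fiber(name, steps, log):
    """A generator standing in for a fiber: it runs until it yields."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # the only point where another fiber may run


def run_round_robin(fibers):
    """Resume each fiber in turn until all of them have finished."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the fiber up to its next yield
        except StopIteration:
            continue               # fiber finished; do not requeue it
        ready.append(current)      # still alive; put it at the back


log = []
run_round_robin([fiber("A", 2, log), fiber("B", 3, log)])
print(log)
```

The interleaving is fully determined by where the yields are, which is why the order of entries in log is the same on every run. Everything here executes on a single OS thread, so it does not use more than one core.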
Answer by Adam Rosenfield
Threads use pre-emptive scheduling, whereas fibers use cooperative scheduling.
With a thread, the control flow could get interrupted at any time, and another thread can take over. With multiple processors, you can have multiple threads all running at the same time (simultaneous multithreading, or SMT). As a result, you have to be very careful about concurrent data access, and protect your data with mutexes, semaphores, condition variables, and so on. It is often very tricky to get right.
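As a small illustration of guarding shared data with a mutex, here is a sketch using Python's threading module. The function name add_many and the shared counter dictionary are invented for this example. In CPython the interpreter lock limits true parallelism, but a thread can still be switched out between the read and the write of the counter, so the update is guarded anyway.

```python
import threading


def add_many(counter, lock, n):
    """Increment a shared counter n times, holding the lock for each update."""
    for _ in range(n):
        with lock:                  # without this, increments could be lost
            counter["value"] += 1   # read-modify-write on shared state


counter = {"value": 0}
lock = threading.Lock()
threads = [
    threading.Thread(target=add_many, args=(counter, lock, 10_000))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["value"])
```

With the lock in place the final value is always 4 * 10_000; the point of the sketch is that the programmer, not the scheduler, is responsible for that guarantee.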
With a fiber, control only switches when you tell it to, typically with a function call named something like yield(). This makes concurrent data access easier, since you don't have to worry about atomicity of data structures or mutexes. As long as you don't yield, there's no danger of being preempted and having another fiber trying to read or modify the data you're working with. As a result, though, if your fiber gets into an infinite loop, no other fiber can run, since you're not yielding.
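A rough sketch of why no mutex is needed between yields, again with Python generators standing in for fibers. The transfer and auditor functions and the accounts dictionary are hypothetical names for this example. The transfer briefly breaks the "total is constant" invariant, but it restores it before yielding, so the auditor can never observe the broken state.

```python
def transfer(accounts, src, dst, amount, times):
    """Move money in two steps; no other fiber can run between them."""
    for _ in range(times):
        accounts[src] -= amount   # invariant temporarily broken here
        accounts[dst] += amount   # invariant restored before we yield
        yield


def auditor(accounts, expected_total, checks, observed):
    """Record whether the total looks consistent each time we get to run."""
    for _ in range(checks):
        observed.append(sum(accounts.values()) == expected_total)
        yield


accounts = {"a": 100, "b": 100}
observed = []
fibers = [transfer(accounts, "a", "b", 10, 5),
          auditor(accounts, 200, 5, observed)]
while fibers:
    for f in list(fibers):        # iterate over a copy so removal is safe
        try:
            next(f)
        except StopIteration:
            fibers.remove(f)
print(observed, accounts)
```

The flip side mentioned above also shows up here: if transfer looped forever without reaching its yield, the auditor would never run again.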
You can also mix threads and fibers, which gives rise to the problems faced by both. Not recommended, but it can sometimes be the right thing to do if done carefully.
Answer by itowlson
In Win32, a fiber is a sort of user-managed thread. A fiber has its own stack and its own instruction pointer etc., but fibers are not scheduled by the OS: you have to call SwitchToFiber explicitly. Threads, by contrast, are pre-emptively scheduled by the operating system. So roughly speaking a fiber is a thread that is managed at the application/runtime level rather than being a true OS thread.
The consequences are that fibers are cheaper and that the application has more control over scheduling. This can be important if the app creates a lot of concurrent tasks, and/or wants to closely optimise when they run. For example, a database server might choose to use fibers rather than threads.
(There may be other uses of the same term; as noted, this is the Win32 definition.)
Answer by Robert S. Barnes
First I would recommend reading this explanation of the difference between processes and threads as background material.
Once you've read that, it's pretty straightforward. Threads can be implemented in the kernel, in user space, or as a mix of the two. Fibers are basically threads implemented in user space.
- What is typically called a thread is a thread of execution implemented in the kernel: what's known as a kernel thread. The scheduling of a kernel thread is handled exclusively by the kernel, although a kernel thread can voluntarily release the CPU by sleeping if it wants. A kernel thread has the advantage that it can use blocking I/O and let the kernel worry about scheduling. Its main disadvantage is that thread switching is relatively slow since it requires trapping into the kernel.
- Fibers are user space threads whose scheduling is handled in user space by one or more kernel threads under a single process. This makes fiber switching very fast. If you group all the fibers accessing a particular set of shared data under the context of a single kernel thread and have their scheduling handled by that single kernel thread, then you can eliminate synchronization issues, since the fibers will effectively run in serial and you have complete control over their scheduling. Grouping related fibers under a single kernel thread is important, since the kernel thread they are running in can be pre-empted by the kernel. This point is not made clear in many of the other answers. Also, if you use blocking I/O in a fiber, the entire kernel thread it is a part of blocks, including all the fibers that are part of that kernel thread.
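The blocking I/O point in the second bullet can be sketched with Python generators standing in for fibers. Here time.sleep is only a stand-in for a blocking system call, and blocking_fiber and worker_fiber are names invented for this example. While the first fiber is blocked, the single kernel thread running the loop is blocked too, so the second fiber cannot make progress.

```python
import time


def blocking_fiber(log):
    """A fiber that makes a blocking call without yielding first."""
    log.append("io: start blocking call")
    time.sleep(0.05)   # stand-in for blocking I/O; the whole thread waits
    log.append("io: blocking call returned")
    yield


def worker_fiber(log):
    """A fiber with ordinary work that would like to run in the meantime."""
    for i in range(2):
        log.append(f"worker: step {i}")
        yield


log = []
fibers = [blocking_fiber(log), worker_fiber(log)]
while fibers:
    for f in list(fibers):        # iterate over a copy so removal is safe
        try:
            next(f)
        except StopIteration:
            fibers.remove(f)
print(log)
```

Nothing from the worker appears between the two "io" entries. User-space schedulers usually work around this with non-blocking I/O, so that a fiber yields instead of blocking the kernel thread under it.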
In section 11.4 "Processes and Threads in Windows Vista" in Modern Operating Systems, Tanenbaum comments:
Although fibers are cooperatively scheduled, if there are multiple threads scheduling the fibers, a lot of careful synchronization is required to make sure fibers do not interfere with each other. To simplify the interaction between threads and fibers, it is often useful to create only as many threads as there are processors to run them, and affinitize the threads to each run only on a distinct set of available processors, or even just one processor. Each thread can then run a particular subset of the fibers, establishing a one-to-many relationship between threads and fibers which simplifies synchronization. Even so there are still many difficulties with fibers. Most Win32 libraries are completely unaware of fibers, and applications that attempt to use fibers as if they were threads will encounter various failures. The kernel has no knowledge of fibers, and when a fiber enters the kernel, the thread it is executing on may block and the kernel will schedule an arbitrary thread on the processor, making it unavailable to run other fibers. For these reasons fibers are rarely used except when porting code from other systems that explicitly need the functionality provided by fibers.
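The one-to-many arrangement described in the quote can be sketched roughly as follows, with Python threads as the schedulers and generators as the fibers. The names counter_fiber, run_fibers and scheduler_thread are assumptions made for this illustration, and pinning threads to processors is left out because it is platform-specific. Each thread owns its own fibers and its own data, so no locks are needed.

```python
import threading


def counter_fiber(state, key, steps):
    """A fiber that updates state owned by its scheduling thread."""
    for _ in range(steps):
        state[key] = state.get(key, 0) + 1
        yield


def run_fibers(fibers):
    """Round-robin over a private set of fibers until all are done."""
    fibers = list(fibers)
    while fibers:
        for f in list(fibers):    # iterate over a copy so removal is safe
            try:
                next(f)
            except StopIteration:
                fibers.remove(f)


def scheduler_thread(state):
    # These fibers are created and resumed only by this thread,
    # so they run serially with respect to each other.
    run_fibers([counter_fiber(state, "x", 100),
                counter_fiber(state, "x", 100)])


states = [{} for _ in range(2)]   # one private state per thread
threads = [threading.Thread(target=scheduler_thread, args=(s,)) for s in states]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(states)
```

If two of these threads shared one state dictionary, the careful synchronization the quote warns about would be needed again.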
Answer by Grant Wagner
Note that in addition to Threads and Fibers, Windows 7 introduces User-Mode Scheduling:
User-mode scheduling (UMS) is a light-weight mechanism that applications can use to schedule their own threads. An application can switch between UMS threads in user mode without involving the system scheduler and regain control of the processor if a UMS thread blocks in the kernel. UMS threads differ from fibers in that each UMS thread has its own thread context instead of sharing the thread context of a single thread. The ability to switch between threads in user mode makes UMS more efficient than thread pools for managing large numbers of short-duration work items that require few system calls.
More information about threads, fibers and UMS is available by watching Dave Probert: Inside Windows 7 - User Mode Scheduler (UMS).
Answer by paxdiablo
Threads were originally created as lightweight processes. In a similar fashion, fibers are lightweight threads, relying (simplistically) on the fibers themselves to schedule each other by yielding control.
I guess the next step will be strands where you have to send them a signal every time you want them to execute an instruction (not unlike my 5yo son :-). In the old days (and even now on some embedded platforms), all threads were fibers, there was no pre-emption and you had to write your threads to behave nicely.
Answer by Arnold Spence
Threads are scheduled by the OS (pre-emptive). A thread may be stopped or resumed at any time by the OS, but fibers more or less manage themselves (co-operative) and yield to each other. That is, the programmer controls when fibers do their processing and when that processing switches to another fiber.
Answer by Mike Lowen
Threads generally rely on the kernel to interrupt the thread so that it or another thread can run (better known as pre-emptive multitasking), whereas fibers use co-operative multitasking, where it is the fiber itself that gives up its running time so that other fibers can run.
Some useful links explaining it better than I probably did are:
Answer by billmic
The Win32 definition of a fiber is in fact the "green thread" definition established at Sun Microsystems. There is no need to waste the term "fiber" on what is just another kind of thread, i.e., a thread executing in user space under the control of user code or a thread library.
To clarify the argument look at the following comments:
- With hyper-threading, a multi-core CPU can accept multiple threads and distribute them, one to each core.
- A superscalar pipelined CPU accepts one thread for execution and uses instruction-level parallelism (ILP) to run the thread faster. We may assume that one thread is broken into parallel fibers running in parallel pipelines.
- An SMT CPU can accept multiple threads and break them into instruction fibers for parallel execution on multiple pipelines, using the pipelines more efficiently.
We should assume that processes are made of threads and that threads should be made of fibers. With that logic in mind, using fibers for other sorts of threads is wrong.