Note: this page reproduces a popular StackOverflow question and its answers (originally presented as a Chinese-English side-by-side translation) under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2708033/

Time: 2020-09-10 01:08:27  Source: igfitidea

Technically, why are processes in Erlang more efficient than OS threads?

Tags: multithreading, erlang, green-threads, lightweight-processes

Asked by Jonas

Erlang's Characteristics

From Erlang Programming (2009):

Erlang concurrency is fast and scalable. Its processes are lightweight in that the Erlang virtual machine does not create an OS thread for every created process. They are created, scheduled, and handled in the VM, independent of underlying operating system. As a result, process creation time is of the order of microseconds and independent of the number of concurrently existing processes. Compare this with Java and C#, where for every process an underlying OS thread is created: you will get some very competitive comparisons, with Erlang greatly outperforming both languages.

From Concurrency oriented programming in Erlang (pdf) (slides) (2003):

We observe that the time taken to create an Erlang process is constant 1μs up to 2,500 processes; thereafter it increases to about 3μs for up to 30,000 processes. The performance of Java and C# is shown at the top of the figure. For a small number of processes it takes about 300μs to create a process. Creating more than two thousand processes is impossible.

We see that for up to 30,000 processes the time to send a message between two Erlang processes is about 0.8μs. For C# it takes about 50μs per message, up to the maximum number of processes (which was about 1800 processes). Java was even worse, for up to 100 process it took about 50μs per message thereafter it increased rapidly to 10ms per message when there were about 1000 Java processes.

My thoughts

I don't fully understand technically why Erlang processes are so much more efficient in spawning new processes and have much smaller memory footprints per process. Both the OS and Erlang VM have to do scheduling, context switches, and keep track of the values in the registers and so on...

Simply put, why aren't OS threads implemented in the same way as processes in Erlang? Do they have to support something more? And why do they need a bigger memory footprint? And why do they have slower spawning and communication?

Technically, why are processes in Erlang more efficient than OS threads when it comes to spawning and communication? And why can't threads in the OS be implemented and managed in the same efficient way? And why do OS threads have a bigger memory footprint, plus slower spawning and communication?

More reading

Accepted answer by Marcelo Cantos

There are several contributing factors:

  1. Erlang processes are not OS processes. They are implemented by the Erlang VM using a lightweight cooperative threading model (preemptive at the Erlang level, but under the control of a cooperatively scheduled runtime). This means that it is much cheaper to switch context, because they only switch at known, controlled points and therefore don't have to save the entire CPU state (normal, SSE and FPU registers, address space mapping, etc.).
  2. Erlang processes use dynamically allocated stacks, which start very small and grow as necessary. This permits the spawning of many thousands — even millions — of Erlang processes without sucking up all available RAM.
  3. Erlang used to be single-threaded, meaning that there was no requirement to ensure thread-safety between processes. It now supports SMP, but the interaction between Erlang processes on the same scheduler/core is still very lightweight (there are separate run queues per core).
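The first point (switching only at known, controlled points) can be sketched with Python generators. This is a toy round-robin scheduler, not how the Erlang VM is implemented; the names `worker` and `run_round_robin` are made up for illustration, and each task gives up control explicitly with `yield` rather than being preempted by the runtime.

```python
from collections import deque


def worker(name, steps, log):
    """A lightweight task: do one unit of work, then give up control."""
    for i in range(steps):
        log.append((name, i))
        yield  # the only point at which this task can be switched out


def run_round_robin(tasks):
    """Resume each task in turn until every task has finished."""
    run_queue = deque(tasks)
    while run_queue:
        task = run_queue.popleft()
        try:
            next(task)              # the "context switch": resume the generator
        except StopIteration:
            continue                # task finished; do not requeue it
        run_queue.append(task)      # still alive; goes to the back of the queue


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Because a task can only be suspended at its `yield`, the scheduler never has to save anything beyond what the generator object already holds, which is the intuition behind the cheaper context switch described above.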

Answer by Jonas

After some more research I found a presentation by Joe Armstrong.

From Erlang - software for a concurrent world (presentation) (at 13 min):

[Erlang] is a concurrent language – by that I mean that threads are part of the programming language, they do not belong to the operating system. That's really what's wrong with programming languages like Java and C++. Its threads aren't in the programming language, threads are something in the operating system – and they inherit all the problems that they have in the operating system. One of the problems is granularity of the memory management system. The memory management in the operating system protects whole pages of memory, so the smallest size that a thread can be is the smallest size of a page. That's actually too big.

If you add more memory to your machine – you have the same number of bits that protects the memory so the granularity of the page tables goes up – you end up using say 64kB for a process you know running in a few hundred bytes.
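The granularity argument in the quote comes down to rounding up to whole pages. Here is a rough back-of-the-envelope sketch; the 4 KiB page size, the 300-byte state size, and the count of one million are all assumptions chosen for illustration, not figures from the talk (only the 64 kB figure is).

```python
def pages_needed(nbytes, page_size):
    """Whole pages required to hold nbytes (ceiling division)."""
    return -(-nbytes // page_size)


def footprint(nbytes, page_size):
    """Bytes actually reserved when memory is handed out in whole pages."""
    return pages_needed(nbytes, page_size) * page_size


PAGE_SIZE = 4096      # assumed page size
STATE_BYTES = 300     # assumed: "a few hundred bytes" of real state
N = 1_000_000         # assumed number of concurrent activities

page_backed = N * footprint(STATE_BYTES, PAGE_SIZE)
packed = N * STATE_BYTES

print(page_backed // 2**20, "MiB when each one occupies at least a page")
print(packed // 2**20, "MiB when state is packed by the language runtime")
print(footprint(STATE_BYTES, 64 * 1024), "bytes per activity at 64 kB granularity")
```

Under these assumptions the page-granular layout costs more than ten times the packed one, and the gap widens as the granularity grows.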

I think it answers, if not all, at least a few of my questions.

Answer by Surfer Jeff

I've implemented coroutines in assembler, and measured performance.

Switching between coroutines, a.k.a. Erlang processes, takes about 16 instructions and 20 nanoseconds on a modern processor. Also, you often know the process you are switching to (example: a process receiving a message in its queue can be implemented as straight hand-off from the calling process to the receiving process) so the scheduler doesn't come into play, making it an O(1) operation.
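The "straight hand-off" idea can be illustrated with a Python generator used as a coroutine. This is only an analogy, not the assembler implementation measured above: `receiver` and `sender` are hypothetical names, and `target.send(msg)` resumes the known receiver directly instead of asking a scheduler which task to run next.

```python
def receiver(inbox_log):
    """A coroutine that stays suspended at `yield` until a message arrives."""
    while True:
        msg = yield              # suspended here until someone sends
        inbox_log.append(msg)


def sender(target, messages):
    """Hand each message straight to the known target; no run queue involved."""
    for msg in messages:
        target.send(msg)         # control transfers directly to the receiver


received = []
rx = receiver(received)
next(rx)                         # prime the coroutine: run it to the first yield
sender(rx, ["ping", "pong"])
print(received)
```

Since the sender already holds a reference to the receiver, the cost of the switch does not depend on how many other coroutines exist.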

To switch OS threads, it takes about 500-1000 nanoseconds, because you're calling down to the kernel. The OS thread scheduler might run in O(log(n)) or O(log(log(n))) time, which will start to be noticeable if you have tens of thousands, or even millions of threads.
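To put the figures quoted in this answer side by side, here is a small calculation of the upper bound on switches per second for a single core that does nothing but switch. The nanosecond values are the ones given above; everything else is plain arithmetic.

```python
NS_PER_SECOND = 1_000_000_000


def switches_per_second(ns_per_switch):
    """Upper bound on switches per second if a core did nothing but switch."""
    return NS_PER_SECOND // ns_per_switch


coroutine_rate = switches_per_second(20)      # coroutine switch, ~20 ns
os_rate_fast = switches_per_second(500)       # OS thread switch, best case
os_rate_slow = switches_per_second(1000)      # OS thread switch, worst case

print(coroutine_rate, os_rate_fast, os_rate_slow)
print("speed-up between", 500 // 20, "and", 1000 // 20, "times")
```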

Therefore, Erlang processes are faster and scale better, both because the fundamental operation of switching is faster and because the scheduler runs less often.

Answer by Donal Fellows

Erlang processes correspond (approximately) to green threads in other languages; there's no OS-enforced separation between the processes. (There may well be language-enforced separation, but that's a lesser protection despite Erlang doing a better job than most.) Because they're so much lighter-weight, they can be used far more extensively.

OS threads on the other hand are able to be simply scheduled on different CPU cores, and are (mostly) able to support independent CPU-bound processing. OS processes are like OS threads, but with much stronger OS-enforced separation. The price of these capabilities is that OS threads and (even more so) processes are more expensive.

Another way to understand the difference is this. Supposing you were going to write an implementation of Erlang on top of the JVM (not a particularly crazy suggestion) then you'd make each Erlang process be an object with some state. You'd then have a pool of Thread instances (typically sized according to the number of cores in your host system; that's a tunable parameter in real Erlang runtimes BTW) which run the Erlang processes. In turn, that will distribute the work that is to be done across the real system resources available. It's a pretty neat way of doing things, but relies utterly on the fact that each individual Erlang process doesn't do very much. That's OK of course; Erlang is structured to not require those individual processes to be heavyweight since it is the overall ensemble of them which execute the program.
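The design described here (many small process objects run by a fixed pool of threads) can be sketched as follows. The sketch uses Python instead of the JVM, and `LightProcess`, `scheduler_loop`, and `run_all` are hypothetical names invented for illustration. In standard CPython the global interpreter lock keeps the pool threads from running bytecode in parallel, so this only shows the structure, not the speed-up.

```python
import os
import queue
import threading


class LightProcess:
    """Hypothetical stand-in for an Erlang process: an object with some state."""

    def __init__(self, pid, steps):
        self.pid = pid
        self.remaining = steps
        self.done_steps = 0

    def step(self):
        """Do one small unit of work; return True if more work remains."""
        self.done_steps += 1
        self.remaining -= 1
        return self.remaining > 0


def scheduler_loop(run_queue):
    """One pool thread: take a process, run a single step, requeue if needed."""
    while True:
        proc = run_queue.get()
        if proc is None:             # sentinel: shut this pool thread down
            run_queue.task_done()
            return
        if proc.step():
            run_queue.put(proc)      # not finished; it gets another turn later
        run_queue.task_done()


def run_all(processes, n_threads=None):
    """Run every process to completion on a pool sized to the core count."""
    n_threads = n_threads or os.cpu_count() or 1
    run_queue = queue.Queue()
    for proc in processes:
        run_queue.put(proc)
    threads = [threading.Thread(target=scheduler_loop, args=(run_queue,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    run_queue.join()                 # returns once no queued work is left
    for _ in threads:
        run_queue.put(None)          # one sentinel per pool thread
    for t in threads:
        t.join()


procs = [LightProcess(pid, steps=3) for pid in range(1000)]
run_all(procs)
print(sum(p.done_steps for p in procs), "steps executed")
```

Each process is off the queue while a thread runs its step, so no two threads ever touch the same process at once, and requeueing after every step is what keeps any single process from monopolising a thread.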

In many ways, the real problem is one of terminology. The things that Erlang calls processes (and which correspond strongly to the same concept in CSP, CCS, and particularly the π-calculus) are simply not the same as the things that languages with a C heritage (including C++, Java, C#, and many others) call a process or a thread. There are some similarities (all involve some notion of concurrent execution) but there's definitely no equivalence. So be careful when someone says “process” to you; they might understand it to mean something utterly different…

Answer by Jurnell

I think Jonas wanted some numbers comparing OS threads to Erlang processes. The author of Programming Erlang, Joe Armstrong, a while back tested how well spawning Erlang processes scales compared with spawning OS threads. He wrote a simple web server in Erlang and tested it against multi-threaded Apache (since Apache uses OS threads). There's an old website with the data dating back to 1998. I've managed to find that site exactly once, so I can't supply a link. But the information is out there. The main finding of the study was that Apache maxed out at just under 8K processes, while his hand-written Erlang server handled 10K+ processes.

Answer by Francisco Soto

Because the Erlang interpreter only has to worry about itself, whereas the OS has many other things to worry about.

Answer by ratzily

One of the reasons is that an Erlang process is created not in the OS but in the EVM (Erlang virtual machine), so the cost is smaller.
