当 Node.js 在内部仍然依赖线程时，它如何本质上更快？

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/3629784/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

时间:2020-08-23 05:29:31  来源:igfitidea

How is Node.js inherently faster when it still relies on Threads internally?

javascript · architecture · concurrency · node.js

提问 by Ralph Caraveo

I just watched the following video: Introduction to Node.js, and still don't understand how you get the speed benefits.

我刚刚看了下面的视频:Node.js 简介,但仍然不明白你是如何获得速度优势的。

Mainly, at one point Ryan Dahl (Node.js' creator) says that Node.js is event-loop based instead of thread-based. Threads are expensive and should only be left to the experts of concurrent programming to be utilized.

主要是,Ryan Dahl(Node.js 的创建者)曾说 Node.js 是基于事件循环而不是基于线程的。线程很昂贵,只能留给并发编程专家使用。

Later, he then shows the architecture stack of Node.js which has an underlying C implementation which has its own Thread pool internally. So obviously Node.js developers would never kick off their own threads or use the thread pool directly...they use async call-backs. That much I understand.

随后,他展示了 Node.js 的架构栈,它有一个底层的 C 实现,内部有自己的线程池。所以很明显,Node.js 开发人员永远不会启动他们自己的线程或直接使用线程池……他们使用异步回调。我明白的就这么多。

What I don't understand is the point that Node.js is still using threads...it's just hiding the implementation. So how is this faster? If 50 people request 50 files (not currently in memory), aren't 50 threads required?

我不明白的是，Node.js 仍然在使用线程……它只是隐藏了实现。那么这怎么会更快呢？如果 50 个人请求 50 个(当前不在内存中的)文件，难道不需要 50 个线程吗？

The only difference being that since it's managed internally the Node.js developer doesn't have to code the threaded details but underneath it's still using the threads to process the IO (blocking) file requests.

唯一的区别在于,由于它是在内部管理的,Node.js 开发人员不必编写线程细节,但在其底层仍然使用线程来处理 IO(阻塞)文件请求。

So aren't you really just taking one problem (threading) and hiding it while that problem still exists: mainly multiple threads, context switching, dead-locks...etc?

所以，你们难道不就是把一个问题(线程)藏了起来，而这个问题依然存在：主要是多线程、上下文切换、死锁……等等？

There must be some detail I still do not understand here.

这里一定有一些细节我仍然不明白。

采纳答案 by jrtipton

There are actually a few different things being conflated here. But it starts with the meme that threads are just really hard. So if they're hard, you are more likely, when using threads, to 1) break due to bugs and 2) not use them as efficiently as possible. (2) is the one you're asking about.

实际上有一些不同的东西在这里被混为一谈。但它始于线程真的很难的模因。因此,如果它们很难,您更有可能在使用线程时 1) 由于错误而中断和 2) 没有尽可能有效地使用它们。(2) 是你要问的那个。

Think about one of the examples he gives, where a request comes in and you run some query, and then do something with the results of that. If you write it in a standard procedural way, the code might look like this:

想想他给出的一个例子,一个请求进来,你运行一些查询,然后对结果做一些事情。如果您以标准程序方式编写它,代码可能如下所示:

result = query( "select smurfs from some_mushroom" );
// twiddle fingers
go_do_something_with_result( result );

If the request coming in caused you to create a new thread that ran the above code, you'll have a thread sitting there, doing nothing at all while query() is running. (Apache, according to Ryan, is using a single thread to satisfy the original request whereas nginx is outperforming it in the cases he's talking about because it's not.)

如果传入的请求导致您创建一个运行上述代码的新线程，那么在 query() 运行期间，这个线程就会干坐在那里什么也不做。(根据 Ryan 的说法，Apache 使用单个线程来满足原始请求，而 nginx 在他谈到的场景中胜过它，正是因为它没有这样做。)

Now, if you were really clever, you would express the code above in a way where the environment could go off and do something else while you're running the query:

现在,如果你真的很聪明,你会用一种方式来表达上面的代码,当你运行查询时,环境可以关闭并做其他事情:

query( statement: "select smurfs from some_mushroom", callback: go_do_something_with_result() );

This is basically what node.js is doing. You're basically decorating -- in a way that is convenient because of the language and environment, hence the points about closures -- your code in such a way that the environment can be clever about what runs, and when. In that way, node.js isn't new in the sense that it invented asynchronous I/O (not that anyone claimed anything like this), but it's new in that the way it's expressed is a little different.

这基本上就是 node.js 在做的事情。你基本上是在“装饰”你的代码(由于语言和环境的缘故，这种方式很方便，这也是闭包那几点的由来)，使环境可以聪明地决定什么时候运行什么。这样看来，node.js 并不是在“发明了异步 I/O”这个意义上的新事物(也没有人这么声称)，它的新颖之处在于表达方式略有不同。

Note: when I say that the environment can be clever about what runs and when, specifically what I mean is that the thread it used to start some I/O can now be used to handle some other request, or some computation that can be done in parallel, or start some other parallel I/O. (I'm not certain node is sophisticated enough to start more work for the same request, but you get the idea.)

注意:当我说环境可以很聪明地决定运行什么以及何时运行时,特别是我的意思是它用于启动某些 I/O 的线程现在可以用于处理某些其他请求,或者某些可以完成的计算并行,或启动其他一些并行 I/O。(我不确定 node 是否足够复杂,可以为相同的请求开始更多的工作,但你明白了。)

回答 by nalply

Note! This is an old answer. While it's still true in the rough outline, some details might have changed because of Node's rapid development in the last few years.

注意！这是一个旧答案。虽然大体轮廓上仍然成立，但由于 Node 在过去几年的快速发展，一些细节可能已经发生了变化。

It is using threads because:

它使用线程是因为:

  1. The O_NONBLOCK option of open() does not work on files.
  2. There are third-party libraries which don't offer non-blocking IO.
  1. open() 的 O_NONBLOCK 选项对文件不起作用。
  2. 有些第三方库不提供非阻塞 IO。

To fake non-blocking IO, threads are necessary: do blocking IO in a separate thread. It is an ugly solution and causes much overhead.

要伪造非阻塞 IO,线程是必要的:在单独的线程中进行阻塞 IO。这是一个丑陋的解决方案,会导致很多开销。

It's even worse on the hardware level:

在硬件层面上更糟:

  • With DMAthe CPU asynchronously offloads IO.
  • Data is transferred directly between the IO device and the memory.
  • The kernel wraps this in a synchronous, blocking system call.
  • Node.js wraps the blocking system call in a thread.
  • 使用DMA,CPU 异步卸载 IO。
  • 数据直接在 IO 设备和内存之间传输。
  • 内核将其封装在一个同步的、阻塞的系统调用中。
  • Node.js 将阻塞系统调用包装在一个线程中。

This is just plain stupid and inefficient. But it works at least! We can enjoy Node.js because it hides the ugly and cumbersome details behind an event-driven asynchronous architecture.

这简直是愚蠢且低效的。但它至少能用！我们可以安心享受 Node.js，因为它把这些丑陋繁琐的细节隐藏在了事件驱动的异步架构背后。

Maybe someone will implement O_NONBLOCK for files in the future?...

也许将来有人会为文件实现 O_NONBLOCK ?...

Edit:I discussed this with a friend and he told me that an alternative to threads is polling with select: specify a timeout of 0 and do IO on the returned file descriptors (now that they are guaranteed not to block).

编辑:我和一个朋友讨论过这个问题,他告诉我线程的替代方法是使用select轮询:指定超时 0 并对返回的文件描述符执行 IO(现在保证它们不会阻塞)。

回答 by Toby Eggitt

I fear I'm "doing the wrong thing" here, if so delete me and I apologize. In particular, I fail to see how I create the neat little annotations that some folks have created. However, I have many concerns/observations to make on this thread.

我担心我在这里“做错了事”，如果是这样，请删掉我的回答，我先道歉。尤其是，我没弄明白怎么创建一些人做出的那种整洁的小注释。不过，我对这个讨论帖有不少意见/观察要说。

1) The commented element in the pseudo-code in one of the popular answers

1) 流行答案之一的伪代码中的注释元素

result = query( "select smurfs from some_mushroom" );
// twiddle fingers
go_do_something_with_result( result );

is essentially bogus. If the thread is computing, then it's not twiddling thumbs, it's doing necessary work. If, on the other hand, it's simply waiting for the completion of IO, then it's not using CPU time; the whole point of the thread control infrastructure in the kernel is that the CPU will find something useful to do. The only way to "twiddle your thumbs" as suggested here would be to create a polling loop, and nobody who has coded a real webserver is inept enough to do that.

本质上是站不住脚的。如果线程在计算，那它就不是在闲得摆弄拇指，而是在做必要的工作。另一方面，如果它只是在等待 IO 完成，那它并没有占用 CPU 时间；内核中线程控制基础设施的全部意义就在于 CPU 会找到有用的事情去做。要像这里说的那样“摆弄拇指”，唯一的办法是写一个轮询循环，而写过真正 Web 服务器的人没有谁会笨到这么做。

2) "Threads are hard", only makes sense in the context of data sharing. If you have essentially independent threads such as is the case when handling independent web requests, then threading is trivially simple, you just code up the linear flow of how to handle one job, and sit pretty knowing that it will handle multiple requests, and each will be effectively independent. Personally, I would venture that for most programmers, learning the closure/callback mechanism is more complex than simply coding the top-to-bottom thread version. (But yes, if you have to communicate between the threads, life gets really hard really fast, but then I'm unconvinced that the closure/callback mechanism really changes that, it just restricts your options, because this approach is still achievable with threads. Anyway, that's a whole other discussion that's really not relevant here).

2)“线程很难”只有在数据共享的上下文中才成立。如果线程之间本质上相互独立，比如处理相互独立的 Web 请求时，那么使用线程非常简单：你只需编写处理一项工作的线性流程，并放心地知道它能处理多个请求，而且每个请求实际上都是独立的。就我个人而言，我敢说对大多数程序员来说，学习闭包/回调机制比直接编写自上而下的线程版本更复杂。(不过确实，如果必须在线程之间通信，情况会迅速变得非常棘手。但我并不认为闭包/回调机制真的改变了这一点，它只是限制了你的选择，因为这种方法用线程同样可以实现。无论如何，那是另一个与此无关的讨论了。)

3) So far, nobody has presented any real evidence as to why one particular type of context switch would be more or less time consuming than any other type. My experience in creating multi-tasking kernels (on a small scale for embedded controllers, nothing so fancy as a "real" OS) suggests that this would not be the case.

3) 到目前为止,没有人提出任何真实的证据来说明为什么一种特定类型的上下文切换会比任何其他类型更耗时或更短。我在创建多任务内核方面的经验(嵌入式控制器的小规模,没有什么比“真正的”操作系统更花哨)表明情况并非如此。

4) All the illustrations that I have seen to date that purport to show how much faster Node is than other webservers are horribly flawed, however, they're flawed in a way that does indirectly illustrate one advantage I would definitely accept for Node (and it's by no means insignificant). Node doesn't look like it needs (nor even permits, actually) tuning. If you have a threaded model, you need to create sufficient threads to handle the expected load. Do this badly, and you'll end up with poor performance. If there are too few threads, then the CPU is idle, but unable to accept more requests, create too many threads, and you will waste kernel memory, and in the case of a Java environment, you'll also be wasting main heap memory. Now, for Java, wasting heap is the first, best, way to screw up the system's performance, because efficient garbage collection (currently, this might change with G1, but it seems that the jury is still out on that point as of early 2013 at least) depends on having lots of spare heap. So, there's the issue, tune it with too few threads, you have idle CPUs and poor throughput, tune it with too many, and it bogs down in other ways.

4) 迄今为止我见过的所有声称 Node 比其他 Web 服务器快得多的演示都存在严重缺陷。不过，它们的缺陷倒是间接说明了 Node 的一个我肯定会承认的优势(而且绝非微不足道)：Node 看起来不需要(实际上甚至不允许)调优。如果采用线程模型，你需要创建足够多的线程来处理预期负载。做得不好就会导致性能低下：线程太少，CPU 空闲却无法接受更多请求；线程太多，则会浪费内核内存，在 Java 环境下还会浪费主堆内存。而对 Java 来说，浪费堆是搞垮系统性能的头号办法，因为高效的垃圾收集(目前如此，G1 可能会改变这一点，但至少截至 2013 年初似乎尚无定论)依赖于有大量空闲堆。所以问题就在这里：线程调得太少，CPU 空闲、吞吐量差；调得太多，又会以其他方式陷入困境。

5) There is another way in which I accept the logic of the claim that Node's approach "is faster by design", and that is this. Most thread models use a time-sliced context switch model, layered on top of the more appropriate (value judgement alert :) and more efficient (not a value judgement) preemptive model. This happens for two reasons: first, most programmers don't seem to understand priority preemption, and second, if you learn threading in a Windows environment, the timeslicing is there whether you like it or not (of course, this reinforces the first point; notably, the first versions of Java used priority preemption on Solaris implementations, and timeslicing in Windows. Because most programmers didn't understand and complained that "threading doesn't work in Solaris", they changed the model to timeslice everywhere). Anyway, the bottom line is that timeslicing creates additional (and potentially unnecessary) context switches. Every context switch takes CPU time, and that time is effectively removed from the work that can be done on the real job at hand. However, the amount of time invested in context switching because of timeslicing should not be more than a very small percentage of the overall time, unless something pretty outlandish is happening, and there's no reason I can see to expect that to be the case in a simple webserver. So, yes, the excess context switches involved in timeslicing are inefficient (and these don't happen in kernel threads as a rule, btw), but the difference will be a few percent of throughput, not the kind of whole-number factors that are implied in the performance claims that are often made for Node.

5) 我还能从另一个角度接受“Node 的方法在设计上更快”这一说法的逻辑，如下。大多数线程模型使用时间片上下文切换模型，叠加在更合适(价值判断警告:)也更高效(非价值判断)的抢占式模型之上。这有两个原因：第一，大多数程序员似乎不理解优先级抢占；第二，如果你是在 Windows 环境中学习线程的，不管你喜不喜欢，时间片就在那里(当然，这也强化了第一点；值得注意的是，Java 的早期版本在 Solaris 实现上用优先级抢占，在 Windows 上用时间片。由于大多数程序员不理解并抱怨“线程在 Solaris 上不起作用”，他们就把模型改成了处处时间片)。总之，结论是时间片会带来额外的(而且可能不必要的)上下文切换。每次上下文切换都要消耗 CPU 时间，而这些时间实际上就从手头真正能完成的工作中被扣掉了。不过，除非发生了相当离奇的事情，时间片造成的上下文切换开销占总时间的比例应该非常小，而且我看不出在一个简单的 Web 服务器中有什么理由预期会出现那种情况。所以，是的，时间片带来的多余上下文切换是低效的(顺便说一句，内核线程通常不会发生这种情况)，但差别只是吞吐量的百分之几，而不是 Node 的性能宣传里常常暗示的那种数倍的差距。

Anyway, apologies for that all being long and rambly, but I really feel that so far, the discussion hasn't proved anything, and I would be pleased to hear from someone in either of these situations:

无论如何,为所有冗长而漫不经心的事情道歉,但我真的觉得到目前为止,讨论还没有证明任何事情,我很高兴听到有人在这两种情况下的意见:

a) a real explanation of why Node should be better (beyond the two scenarios I've outlined above, the first of which (poor tuning) I believe is the real explanation for all the tests I've seen so far). [edit: actually, the more I think about it, the more I'm wondering if the memory used by vast numbers of stacks might be significant here. The default stack sizes for modern threads tend to be pretty huge, but the memory allocated by a closure-based event system would be only what's needed.]

a) 对 Node 为什么应该更好的真正解释(除了我上面概述的两种情形，其中第一种(调优不佳)我认为才是迄今为止我看到的所有测试结果的真正解释)。[编辑：实际上，我越想越怀疑大量线程栈占用的内存在这里可能很关键。现代线程的默认栈大小往往相当大，而基于闭包的事件系统分配的内存只有实际需要的那么多。]

b) a real benchmark that actually gives a fair chance to the threaded server of choice. At least that way, I'd have to stop believing that the claims are essentially false ;> ([edit] that's probably rather stronger than I intended, but I do feel that the explanations given for performance benefits are incomplete at best, and the benchmarks shown are unreasonable).

b) 一个真正的基准测试,它实际上为选择的线程服务器提供了公平的机会。至少那样,我不得不停止相信这些声明本质上是错误的;>([编辑]这可能比我预期的要强得多,但我确实觉得对性能优势的解释充其量是不完整的,并且显示的基准是不合理的)。

Cheers, Toby

干杯,托比

回答 by Alfred

What I don't understand is the point that Node.js still is using threads.

我不明白的是 Node.js 仍然在使用线程。

Ryan uses threads for the parts that are blocking (most of node.js uses non-blocking IO), because some parts are insanely hard to write non-blocking. But I believe Ryan's wish is to have everything non-blocking. On slide 63 (internal design) you see Ryan uses libev (a library that abstracts asynchronous event notification) for the non-blocking event loop. Because of the event loop, node.js needs fewer threads, which reduces context switching, memory consumption etc.

Ryan 使用线程来处理那些阻塞的部分(node.js 的大部分使用非阻塞 IO)，因为有些部分非常难写成非阻塞的。但我相信 Ryan 的愿望是让一切都非阻塞。在幻灯片 63(内部设计)上，您会看到 Ryan 使用 libev(一个抽象异步事件通知的库)来实现非阻塞的事件循环。由于事件循环，node.js 需要更少的线程，从而减少上下文切换、内存消耗等。

回答 by gawi

Threads are used only to deal with functions having no asynchronous facility, like stat().

线程仅用于处理没有异步功能的函数,例如stat().

The stat() function is always blocking, so node.js needs to use a thread to perform the actual call without blocking the main thread (event loop). Potentially, no thread from the thread pool will ever be used if you don't need to call those kinds of functions.

stat()函数始终处于阻塞状态,因此 node.js 需要使用线程来执行实际调用,而不阻塞主线程(事件循环)。如果您不需要调用这些类型的函数,则可能永远不会使用线程池中的任何线程。

回答 by BGerrissen

I know nothing about the internal workings of node.js, but I can see how using an event loop can outperform threaded I/O handling. Imagine a disc request: give me staticFile.x, and make it 100 requests for that file. Each request normally takes up a thread retrieving that file; that's 100 threads.

我对 node.js 的内部工作机制一无所知，但我能理解使用事件循环为什么能胜过线程式的 I/O 处理。想象一个磁盘请求：给我 staticFile.x，然后假设有 100 个对该文件的请求。每个请求通常都会占用一个线程去检索该文件，那就是 100 个线程。

Now imagine the first request creating one thread that becomes a publisher object; all 99 other requests first check whether there's a publisher object for staticFile.x. If so, they listen to it while it's doing its work; otherwise they start a new thread and thus a new publisher object.

现在想象第一个请求创建一个成为发布者对象的线程,所有 99 个其他请求首先查看是否有 staticFile.x 的发布者对象,如果有,则在它工作时监听它,否则启动一个新线程,因此新的发布者对象。

Once the single thread is done, it passes staticFile.x to all 100 listeners and destroys itself, so the next request creates a fresh new thread and publisher object.

单线程完成后,它会将 staticFile.x 传递给所有 100 个侦听器并销毁自身,因此下一个请求会创建一个全新的线程和发布者对象。

So it's 100 threads vs 1 thread in the above example, but also 1 disc lookup instead of 100 disc lookups; the gain can be quite phenomenal. Ryan is a smart guy!

所以在上面的例子中是 100 个线程 vs 1 个线程,但也是 1 个磁盘查找而不是 100 个磁盘查找,增益可能非常显着。瑞安是个聪明人!

Another way to look at it is one of his examples at the start of the movie. Instead of:

另一种看待方式是他在电影开头的一个例子。代替:

pseudo code:
result = query('select * from ...');

Again, 100 separate queries to a database versus...:

同样,对数据库的 100 个单独查询与...:

pseudo code:
query('select * from ...', function(result){
    // do stuff with result
});

If a query was already going, other equal queries would simply jump on the bandwagon, so you can have 100 queries in a single database roundtrip.

如果一个查询已经在进行,其他相同的查询将简单地跟上潮流,因此您可以在单个数据库往返中拥有 100 个查询。
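Both the publisher/listener idea and this bandwagon effect boil down to coalescing identical in-flight requests. A hedged sketch in modern promise-style JavaScript (runQuery and its 50 ms delay are stand-ins for a real backend, which is not assumed here):

```javascript
const inFlight = new Map();
let backendCalls = 0;

// hypothetical slow backend call
function runQuery(sql) {
  backendCalls++;
  return new Promise((resolve) =>
    setTimeout(() => resolve(`rows for: ${sql}`), 50));
}

// identical queries issued while one is in flight share its promise
function query(sql) {
  if (!inFlight.has(sql)) {
    inFlight.set(sql, runQuery(sql).finally(() => inFlight.delete(sql)));
  }
  return inFlight.get(sql);
}

async function main() {
  const results = await Promise.all(
    Array.from({ length: 100 }, () => query('select * from smurfs')));
  console.log(results.length, 'results from', backendCalls, 'backend call(s)');
  // → 100 results from 1 backend call(s)
}
main();
```

One hundred callers, one round trip: the `inFlight` map plays the role of the publisher object in the file example above.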