Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/22644328/

Time: 2020-09-02 17:04:31  Source: igfitidea

When is the thread pool used?

Tags: node.js, events

Asked by Haney

So I have an understanding of how Node.js works: it has a single listener thread that receives an event and then delegates it to a worker pool. The worker thread notifies the listener once it completes the work, and the listener then returns the response to the caller.

My question is this: if I stand up an HTTP server in Node.js and call sleep on one of my routed path events (such as "/test/sleep"), the whole system comes to a halt. Even the single listener thread. But my understanding was that this code is happening on the worker pool.

Now, by contrast, when I use Mongoose to talk to MongoDB, DB reads are an expensive I/O operation. Node seems to be able to delegate the work to a thread and receive the callback when it completes; the time taken to load from the DB does not seem to block the system.

How does Node.js decide to use a thread pool thread vs the listener thread? Why can't I write event code that sleeps and only blocks a thread pool thread?

Answered by Jason

Your understanding of how node works isn't correct... but it's a common misconception, because the reality of the situation is actually fairly complex, and is typically boiled down to pithy little phrases like "node is single threaded" that over-simplify things.

For the moment, we'll ignore explicit multi-processing/multi-threading through cluster and webworker-threads, and just talk about typical non-threaded node.

Node runs in a single event loop. It's single threaded, and you only ever get that one thread. All of the javascript you write executes in this loop, and if a blocking operation happens in that code, then it will block the entire loop and nothing else will happen until it finishes. This is the typically single threaded nature of node that you hear so much about. But, it's not the whole picture.

Certain functions and modules, usually written in C/C++, support asynchronous I/O. When you call these functions and methods, they internally manage passing the call on to a worker thread. For instance, when you use the fs module to request a file, the fs module passes that call on to a worker thread, and that worker waits for its response, which it then presents back to the event loop that has been churning on without it in the meantime. All of this is abstracted away from you, the node developer, and some of it is abstracted away from the module developers through the use of libuv.
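
To make the hand-off described above concrete, here is a minimal sketch, not a definitive illustration of node's internals. It assumes a scratch file in the OS temp directory (the file name is made up for the demo) and only shows the observable effect: the main script keeps going while the read is in flight, and the callback runs later on the event loop.

```javascript
// Sketch: the event loop keeps running while an asynchronous fs call is in flight.
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical scratch file, created only for this demonstration.
const file = path.join(os.tmpdir(), `threadpool-demo-${process.pid}.txt`);
fs.writeFileSync(file, "hello from disk");

const order = [];

// The read is handed off by the fs module; the callback runs later on the event loop.
fs.readFile(file, "utf8", (err, text) => {
  if (err) throw err;
  order.push("read finished");
  console.log(order.join(" -> "), "| contents:", text);
  fs.unlinkSync(file); // clean up the scratch file
});

// This line runs first: readFile returned immediately instead of waiting for the disk.
order.push("main script continued");
```

The log line lists "main script continued" before "read finished", because the callback cannot run until the top-level script has returned control to the event loop.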

As pointed out by Denis Dollfus in the comments (from this answer to a similar question), the strategy used by libuv to achieve asynchronous I/O is not always a thread pool; specifically, in the case of the http module a different strategy appears to be used at this time. For our purposes here it's mainly important to note how the asynchronous context is achieved (by using libuv) and that the thread pool maintained by libuv is one of multiple strategies offered by that library to achieve asynchronicity.

On a mostly related tangent, there is a much deeper analysis of how node achieves asynchronicity, and some related potential problems and how to deal with them, in this excellent article. Most of it expands on what I've written above, but additionally it points out:

  • Any external module that you include in your project that makes use of native C++ and libuv is likely to use the thread pool (think: database access)
  • libuv has a default thread pool size of 4, and uses a queue to manage access to the thread pool - the upshot is that if you have 5 long-running DB queries all going at the same time, one of them (and any other asynchronous action that relies on the thread pool) will be waiting for those queries to finish before they even get started
  • You can mitigate this by increasing the size of the thread pool through the UV_THREADPOOL_SIZE environment variable, so long as you do it before the thread pool is required and created: process.env.UV_THREADPOOL_SIZE = 10;
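
The queueing effect from the list above can be observed with a rough sketch like the following. It assumes crypto.pbkdf2 as the pool-bound workload (the password, salt and iteration count are arbitrary demo values) and assumes nothing has touched the thread pool before the environment variable is set. Setting the variable from inside the script is reported not to take effect on every platform, so treat this as an experiment rather than a guarantee.

```javascript
// Sketch: watch five pool-bound tasks compete for the libuv thread pool.
// Assumption: the pool has not been created yet, so this setting can still take effect.
process.env.UV_THREADPOOL_SIZE = process.env.UV_THREADPOOL_SIZE || "4";

const crypto = require("crypto");

const TASKS = 5; // one more than the default pool size of 4
const start = Date.now();
const finished = [];

for (let i = 1; i <= TASKS; i++) {
  // pbkdf2 with a callback does its hashing off the main thread.
  crypto.pbkdf2("password", "salt", 200000, 64, "sha512", (err) => {
    if (err) throw err;
    finished.push(i);
    console.log(`task ${i} done after ${Date.now() - start} ms`);
  });
}
```

With a pool of 4 on a machine with enough cores, four tasks typically finish at about the same time and the fifth noticeably later. Raising UV_THREADPOOL_SIZE to 5 or more before starting node should make that gap disappear.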


If you want traditional multi-processing or multi-threading in node, you can get it through the built-in cluster module or various other modules such as the aforementioned webworker-threads, or you can fake it by implementing some way of chunking up your work and manually using setTimeout or setImmediate or process.nextTick to pause your work and continue it in a later loop to let other processes complete (but that's not recommended).
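
For what the "chunking" workaround might look like, here is a small sketch. The helper name sumInChunks and its chunkSize parameter are hypothetical, not part of any node API. It uses setImmediate rather than process.nextTick, because nextTick callbacks run before the event loop moves on to pending I/O, so they would not actually give other work a turn.

```javascript
// Sketch: sum a large array in slices, yielding to the event loop between slices.
// sumInChunks and chunkSize are hypothetical names, not part of any Node API.
function sumInChunks(numbers, chunkSize, done) {
  let index = 0;
  let total = 0;

  function processChunk() {
    const end = Math.min(index + chunkSize, numbers.length);
    for (; index < end; index++) {
      total += numbers[index];
    }
    if (index < numbers.length) {
      setImmediate(processChunk); // let pending I/O callbacks and timers run first
    } else {
      done(total);
    }
  }

  processChunk();
}

// Usage: one million small numbers, processed 50,000 at a time.
const data = Array.from({ length: 1000000 }, (_, i) => i % 10);
sumInChunks(data, 50000, (total) => console.log("total:", total));
```

Each slice still blocks the loop while it runs, so the chunk size is a trade-off between throughput and responsiveness.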

Please note, if you're writing long running/blocking code in javascript, you're probably making a mistake. Other languages will perform much more efficiently.

Answered by Peter Lyons

So I have an understanding of how Node.js works: it has a single listener thread that receives an event and then delegates it to a worker pool. The worker thread notifies the listener once it completes the work, and the listener then returns the response to the caller.

This is not really accurate. Node.js has only a single "worker" thread that does javascript execution. There are threads within node that handle IO processing, but to think of them as "workers" is a misconception. They are really just IO handling and a few other details of node's internal implementation, and as a programmer you cannot influence their behavior other than through a few misc parameters such as MAX_LISTENERS.

My question is this: if I stand up an HTTP server in Node.js and call sleep on one of my routed path events (such as "/test/sleep"), the whole system comes to a halt. Even the single listener thread. But my understanding was that this code is happening on the worker pool.

There is no sleep mechanism in JavaScript. We could discuss this more concretely if you posted a code snippet of what you think "sleep" means. There's no such function to call to simulate something like time.sleep(30) in python, for example. There's setTimeout but that is fundamentally NOT sleep. setTimeout and setInterval explicitly release, not block, the event loop so other bits of code can execute on the main execution thread. The only thing you can do is busy loop the CPU with in-memory computation, which will indeed starve the main execution thread and render your program unresponsive.
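
The difference between a timer and a busy loop is easy to see in a few lines. This is only a sketch: busyWait is a hypothetical helper written for the demo, and the 10 ms and 200 ms figures are arbitrary.

```javascript
// Sketch: a timer is delayed by a busy loop, because both need the same thread.
// busyWait is a hypothetical helper; it burns CPU, which is what to avoid in a server.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else can run on the main thread meanwhile
  }
}

const scheduledAt = Date.now();

// Asks to run after 10 ms, without blocking anything.
setTimeout(() => {
  console.log(`timer fired after ${Date.now() - scheduledAt} ms (asked for 10)`);
}, 10);

busyWait(200); // blocks the event loop, so the timer cannot fire until this returns
console.log("busy loop finished");
```

The timer callback reports a delay of roughly 200 ms or more instead of 10, since it can only run once the busy loop has handed the thread back.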

How does Node.js decide to use a thread pool thread vs the listener thread? Why can't I write event code that sleeps and only blocks a thread pool thread?

Network IO is always asynchronous. End of story. Disk IO has both synchronous and asynchronous APIs, so there is no "decision". Node.js simply behaves according to which core API function you call: the sync variant or the normal async one. For example: fs.readFile vs fs.readFileSync. For child processes, there are also separate child_process.exec and child_process.execSync APIs.
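
Side by side, the two fs variants named above behave like this. The sketch assumes a throwaway file in the OS temp directory (the file name and contents are invented for the demo).

```javascript
// Sketch: the same read done with the sync API and with the async API.
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical scratch file for the comparison.
const file = path.join(os.tmpdir(), `sync-vs-async-${process.pid}.txt`);
fs.writeFileSync(file, "config=1");

// Sync variant: blocks the main thread until the data is available, then returns it.
const syncText = fs.readFileSync(file, "utf8");
console.log("sync read:", syncText);

// Async variant: returns immediately; the result arrives in the callback later.
fs.readFile(file, "utf8", (err, asyncText) => {
  if (err) throw err;
  console.log("async read:", asyncText);
  fs.unlinkSync(file); // clean up the scratch file
});

console.log("this prints before the async read completes");
```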

The rule of thumb is to always use the asynchronous APIs. The valid reasons to use the sync APIs are initialization code in a network service before it starts listening for connections, and simple scripts that do not accept network requests, such as build tools and that kind of thing.

Answered by Gregory R. Sudderth

This misunderstanding is merely the difference between pre-emptive multi-tasking and cooperative multitasking...

The sleep turns off the entire carnival because there is really one line to all the rides, and you closed the gate. Think of it as "a JS interpreter and some other things" and ignore the threads...for you, there is only one thread, ...

...so don't block it.

Answered by Lord

The thread pool: how, when, and by whom it is used:

First off, when we run Node on a computer, it starts a process, called the node process, alongside the computer's other processes, and that process keeps running until it exits or you kill it. Our JavaScript runs on one thread inside this process, and that is the so-called single thread.

Because there is only a single thread, it is easy to block a Node application, but this is one of the unique features that Node.js brings to the table. So, again, when you run your Node application, it runs in just a single thread, no matter whether you have one user or a million users accessing it at the same time.

So let's understand exactly what happens in the single thread of Node.js when you start your Node application. First the program is initialized, then all the top-level code is executed, which means all the code that is not inside any callback function (remember, all code inside callback functions is executed by the event loop).

After that, the code of all required modules is executed, then all the callbacks are registered, and finally the event loop starts for your application.
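
The startup order described above (top-level code first, callbacks afterwards) can be checked with a tiny sketch. The order array and its labels are made up for the demo.

```javascript
// Sketch: top-level code runs to completion before any callback runs.
const order = [];

order.push("top-level: start");

// Registered now, executed later by the event loop.
setTimeout(() => {
  order.push("callback: timer");
  console.log(order.join(" -> "));
}, 0);

order.push("top-level: end");
```

Even with a delay of 0, the timer callback is listed last: it is only registered during startup and cannot run until the top-level code has finished.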

So, as we discussed before, all the callback functions and the code inside them execute under the event loop. In the event loop, the work is distributed across different phases. Anyway, I'm not going to discuss the event loop here.

Well, for the sake of a better understanding of the thread pool, I am asking you to imagine that in the event loop, the code inside one callback function executes only after the code inside another callback function has finished executing. Now, if some tasks are too heavy, they would block our Node.js single thread. That's where the thread pool comes in, which, just like the event loop, is provided to Node.js by the libuv library.

So the thread pool is not a part of Node.js itself; it is provided by libuv so that heavy duties can be offloaded to it. libuv executes that work on its own threads and, after execution, returns the results to the event loop.

The thread pool gives us four additional threads by default, which are completely separate from the main single thread. We can configure it up to 128 threads (libuv releases since 1.30.0 raise that limit to 1024).

All these threads together form a thread pool, and the event loop can then automatically offload heavy tasks to the thread pool.

The fun part is all this happens automatically behind the scenes. It's not us developers who decide what goes to the thread pool and what doesn't.

Many tasks go to the thread pool, such as:

-> All operations dealing with files
-> Everything related to cryptography, like hashing passwords
-> All compression stuff
-> DNS lookups
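
As a rough illustration of the categories listed above, the following sketch starts one asynchronous call from each family. The report helper and the input values are hypothetical; the point is only that all four calls return immediately and deliver their results through callbacks later.

```javascript
// Sketch: one asynchronous call from each category that is commonly pool-backed.
const fs = require("fs");
const os = require("os");
const crypto = require("crypto");
const zlib = require("zlib");
const dns = require("dns");

const completed = [];

// Hypothetical helper: builds a callback that records which task finished.
const report = (name) => (err) => {
  completed.push(name);
  console.log(name, err ? `failed: ${err.message}` : "done");
};

fs.stat(os.tmpdir(), report("fs.stat"));                                          // files
crypto.pbkdf2("secret", "salt", 100000, 32, "sha256", report("crypto.pbkdf2"));  // cryptography
zlib.gzip(Buffer.alloc(1024 * 1024, "a"), report("zlib.gzip"));                  // compression
dns.lookup("localhost", report("dns.lookup"));                                   // DNS lookup

console.log("all four calls returned immediately");
```

The completion order is not fixed; it depends on how long each task takes and how busy the pool is.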