Original question: http://stackoverflow.com/questions/14795145/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How the single threaded non blocking IO model works in Node.js
Asked by Wenhao Ji
I'm not a Node programmer, but I'm interested in how the single-threaded non-blocking IO model works. After reading the article understanding-the-node-js-event-loop, I'm really confused about it. It gives an example of the model:
c.query(
  'SELECT SLEEP(20);',
  function (err, results, fields) {
    if (err) {
      throw err;
    }
    res.writeHead(200, {'Content-Type': 'text/html'});
    res.end('<html><head><title>Hello</title></head><body><h1>Return from async DB query</h1></body></html>');
    c.end();
  }
);
Question: When there are two requests, A (which comes first) and B, and there is only a single thread, the server-side program will handle request A first: the SQL query is a sleep statement standing in for an I/O wait. The program is stuck waiting on I/O and cannot execute the code that renders the web page afterwards. Will the program switch to request B during the wait? In my opinion, because of the single-thread model, there is no way to switch from one request to another. But the title of the example code says that "everything runs in parallel except your code".
(P.S. I'm not sure whether I misunderstand the code, since I have never used Node.) How does Node switch from A to B during the wait? And can you explain Node's single-threaded non-blocking IO model in a simple way? I would appreciate it if you could help me. :)
Answered by Utaal
Node.js is built upon libuv, a cross-platform library that abstracts the APIs/syscalls for asynchronous (non-blocking) input/output provided by the supported OSes (Unix, OS X and Windows at least).
Asynchronous IO
In this programming model, open/read/write operations on devices and resources (sockets, filesystem, etc.) managed by the file-system don't block the calling thread (as they do in the typical synchronous C-like model); they just mark the process (in a kernel/OS-level data structure) to be notified when new data or events are available. In the case of a web-server-like app, the process is then responsible for figuring out which request/context the notified event belongs to and for continuing to process the request from there. Note that this necessarily means you'll be on a different stack frame from the one that originated the request to the OS, as the latter had to yield to the process's dispatcher in order for a single-threaded process to handle new events.
The problem with the model I described is that it is unfamiliar and hard for the programmer to reason about, as it is non-sequential in nature. "You need to make a request in function A and handle the result in a different function, where your locals from A are usually not available."
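To make that difficulty concrete, here is a minimal sketch of the style without closures. The names (`pending`, `startRequest`, `onNotified`) are made up for illustration, and the "OS notification" is simulated by a plain function call. The point is only that the request's state has to be parked in a table by hand and looked up again when the event arrives.

```javascript
// Hypothetical sketch: state is parked in a table because the function that
// started the request has already returned by the time the result arrives.
const pending = new Map(); // request id -> saved context
let nextId = 1;

function startRequest(user) {
  const id = nextId++;
  pending.set(id, { user });        // the local `user` must be saved by hand
  return id;                        // returns immediately, nothing has blocked
}

// Called later by the dispatcher when the (simulated) OS reports completion.
function onNotified(id, data) {
  const context = pending.get(id);  // work out which request this event belongs to
  pending.delete(id);
  return `${context.user}: ${data}`;
}

const a = startRequest('alice');
const b = startRequest('bob');
console.log(onNotified(b, 'rows for B')); // completions may arrive in any order
console.log(onNotified(a, 'rows for A'));
```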
Node's model (Continuation Passing Style and Event Loop)
Node tackles the problem by leveraging JavaScript's language features to make this model look a little more synchronous, inducing the programmer to employ a certain programming style. Every function that requests IO has a signature like function (... parameters ..., callback) and needs to be given a callback that will be invoked when the requested operation is completed (keep in mind that most of the time is spent waiting for the OS to signal the completion, time that can be spent doing other work). JavaScript's support for closures allows you to use variables you've defined in the outer (calling) function inside the body of the callback; this makes it possible to keep state between different functions that will be invoked independently by the Node runtime. See also Continuation Passing Style.
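A small sketch of that callback-plus-closure style, under stated assumptions: `fakeQuery` and `handleRequest` are hypothetical names, and `setImmediate` merely stands in for the OS completing an IO operation later. It is not how any real database client is implemented.

```javascript
// Simulated asynchronous IO: setImmediate stands in for the OS completing
// the operation at some later point.
function fakeQuery(sql, callback) {
  setImmediate(() => callback(null, [`result of ${sql}`]));
}

function handleRequest(requestId, done) {
  const startedAt = Date.now();               // a local of the outer function
  fakeQuery('SELECT 1', (err, rows) => {
    if (err) return done(err);
    // The closure still sees `requestId` and `startedAt`.
    done(null, { requestId, rows, waitedMs: Date.now() - startedAt });
  });
  // handleRequest returns here, before the callback has run.
}

handleRequest('A', (err, reply) => console.log(err || reply));
```

The callback gets the request's locals for free through the closure, so no hand-written lookup table is needed.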
Moreover, after invoking a function that spawns an IO operation, the calling function will usually return control to Node's event loop. This loop will invoke the next callback or function that was scheduled for execution (most likely because the corresponding event was notified by the OS). This allows the concurrent processing of multiple requests.
You can think of Node's event loop as somewhat similar to the kernel's dispatcher: the kernel schedules a blocked thread for execution once its pending IO is completed, while Node schedules a callback when the corresponding event has occurred.
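The control flow can be pictured with a toy loop like the one below. This is only an illustration with invented names (`ready`, `schedule`, `runLoop`, `handle`); real Node delegates this work to libuv, and the IO completion here is faked by queuing the callback straight away.

```javascript
// Toy event loop: a queue of ready callbacks drained by a single loop.
const ready = [];
const log = [];

function schedule(callback) {
  ready.push(callback);
}

function runLoop() {
  while (ready.length > 0) {
    const callback = ready.shift();
    callback();            // runs to completion, then control returns to the loop
  }
}

function handle(name) {
  log.push(`${name}: start IO`);
  // Pretend the IO finished; its completion callback is queued, not run inline.
  schedule(() => log.push(`${name}: IO done, send response`));
}

schedule(() => handle('A'));
schedule(() => handle('B'));
runLoop();
console.log(log);
```

Request B gets started before A's completion callback runs, even though a single loop executes everything.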
Highly concurrent, no parallelism
As a final remark, the phrase "everything runs in parallel except your code" does a decent job of capturing the point that Node allows your code to handle requests from hundreds of thousands of open sockets concurrently with a single thread, by multiplexing and sequencing all your JS logic in a single stream of execution (even though saying "everything runs in parallel" is probably not correct here; see Concurrency vs Parallelism - What is the difference?). This works pretty well for webapp servers, as most of the time is actually spent waiting for the network or disk (database / sockets) and the logic is not really CPU-intensive. That is to say: this works well for IO-bound workloads.
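The flip side, concurrency without parallelism, can be seen in a short sketch (the names `busyWork`, `order` and `finished` are just for illustration): a timer that is already due still has to wait until the synchronous, CPU-bound JavaScript has finished.

```javascript
// A timer that is due "immediately" still cannot run while synchronous
// JavaScript is executing: only one thread runs your code.
const order = [];

function busyWork(iterations) {
  let sum = 0;
  for (let i = 0; i < iterations; i++) sum += i;   // CPU-bound, blocks the loop
  return sum;
}

const finished = new Promise((resolve) => {
  setTimeout(() => {
    order.push('timer callback');
    resolve(order);
  }, 0);
});

busyWork(1e7);
order.push('busy work finished');

finished.then((result) => console.log(result));
```

This is why the model suits IO-bound servers and not CPU-heavy handlers.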
Answered by user568109
Well, to give some perspective, let me compare Node.js with Apache.
Apache is a multi-threaded HTTP server: for each and every request that the server receives, it creates a separate thread which handles that request.
Node.js, on the other hand, is event-driven, handling all requests asynchronously from a single thread.
When A and B are received on Apache, two threads are created to handle the requests. Each handles its query separately, and each waits for the query results before serving the page. The page is only served once the query has finished. The query fetch is blocking because the server cannot execute the rest of the thread until it receives the result.
In Node, c.query is handled asynchronously, which means that while c.query fetches the results for A, Node jumps to handle c.query for B, and when the results for A arrive it passes them to the callback, which sends the response. Node.js knows to execute the callback when the fetch finishes.
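Here is a rough simulation of the A/B scenario. `fakeQuery` is a hypothetical stand-in for c.query that "answers" after a given number of milliseconds, and `serve` is likewise invented for the demo. Nothing here talks to a real database.

```javascript
// Stand-in for c.query: the "database" answers after `ms` milliseconds.
function fakeQuery(ms, callback) {
  setTimeout(() => callback(null, `slept ${ms} ms`), ms);
}

function serve(requests) {
  const events = [];
  return new Promise((resolve) => {
    let remaining = requests.length;
    for (const { name, ms } of requests) {
      events.push(`${name} accepted`);
      fakeQuery(ms, (err, result) => {
        events.push(`${name} answered (${result})`);
        if (--remaining === 0) resolve(events);
      });
    }
  });
}

serve([{ name: 'A', ms: 50 }, { name: 'B', ms: 10 }])
  .then((events) => console.log(events));
```

B is accepted while A's query is still pending, and B is even answered first because its query is shorter.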
"In my opinion, because it's a single thread model, there is no way to switch from one request to another."
Actually, the Node server does exactly that for you all the time. To make these switches possible (the asynchronous behavior), most functions that you would use take callbacks.
Edit
The SQL query is taken from the mysql library. It implements the callback style as well as an event emitter to queue SQL requests. It does not execute them asynchronously itself; that is done by the internal libuv threads that provide the abstraction of non-blocking I/O. The following steps happen when making a query:
- Open a connection to the db; the connection itself can be made asynchronously.
- Once the db is connected, the query is passed on to the server. Queries can be queued.
- The main event loop gets notified of the completion through a callback or event.
- The main loop executes your callback/event handler.
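The steps above can be sketched with a toy client. `FakeConnection` is a hypothetical class written for this illustration and is not the mysql library's API; `setImmediate` stands in for both the connection handshake and the server's reply.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a database client (not the mysql library's API).
class FakeConnection extends EventEmitter {
  constructor() {
    super();
    this.connected = false;
    this.queue = [];
    // Step 1: the connection is established asynchronously.
    setImmediate(() => {
      this.connected = true;
      this.emit('connect');
      this.drain();
    });
  }

  // Step 2: queries are queued until the connection can send them.
  query(sql, callback) {
    this.queue.push({ sql, callback });
    if (this.connected) this.drain();
  }

  drain() {
    while (this.queue.length > 0) {
      const { sql, callback } = this.queue.shift();
      // Steps 3 and 4: completion is reported later, and the callback
      // then runs on the main loop.
      setImmediate(() => callback(null, `rows for: ${sql}`));
    }
  }
}

const c = new FakeConnection();
c.on('connect', () => console.log('connected'));
c.query('SELECT 1', (err, rows) => console.log(rows));
c.query('SELECT 2', (err, rows) => console.log(rows));
```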
The incoming requests to the http server are handled in a similar fashion. The internal thread architecture is something like this:
[Image: diagram of the internal thread architecture]
The C++ threads are the libuv ones which do the asynchronous I/O (disk or network). The main event loop continues to execute after dispatching the request to the thread pool. It can accept more requests, as it does not wait or sleep. SQL queries/HTTP requests/file system reads all happen this way.
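One way to watch work being handed off to the pool is `crypto.pbkdf2`, which the Node documentation lists among the APIs that run on libuv's threadpool. As far as I know the pool mainly serves file system calls and a few CPU-heavy functions like this one, while sockets are usually watched through the OS's own readiness mechanism, so treat the sketch as an illustration of the hand-off only. `deriveKey` and `timeline` are names invented for the demo.

```javascript
const crypto = require('crypto');

// crypto.pbkdf2 performs the key derivation off the main thread and
// delivers the result to the callback on the main event loop.
function deriveKey(password) {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2(password, 'salt', 100000, 32, 'sha256', (err, key) => {
      if (err) reject(err);
      else resolve(key.toString('hex'));
    });
  });
}

const timeline = [];
const pendingKey = deriveKey('secret').then((hex) => {
  timeline.push('key derived');
  return hex;
});
timeline.push('main thread kept going');   // runs before the key is ready

pendingKey.then(() => console.log(timeline));
```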
Answered by Tiago
Node.js uses libuv behind the scenes. libuv has a thread pool (of size 4 by default). Therefore Node.js does use threads to achieve concurrency.
However, your code runs on a single thread (i.e., all of the callbacks of Node.js functions will be called on the same thread, the so-called loop-thread or event-loop). When people say "Node.js runs on a single thread" they are really saying "the callbacks of Node.js run on a single thread".
Answered by pspi
Node.js is based on the event loop programming model. The event loop runs in a single thread and repeatedly waits for events and then runs any event handlers subscribed to those events. Events can be, for example:
- a timer wait is complete
- the next chunk of data is ready to be written to this file
- there's a fresh new HTTP request coming our way
All of this runs in a single thread, and no JavaScript code is ever executed in parallel. As long as these event handlers are small and wait for yet more events themselves, everything works out nicely. This allows multiple requests to be handled concurrently by a single Node.js process.
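To illustrate what "small handlers" buys you, here is a sketch that splits a long computation into slices and yields to the event loop between them with `setImmediate`. The function `sumInChunks` and the `trace` array are invented for this example.

```javascript
// Sum 0..n-1 in small slices, yielding to the event loop between slices so
// that other queued events can be handled in between.
function sumInChunks(n, chunkSize, onDone) {
  let i = 0;
  let total = 0;
  function step() {
    const end = Math.min(i + chunkSize, n);
    for (; i < end; i++) total += i;
    if (i < n) setImmediate(step);   // small handler: schedule the rest and return
    else onDone(total);
  }
  step();
}

const trace = [];
setImmediate(() => trace.push('other event handled'));
sumInChunks(1000000, 1000, (total) => {
  trace.push(`sum finished: ${total}`);
  console.log(trace);
});
```

The other event is handled long before the sum completes, because each slice returns to the loop instead of hogging it.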
(There's a little bit of magic under the hood as to where the events originate. Some of it involves low-level worker threads running in parallel.)
In this SQL case, there are a lot of things (events) happening between making the database query and getting its results in the callback. During that time the event loop keeps pumping life into the application and advancing other requests one tiny event at a time. Therefore multiple requests are being served concurrently.
According to: "Event loop from 10,000ft - core concept behind Node.js".
Answered by dhiraj suvarna
The function c.query() takes two arguments:
c.query("Fetch Data", "Post-Processing of Data")
The operation "Fetch Data" in this case is a DB query. Node.js may handle this by spawning off a worker thread and giving it the task of performing the DB query. (Remember that Node.js can create threads internally.) This enables the function to return instantaneously, without any delay.
The second argument, "Post-Processing of Data", is a callback function; the Node framework registers this callback, and the event loop calls it.
Thus the statement c.query(parameter1, parameter2) will return instantaneously, enabling Node to cater for another request.
P.S.: I have just started to understand Node. Actually, I wanted to write this as a comment to @Philip, but since I didn't have enough reputation points I wrote it as an answer.
Answered by Gal Ben-Haim
If you read a bit further: "Of course, on the backend, there are threads and processes for DB access and process execution. However, these are not explicitly exposed to your code, so you can't worry about them other than by knowing that I/O interactions e.g. with the database, or with other processes will be asynchronous from the perspective of each request since the results from those threads are returned via the event loop to your code."
About "everything runs in parallel except your code": your code is executed synchronously. Whenever you invoke an asynchronous operation such as waiting for IO, the event loop handles everything and invokes the callback. It's just not something you have to think about.
In your example, there are two requests, A (which comes first) and B. You execute request A, and your code continues to run synchronously and executes request B. The event loop handles request A; when it finishes, it invokes the callback of request A with the result. The same goes for request B.
Answered by Robert Siemer
Okay, most things should be clear so far... the tricky part is the SQL: if it is not in reality running in another thread or process in its entirety, the SQL execution has to be broken down into individual steps (by an SQL processor made for asynchronous execution!), where the non-blocking ones are executed, and the blocking ones (e.g. the sleep) actually can be transferred to the kernel (as an alarm interrupt/event) and put on the event list for the main loop.
That means, e.g., that the interpretation of the SQL, etc. is done immediately, but during the wait (stored by the kernel as an event to come in the future in some kqueue, epoll, ... structure, together with the other IO operations) the main loop can do other things and eventually check whether something has happened with those IOs and waits.
So, to rephrase it again: the program is never (allowed to get) stuck, and sleeping calls are never executed. Their duty is done by the kernel (writing something, waiting for something to come over the network, waiting for time to elapse) or by another thread or process. The Node process checks whether at least one of those duties has been finished by the kernel in the only blocking call to the OS, made once in each event-loop cycle. That point is reached when everything non-blocking is done.
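As a rough picture of "one blocking call per loop cycle", here is a toy model with a virtual clock. `runToyLoop` is an invented function; a real loop would block in epoll/kqueue or a similar call instead of jumping a counter forward, and the events would be real IO completions and not just named timers.

```javascript
// Toy model: make a single "blocking" wait until the next pending event is
// due, then run everything that is ready. Time is virtual here.
function runToyLoop(timers) {
  const pending = [...timers].sort((x, y) => x.dueAt - y.dueAt);
  const log = [];
  let now = 0;
  while (pending.length > 0) {
    // The only place the loop "blocks": wait until the earliest event is due.
    now = Math.max(now, pending[0].dueAt);
    // Then run every callback that is ready; none of them may block.
    while (pending.length > 0 && pending[0].dueAt <= now) {
      const timer = pending.shift();
      log.push(`t=${now}: ${timer.name}`);
    }
  }
  return log;
}

console.log(runToyLoop([
  { name: 'request A: SLEEP(20) finished', dueAt: 20 },
  { name: 'request B: SLEEP(5) finished', dueAt: 5 },
]));
```

The sleep itself is never executed by the program; it only shows up as a due time that the loop waits for.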
Clear? :-)
I don't know Node. But where does the c.query come from?

