Which would be better for concurrent tasks on node.js? Fibers? Web workers? Or threads?

Disclaimer: This page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/10773564/


Which would be better for concurrent tasks on node.js? Fibers? Web-workers? or Threads?

multithreading, node.js, concurrency, web-worker, fibers

Asked by Parth Thakkar

I stumbled over node.js some time ago and I like it a lot. But soon I found out that it badly lacks the ability to perform CPU-intensive tasks. So, I started googling and got these answers to solve the problem: Fibers, Webworkers and Threads (thread-a-gogo). Now which one to use is confusing, and one of them definitely needs to be used - after all, what's the purpose of having a server which is just good at IO and nothing else? Suggestions needed!

UPDATE:

I have been thinking of an approach of late; I just need suggestions on it. Now, what I thought of was this: Let's have some threads (using thread_a_gogo or maybe webworkers). Now, when we need more of them, we can create more. But there will be some limit over the creation process (not imposed by the system, but probably because of overhead). Now, when we exceed the limit, we can fork a new node, and start creating threads over it. This way, it can go on till we reach some limit (after all, processes too have a big overhead). When this limit is reached, we start queuing tasks. Whenever a thread becomes free, it will be assigned a new task. This way, it can go on smoothly.

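For illustration, a rough sketch of this scheme with a capped pool of forked Node processes plus a task queue (worker.js and the message shape are hypothetical; processes stand in for the threads mentioned above to keep the sketch simple):

// pool.js - rough sketch of the idea above; worker.js is a hypothetical script
const { fork } = require('child_process');
const os = require('os');

const LIMIT = os.cpus().length; // cap on how many workers we create
const queue = [];               // tasks waiting for a free worker
const idle  = [];               // workers that are currently free
let spawned = 0;

function dispatch() {
  while (idle.length > 0 && queue.length > 0) {
    const worker = idle.pop();
    const { task, done } = queue.shift();
    worker.once('message', (result) => {
      idle.push(worker); // the worker is free again
      done(result);
      dispatch();        // pick up the next queued task, if any
    });
    worker.send(task);
  }
}

// Queue a task; spawn a new worker only while we are under the limit.
function runTask(task, done) {
  if (idle.length === 0 && spawned < LIMIT) {
    idle.push(fork('./worker.js')); // worker.js would do the CPU-heavy part
    spawned += 1;
  }
  queue.push({ task, done });
  dispatch();
}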

So, that was what I thought of. Is this idea good? I am a bit new to all this processes-and-threads stuff, so I don't have any expertise in it. Please share your opinions.

Thanks. :)

Answer by hasanyasin

Node has a completely different paradigm, and once it is correctly grasped, it is easier to see this different way of solving problems. You never need multiple threads in a Node application (1) because you have a different way of doing the same thing. You create multiple processes; but it is very, very different from, for example, how Apache Web Server's prefork MPM works.

For now, let's assume we have just one CPU core and we will develop an application (in Node's way) to do some work. Our job is to process a big file, running over its contents byte by byte. The best way for our software is to start the work from the beginning of the file and follow it byte by byte to the end.

-- Hey, Hasan, I suppose you are either a newbie or very old school from my Grandfather's time!!! Why don't you create some threads and make it much faster?

-- Oh, we have only one CPU core.

-- So what? Create some threads man, make it faster!

-- It does not work like that. If I create threads I will be making it slower. Because I will be adding a lot of overhead to the system for switching between threads, trying to give each of them a fair amount of time, and inside my process, trying to communicate between these threads. In addition to all these facts, I will also have to think about how I will divide a single job into multiple pieces that can be done in parallel.

-- Okay okay, I see you are poor. Let's use my computer, it has 32 cores!

-- Wow, you are awesome my dear friend, thank you very much. I appreciate it!

Then we get back to work. Now we have 32 CPU cores thanks to our rich friend. The rules we have to abide by have just changed. Now we want to utilize all this wealth we are given.

To use multiple cores, we need to find a way to divide our work into pieces that we can handle in parallel. If it was not Node, we would use threads for this; 32 threads, one for each cpu core. However, since we have Node, we will create 32 Node processes.

Threads can be a good alternative to Node processes, maybe even a better way; but only in a specific kind of job where the work is already defined and we have complete control over how to handle it. Other than this, for every other kind of problem where the job comes from outside in a way we do not have control over and we want to answer as quickly as possible, Node's way is unarguably superior.

-- Hey, Hasan, are you still working single-threaded? What is wrong with you, man? I have just provided you what you wanted. You have no excuses anymore. Create threads, make it run faster.

-- I have divided the work into pieces and every process will work on one of these pieces in parallel.

-- Why don't you create threads?

-- Sorry, I don't think that is usable here. You can take your computer back if you want.

-- No, it's okay, I am cool. I just don't understand why you don't use threads.

-- Thank you for the computer. :) I already divided the work into pieces and I create processes to work on these pieces in parallel. All the CPU cores will be fully utilized. I could do this with threads instead of processes; but Node has this way and my boss Parth Thakkar wants me to use Node.

-- Okay, let me know if you need another computer. :p

If I create 33 processes, instead of 32, the operating system's scheduler will be pausing one process, starting another one, pausing it after some cycles, starting the first one again... This is unnecessary overhead. I do not want it. In fact, on a system with 32 cores, I wouldn't even want to create exactly 32 processes; 31 can be nicer. Because it is not just my application that will work on this system. Leaving a little room for other things can be good, especially if we have 32 rooms.

I believe we are on the same page now about fully utilizing processors for CPU-intensive tasks.

-- Hmm, Hasan, I am sorry for mocking you a little. I believe I understand you better now. But there is still something I need an explanation for: What is all the buzz about running hundreds of threads? I read everywhere that threads are much faster to create and dump than forking processes? You fork processes instead of threads and you think it is the most you can get with Node. Then is Node not appropriate for this kind of work?

-- No worries, I am cool, too. Everybody says these things so I think I am used to hearing them.

-- So? Node is not good for this?

-- Node is perfectly good for this even though threads can be good too. As for thread/process creation overhead: on things that you repeat a lot, every millisecond counts. However, I create only 32 processes and it takes a tiny amount of time. It happens only once. It will not make any difference.

-- When do I want to create thousands of threads, then?

-- You never want to create thousands of threads. However, on a system that is doing work that comes from outside, like a web server processing HTTP requests, if you are using a thread for each request, you will be creating a lot of threads, a great many of them.

-- Node is different, though? Right?

-- Yes, exactly. This is where Node really shines. Just as a thread is much lighter than a process, a function call is much lighter than a thread. Node calls functions instead of creating threads. In the example of a web server, every incoming request causes a function call.

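A tiny illustration of that point: in a plain Node HTTP server, every incoming request just triggers one callback on the single main thread; nothing is spawned per request.

const http = require('http');

// One function call per incoming request; no thread or process is created for it.
http.createServer((req, res) => {
  res.end('handled by a function call on the main thread\n');
}).listen(8080);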

-- Hmm, interesting; but you can only run one function at the same time if you are not using multiple threads. How can this work when a lot of requests arrive at the web server at the same time?

-- You are perfectly right about how functions run, one at a time, never two in parallel. I mean in a single process, only one scope of code is running at a time. The OS Scheduler does not come and pause this function and switch to another one, unless it pauses the process to give time to another process, not another thread in our process. (2)

-- Then how can a process handle 2 requests at a time?

-- A process can handle tens of thousands of requests at a time as long as our system has enough resources (RAM, Network, etc.). How those functions run is THE KEY DIFFERENCE.

-- Hmm, should I be excited now?

-- Maybe :) Node runs a loop over a queue. In this queue are our jobs, i.e., the calls we started in order to process incoming requests. The most important point here is the way we design our functions to run. Instead of starting to process a request and making the caller wait until we finish the job, we quickly end our function after doing an acceptable amount of work. When we come to a point where we need to wait for another component to do some work and return us a value, instead of waiting for that, we simply finish our function, adding the rest of the work to the queue.

-- It sounds too complex?

-- No no, I might sound complex; but the system itself is very simple and it makes perfect sense.

Now I want to stop citing the dialogue between these two developers and finish my answer after a last quick example of how these functions work.

In this way, we are doing what the OS Scheduler would normally do. We pause our work at some point and let other function calls (like other threads in a multi-threaded environment) run until we get our turn again. This is much better than leaving the work to the OS Scheduler, which tries to give a fair amount of time to every thread on the system. We know what we are doing much better than the OS Scheduler does, and we are expected to stop when we should stop.

Below is a simple example where we open a file and read it to do some work on the data.

Synchronous Way:

Open File
Repeat This:    
    Read Some
    Do the work

Asynchronous Way:

Open File and Do this when it is ready: // Our function returns
    Repeat this:
        Read Some and when it is ready: // Returns again
            Do some work

As you see, our function asks the system to open a file and does not wait for it to be opened. It finishes itself by providing the next steps to run once the file is ready. When we return, Node runs the other function calls on the queue. After running over all the functions, the event loop moves to the next turn...

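For reference, a concrete Node sketch of that asynchronous pseudocode might look like this (the file name, chunk size and doSomeWork are placeholders); each callback returns quickly, so the event loop is free to run other queued jobs between chunks.

const fs = require('fs');

const CHUNK = 64 * 1024;            // read 64 KB at a time
const buffer = Buffer.alloc(CHUNK);

function doSomeWork(chunk) {
  // placeholder for the per-chunk processing
}

fs.open('big-file.dat', 'r', (err, fd) => {   // "Open File and do this when it is ready"
  if (err) throw err;

  const readSome = () => {                    // "Repeat this"
    fs.read(fd, buffer, 0, CHUNK, null, (err, bytesRead) => { // "Read some, and when it is ready"
      if (err) throw err;
      if (bytesRead === 0) {                  // end of file reached
        fs.close(fd, () => {});
        return;
      }
      doSomeWork(buffer.subarray(0, bytesRead)); // "Do some work"
      readSome();                             // schedule the next chunk
    });
  };

  readSome();
});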

In summary, Node has a completely different paradigm than multi-threaded development; but this does not mean that it lacks things. For a synchronous job (where we can decide the order and way of processing), it works as well as multi-threaded parallelism. For a job that comes from outside like requests to a server, it simply is superior.



(1) Unless you are building libraries in other languages like C/C++, in which case you still do not create threads for dividing jobs. For this kind of work you have two threads, one of which continues communication with Node while the other does the real work.

(2) In fact, every Node process has multiple threads, for the same reasons I mentioned in the first footnote. However, this is in no way like 1000 threads doing similar work. Those extra threads are for things like accepting IO events and handling inter-process messaging.

UPDATE (As reply to a good question in comments)

@Mark, thank you for the constructive criticism. In Node's paradigm, you should never have functions that take too long to process unless all other calls in the queue are designed to be run one after another. In the case of computationally expensive tasks, if we look at the complete picture, we see that this is not a question of "Should we use threads or processes?" but a question of "How can we divide these tasks in a well balanced manner into sub-tasks that we can run in parallel, employing multiple CPU cores on the system?" Let's say we will process 400 video files on a system with 8 cores. If we want to process one file at a time, then we need a system that will process different parts of the same file, in which case, maybe, a multi-threaded single-process system will be easier to build and even more efficient. We can still use Node for this by running multiple processes and passing messages between them when state-sharing/communication is necessary. As I said before, a multi-process approach with Node is as good as a multi-threaded approach for this kind of task; but not more than that. Again, as I told before, the situation where Node shines is when we have these tasks coming as input to the system from multiple sources, since keeping many connections open concurrently is much lighter in Node compared to a thread-per-connection or process-per-connection system.

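For illustration, a rough sketch of that multi-process division of work: the parent forks one worker per core and deals the files out over the message channel; encode-worker.js is a hypothetical script that would do the actual processing and report back.

const { fork } = require('child_process');
const os = require('os');

const files = [/* ...the 400 video file paths... */];
const workers = os.cpus().map(() => fork('./encode-worker.js')); // one worker process per core

// Deal the files out round-robin; each worker processes its own share in parallel.
files.forEach((file, i) => workers[i % workers.length].send({ file }));

// State-sharing/communication happens by passing messages back to the parent.
workers.forEach((w) => {
  w.on('message', (msg) => console.log('finished:', msg.file));
});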

As for setTimeout(..., 0) calls: sometimes giving a break during a time-consuming task, to allow calls in the queue to have their share of processing, can be required. Dividing tasks in different ways can save you from these; but still, this is not really a hack, it is just the way event queues work. Also, using process.nextTick for this aim is much better, since when you use setTimeout, calculation and checks of the time passed will be necessary, while process.nextTick is simply what we really want: "Hey task, go back to the end of the queue, you have used your share!"

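A small sketch of that slicing idea: a long computation is broken into chunks, and after each chunk the function goes back to the end of the queue so other calls get their share. The paragraph above recommends process.nextTick for that step; in current Node versions setImmediate is the call that yields to the rest of the event loop in this way, so the sketch uses it, but the structure is the same either way.

// Sum a huge array without hogging the event loop for the whole duration.
function sumInChunks(numbers, callback) {
  let total = 0;
  let i = 0;

  function doChunk() {
    const stop = Math.min(i + 10000, numbers.length); // handle 10k items per turn
    for (; i < stop; i++) total += numbers[i];

    if (i < numbers.length) {
      setImmediate(doChunk); // "go back to the end of the queue" and continue later
    } else {
      callback(total);       // all chunks done
    }
  }

  doChunk();
}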

Answer by rsp

(Update 2016: Web workers are going into Node.js v7 (originally io.js, a Node.js fork) - see below.)

(Update 2017: Web workers are not going into Node.js v7 or v8 - see below.)

(Update 2018: Web workers are going into Node.js, in Node v10.5.0 - see below.)

Some clarification

Having read the answers above, I would like to point out that there is nothing in web workers that is against the philosophy of JavaScript in general, and Node in particular, regarding concurrency. (If there were, it wouldn't even be discussed by the WHATWG, much less implemented in the browsers.)

You can think of a web worker as a lightweight microservice that is accessed asynchronously. No state is shared. No locking problems exist. There is no blocking. There is no synchronization needed. Just like when you use a RESTful service from your Node program you don't worry that it is now "multithreaded" because the RESTful service is not in the same thread as your own event loop. It's just a separate service that you access asynchronously and that is what matters.

The same is true of web workers. It's just an API to communicate with code that runs in a completely separate context, and whether it is in a different thread, a different process, a different cgroup, zone, container or a different machine is completely irrelevant, because of a strictly asynchronous, non-blocking API, with all data passed by value.

As a matter of fact, web workers are conceptually a perfect fit for Node, which - as many people are not aware - incidentally uses threads quite heavily, and in fact "everything runs in parallel except your code".

But the web workers don't even need to be implemented using threads. You could use processes, green threads, or even RESTful services in the cloud - as long as the web worker API is used. The whole beauty of the message passing API with call by value semantics is that the underlying implementation is pretty much irrelevant, as the details of the concurrency model will not get exposed.

A single-threaded event loop is perfect for I/O-bound operations. It doesn't work that well for CPU-bound operations, especially long running ones. For that we need to spawn more processes or use threads. Managing child processes and the inter-process communication in a portable way can be quite difficult and it is often seen as an overkill for simple tasks, while using threads means dealing with locks and synchronization issues that are very difficult to do right.

What is often recommended is to divide long-running CPU-bound operations into smaller tasks (something like the example in the "Original answer" section of my answer to Speed up setInterval) but it is not always practical and it doesn't use more than one CPU core.

I'm writing it to clarify the comments that were basically saying that web workers were created for browsers, not servers (forgetting that it can be said about pretty much everything in JavaScript).

Node modules

There are a few modules that are supposed to add Web Workers to Node, such as node-webworker and node-webworker-threads.

I haven't used any of them, but I have two quick observations that may be relevant: as of March 2015, node-webworker was last updated 4 years ago and node-webworker-threads was last updated a month ago. Also, I see in the example of node-webworker-threads usage that you can use a function instead of a file name as an argument to the Worker constructor, which seems like it may cause subtle problems if it is implemented using threads that share memory (unless the function is used only for its .toString() method and is otherwise compiled in a different environment, in which case it may be fine - I have to look more deeply into it, just sharing my observations here).

If there is any other relevant project that implements web workers API in Node, please leave a comment.

Update 1

I didn't know it yet at the time of writing but incidentally one day before I wrote this answer Web Workers were added to io.js.

(io.js is a fork of Node.js - see "Why io.js decided to fork Node.js", an InfoWorld interview with Mikeal Rogers, for more info.)

Not only does it prove the point that there is nothing in web workers that is against the philosophy of JavaScript in general and Node in particular regarding concurrency, but it may result in web workers being first-class citizens in server-side JavaScript like io.js (and possibly Node.js in the future), just as they already are in client-side JavaScript in all modern browsers.

Update 2

In Update 1 and my tweet I was referring to io.js pull request #1159, which now redirects to Node PR #1159, which was closed on Jul 8 and replaced with Node PR #2133 - which is still open. There is some discussion taking place under those pull requests that may provide some more up-to-date info on the status of Web Workers in io.js/Node.js.

Update 3

Latest info - thanks to NiCk Newman for posting it in the comments: there is the "workers: initial implementation" commit by Petka Antonov from Sep 6, 2015 that can be downloaded and tried out in this tree. See the comments by NiCk Newman for details.

Update 4

As of May 2016, the last comments on the still-open PR #2133 - "workers: initial implementation" - were 3 months old. On May 30, Matheus Moreira asked me to post an update to this answer in the comments below, and he asked for the current status of this feature in the PR comments.

The first answers in the PR discussion were skeptical, but later Ben Noordhuis wrote that "Getting this merged in one shape or another is on my todo list for v7".

All other comments seemed to second that, and as of July 2016 it seems that Web Workers should be available in the next version of Node, version 7.0, which is planned to be released in October 2016 (not necessarily in the form of this exact PR).

Thanks to Matheus Moreira for pointing it out in the comments and reviving the discussion on GitHub.

Update 5

As of July 2016, there are a few modules on npm that were not available before - for a complete list of relevant modules, search npm for workers, web workers, etc. If anything in particular does or doesn't work for you, please post a comment.

Update 6

As of January 2017, it is unlikely that web workers will get merged into Node.js.

The pull request #2133 "workers: initial implementation" by Petka Antonov from July 8, 2015 was finally closed by Ben Noordhuis on December 11, 2016, who commented that "multi-threading support adds too many new failure modes for not enough benefit" and that "we can also accomplish that using more traditional means like shared memory and more efficient serialization."

For more information see the comments on PR #2133 on GitHub.

Thanks again to Matheus Moreira for pointing it out in the comments.

Update 7

I'm happy to announce that a few days ago, in June 2018, web workers appeared in Node v10.5.0 as an experimental feature activated with the --experimental-worker flag.

For more info, see the worker_threads documentation.

Finally! I can make the 7th update to my 3 year old Stack Overflow answer where I argue that threading a la web workers is not against Node philosophy, only this time saying that we finally got it!

Answer by limplash

I come from the old school of thought where we used multi-threading to make software fast. For the past 3 years I have been using Node.js and am a big supporter of it. hasanyasin explained in detail how Node works and the concept of asynchronous functionality, but let me add a few things here.

Back in the old days, with single cores and lower clock speeds, we tried various ways to make software work fast and in parallel. In DOS days we used to run one program at a time. Then in Windows we started running multiple applications (processes) together. Concepts like preemptive and non-preemptive (or cooperative) scheduling were tested. We know now that preemptive scheduling was the answer for better multi-processing on single-core computers. Along came the concepts of processes/tasks and context switching. Then came the concept of threads, to further reduce the burden of process context switching. Threads were coined as a lightweight alternative to spawning new processes.

So, like it or not, single thread or not, multi-core or single core, your processes will be preempted and time-sliced by the OS.

Nodejs is a single process and provides an async mechanism. Here, jobs are dispatched to the underlying OS to perform tasks while we wait in an event loop for the task to finish. Once we get a green signal from the OS, we perform whatever we need to do. In a way this is cooperative/non-preemptive multi-tasking, so we should never block the event loop for a very long period of time, otherwise we will degrade our application very fast.
So if there is ever a task that is blocking in nature or very time-consuming, we will have to branch it out to the preemptive world of the OS and threads. There are good examples of this in the libuv documentation. Also, if you read the documentation further, you will find that file I/O is handled in threads in node.js.

So firstly, it's all in the design of our software. Secondly, context switching is always happening, no matter what they tell you. Threads are there, and are still there, for a reason: they are faster to switch between than processes.

Under the hood in node.js it's all C++ and threads. And Node provides a C++ way to extend its functionality and to further speed things up by using threads where they are a must, i.e., for blocking tasks such as reading from a source, writing to a source, large data analysis and so on.

I know hasanyasin's answer is the accepted one, but for me threads will exist no matter what you say or how you hide them behind scripts. Secondly, no one breaks things into threads just for speed; it is mostly done for blocking tasks. And threads are in the backbone of Node.js, so completely bashing multi-threading is incorrect. Also, threads are different from processes, and the limitation of one Node process per core doesn't exactly apply to the number of threads; threads are like sub-tasks of a process. In fact, threads won't show up in your Windows Task Manager or the Linux top command. Once again, they are more lightweight than processes.

Answer by lanzz

I'm not sure if web workers are relevant in this case; they are client-side tech (they run in the browser), while node.js runs on the server. Fibers, as far as I understand, are also blocking, i.e. they are voluntary multitasking, so you could use them, but you should manage context switches yourself via yield. Threads might actually be what you need, but I don't know how mature they are in node.js.

Answer by motss

worker_threads has been implemented and shipped behind a flag in node@10.5.0. It's still an initial implementation, and more effort is needed to make it more efficient in future releases. Worth giving it a try in the latest Node.

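For reference, a minimal worker_threads sketch (a single file that spawns itself as the worker; on 10.5.x it has to be run with node --experimental-worker):

// worker-demo.js
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

if (isMainThread) {
  // Main thread: spawn a worker thread and keep the event loop free.
  const worker = new Worker(__filename, { workerData: 40 });
  worker.on('message', (result) => console.log('fib(40) =', result));
  worker.on('error', (err) => console.error(err));
} else {
  // Worker thread: do the CPU-heavy part and post the result back.
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData));
}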

Answer by genericdave

In many Node developers' opinions one of the best parts of Node is actually its single-threaded nature. Threads introduce a whole slew of difficulties with shared resources that Node completely avoids by doing nothing but non-blocking IO.

That's not to say that Node is limited to a single thread. It's just that the method for getting threaded concurrency is different from what you're looking for. The standard way to deal with threads is with the cluster module that comes standard with Node itself. It's a simpler approach to threads than manually dealing with them in your code.

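A minimal cluster sketch (note that cluster forks worker processes, one per core here, all sharing the same listening port):

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker process per CPU core.
  os.cpus().forEach(() => cluster.fork());
} else {
  // Every worker runs its own server; connections are distributed among them.
  http.createServer((req, res) => {
    res.end(`handled by worker ${process.pid}\n`);
  }).listen(8080);
}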

For dealing with asynchronous programming in your code (as in, avoiding nested callback pyramids), the Future component in the Fibers library is a decent choice. I would also suggest you check out Asyncblock, which is based on Fibers. Fibers are nice because they allow you to hide callbacks by duplicating the stack and then jumping between stacks on a single thread as they're needed. It saves you the hassle of real threads while giving you the benefits. The downside is that stack traces can get a bit weird when using Fibers, but they aren't too bad.

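A rough sketch of that Future style, based on how the node-fibers "fibers/future" module is commonly used; the exact calls are an approximation and worth checking against the library's documentation.

const fs = require('fs');
const Future = require('fibers/future');

// Wrap a callback-style function so it returns a future instead.
const readFile = Future.wrap(fs.readFile);

Future.task(() => {
  // Inside the fiber, .wait() suspends this fiber (not the whole process)
  // until the underlying readFile callback fires - no callback pyramid.
  const data = readFile('config.json', 'utf8').wait();
  console.log('loaded', data.length, 'characters');
}).detach();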

If you don't need to worry about async stuff and are more just interested in doing a lot of processing without blocking, a simple call to process.nextTick(callback) every once in a while is all you need.

Answer by kbjr

Maybe some more information on what tasks you are performing would help. Why would you (as you mentioned in your comment to genericdave's answer) need to create many thousands of them? The usual way of doing this sort of thing in Node is to start up a worker process (using fork or some other method) which always runs and can be communicated with using messages. In other words, don't start up a new worker each time you need to perform whatever task it is you're doing; simply send a message to the already-running worker and get a response when it's done. Honestly, I can't see that starting up many thousands of actual threads would be very efficient either; you are still limited by your CPUs.

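A small sketch of that always-running worker pattern (task-worker.js and doHeavyWork are hypothetical; one job at a time is assumed to keep it short):

// main.js - start one long-lived worker and talk to it with messages
const { fork } = require('child_process');
const worker = fork('./task-worker.js'); // started once, reused for every task

function runJob(payload, done) {
  worker.once('message', done); // the reply comes back as a message
  worker.send(payload);         // no new process is started per task
}

// task-worker.js - receives a task, does the heavy work, answers the parent
process.on('message', (payload) => {
  const result = doHeavyWork(payload); // placeholder for the real computation
  process.send(result);
});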

Now, after saying all of that, I have been doing a lot of work with Hook.io lately, which seems to work very well for this sort of off-loading of tasks into other processes; maybe it can accomplish what you need.