Warning: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/2387724/
Node.js on multi-core machines
Asked by zaharpopov
Node.js looks interesting, but I must be missing something - isn't Node.js tuned to run only in a single process and thread?
Then how does it scale for multi-core CPUs and multi-CPU servers? After all, it's great to make a single-threaded server as fast as possible, but for high loads I would want to use several CPUs. And the same goes for making applications faster - it seems that today the way is to use multiple CPUs and parallelize the tasks.
How does Node.js fit into this picture? Is its idea to somehow distribute multiple instances or what?
Answered by Dave Dopson
[This post is up-to-date as of 2012-09-02 (newer than above).]
Node.js absolutely does scale on multi-core machines.
Yes, Node.js is one-thread-per-process. This is a very deliberate design decision and eliminates the need to deal with locking semantics. If you don't agree with this, you probably don't yet realize just how insanely hard it is to debug multi-threaded code. For a deeper explanation of the Node.js process model and why it works this way (and why it will NEVER support multiple threads), read my other post.
So how do I take advantage of my 16-core box?
Two ways:
- For big heavy compute tasks like image encoding, Node.js can fire up child processes or send messages to additional worker processes. In this design, you'd have one thread managing the flow of events and N processes doing heavy compute tasks and chewing up the other 15 CPUs.
- For scaling throughput on a webservice, you should run multiple Node.js servers on one box, one per core, and split request traffic between them. This provides excellent CPU-affinity and will scale throughput nearly linearly with core count.
Scaling throughput on a webservice
Since v0.6.x, Node.js has included the cluster module straight out of the box, which makes it easy to set up multiple node workers that can listen on a single port. Note that this is NOT the same as the older learnboost "cluster" module available through npm.
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork one worker per CPU core.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // Workers share the listening socket on port 8000.
  http.Server(function(req, res) { ... }).listen(8000);
}
Workers will compete to accept new connections, and the least loaded process is most likely to win. It works pretty well and can scale up throughput quite well on a multi-core box.
If you have enough load to care about multiple cores, then you are going to want to do a few more things too:
Run your Node.js service behind a web proxy like Nginx or Apache: something that can do connection throttling (unless you want overload conditions to bring the box down completely), rewrite URLs, serve static content, and proxy other sub-services.
Periodically recycle your worker processes. For a long-running process, even a small memory leak will eventually add up.
Setup log collection / monitoring
PS: There's a discussion between Aaron and Christopher in the comments of another post (as of this writing, it's the top post). A few comments on that:
- A shared socket model is very convenient for allowing multiple processes to listen on a single port and compete to accept new connections. Conceptually, you could think of preforked Apache doing this with the significant caveat that each process will only accept a single connection and then die. The efficiency loss for Apache is in the overhead of forking new processes and has nothing to do with the socket operations.
- For Node.js, having N workers compete on a single socket is an extremely reasonable solution. The alternative is to set up an on-box front-end like Nginx and have that proxy traffic to the individual workers, alternating between workers for assigning new connections. The two solutions have very similar performance characteristics. And since, as I mentioned above, you will likely want to have Nginx (or an alternative) fronting your node service anyways, the choice here is really between:
Shared Ports: nginx (port 80) --> Node_workers x N (sharing port 3000 w/ Cluster)
vs
Individual Ports: nginx (port 80) --> {Node_worker (port 3000), Node_worker (port 3001), Node_worker (port 3002), Node_worker (port 3003) ...}
There are arguably some benefits to the individual ports setup (potential to have less coupling between processes, have more sophisticated load-balancing decisions, etc.), but it is definitely more work to set up and the built-in cluster module is a low-complexity alternative that works for most people.
Answered by Chandra Sekar
One method would be to run multiple instances of node.js on the server and then put a load balancer (preferably a non-blocking one like nginx) in front of them.
Answered by broofa
Ryan Dahl answers this question in the tech talk he gave at Google last summer. To paraphrase, "just run multiple node processes and use something sensible to allow them to communicate, e.g. sendmsg()-style IPC or traditional RPC".
If you want to get your hands dirty right away, check out the spark2 Forever module. It makes spawning multiple node processes trivially easy. It handles setting up port sharing, so they can each accept connections to the same port, and also auto-respawning if you want to make sure a process is restarted if/when it dies.
UPDATE - 10/11/11: Consensus in the node community seems to be that Cluster is now the preferred module for managing multiple node instances per machine. Forever is also worth a look.
Answered by Sergey Zhukov
You can use the cluster module. Check this:
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork workers.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', function(worker, code, signal) {
    console.log('worker ' + worker.process.pid + ' died');
  });
} else {
  // Workers can share any TCP connection.
  // In this case it's an HTTP server.
  http.createServer(function(req, res) {
    res.writeHead(200);
    res.end("hello world\n");
  }).listen(8000);
}
Answered by CyberFonic
Multi-node harnesses all the cores that you may have.
Have a look at http://github.com/kriszyp/multi-node.
For simpler needs, you can start up multiple copies of node on different port numbers and put a load balancer in front of them.
Answered by Toumi
Node.js supports clustering to take full advantage of your CPU. If you are not running it with cluster, then you are probably wasting your hardware's capabilities.
Clustering in Node.js allows you to create separate processes which can share the same server port. For example, if we run one HTTP server on port 3000, it is one server running on a single thread on a single core of the processor.
The code shown below allows you to cluster your application. It is the official example presented by Node.js.
var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork workers.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  Object.keys(cluster.workers).forEach(function(id) {
    console.log("I am running with ID : " + cluster.workers[id].process.pid);
  });

  cluster.on('exit', function(worker, code, signal) {
    console.log('worker ' + worker.process.pid + ' died');
  });
} else {
  // Do further processing.
}
Check this article for the full tutorial.
Answered by Will Stern
As mentioned above, Cluster will scale and load-balance your app across all cores.
Adding something like
cluster.on('exit', function () {
cluster.fork();
});
Will restart any failing workers.
These days, a lot of people also prefer PM2, which handles the clustering for you and also provides some cool monitoring features.
Then, add Nginx or HAProxy in front of several machines running with clustering and you have multiple levels of failover and a much higher load capacity.
Answered by TheDeveloper
Answered by mikeal
Future versions of node will allow you to fork a process and pass messages to it, and Ryan has stated he wants to find some way to also share file handles, so it won't be a straightforward Web Worker implementation.
At this time there is not an easy solution for this, but it's still very early, and node is one of the fastest-moving open source projects I've ever seen, so expect something awesome in the near future.
Answered by christkv
I'm using Node worker to run processes in a simple way from my main process. It seems to be working great while we wait for the official way to come around.

