javascript: In node.js, how do I declare a shared variable that can be initialized by the master process and accessed by worker processes?

Question

Asked by Hymany Lee

I want the following

During startup, the master process loads a large table from file and saves it into a shared variable. The table has 9 columns and 12 million rows, 432MB in size.
The worker processes run HTTP server, accepting real-time queries against the large table.

Here is my code, which obviously does not achieve my goal.

var my_shared_var;
var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Load a large table from file and save it into my_shared_var,
  // hoping the worker processes can access to this shared variable,
  // so that the worker processes do not need to reload the table from file.
  // The loading typically takes 15 seconds.
  my_shared_var = load('path_to_my_large_table');

  // Fork worker processes
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // The following line of code actually outputs "undefined".
  // It seems each process has its own copy of my_shared_var.
  console.log(my_shared_var);

  // Then perform query against my_shared_var.
  // The query should be performed by worker processes,
  // otherwise the master process will become bottleneck
  var result = query(my_shared_var);
}

I have tried saving the large table into MongoDB so that each process can easily access the data. But the table is so large that MongoDB takes about 10 seconds to complete my query, even with an index. That is too slow for my real-time application. I have also tried Redis, which holds data in memory, but Redis is a key-value store and my data is a table. I also wrote a C++ program that loads the data into memory, and the query took less than 1 second, so I want to emulate this in node.js.

Answer 1

Accepted answer by Martin Blech

You are looking for shared memory, which node.js just does not support. You should look for alternatives, such as querying a database or using memcached.

Answer 2

Answered by Shivam

To put your question in a few words: you need to share the master's data with the workers. This can be done quite easily using message events:

From Master to worker:

worker.send({ key: 'value' });    // In Master part; the payload must be JSON-serializable

process.on('message', yourCallbackFunc);    // In Worker part; pass the function itself, it receives the message

From Worker to Master:

process.send({ key: 'value' });   // In Worker part; the payload must be JSON-serializable

worker.on('message', yourCallbackFunc);    // In Master part; pass the function itself, it receives the message

I hope this way you can send and receive data bidirectionally. Please mark it as answer if you find it useful so that other users can also find the answer. Thanks
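The message-passing approach above can be sketched as follows. This is only an illustration under stated assumptions: `answerQuery`, `run`, and the tiny `Map` standing in for the large table are hypothetical names, not part of the original answer. Note that messages are serialized copies sent over IPC, so this does not share memory, and every query goes through the master.

```javascript
const cluster = require('cluster');

// Hypothetical helper: builds the reply for one query message.
// `table` is a Map from key to row; a missing key yields row: null.
function answerQuery(table, msg) {
  const row = table.get(msg.key);
  return { id: msg.id, key: msg.key, row: row === undefined ? null : row };
}

// Hypothetical entry point: starts the master and a single worker.
function run() {
  if (cluster.isMaster) {
    // Only the master holds the table (a tiny stand-in here).
    const table = new Map([['a', [1, 2, 3]], ['b', [4, 5, 6]]]);
    const worker = cluster.fork();
    // Master side: receive a query from the worker and send the result back.
    worker.on('message', (msg) => {
      worker.send(answerQuery(table, msg));
    });
  } else {
    // Worker side: register the handler, then send one query to the master.
    process.on('message', (reply) => {
      console.log('worker received', reply);
      process.exit(0);
    });
    process.send({ id: 1, key: 'a' });
  }
}
```

Calling `run()` from the entry script starts the exchange. The `id` field is one way to match replies to requests when several queries are in flight.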

Answer 3

Answered by Vadim Baryshev

In node.js, fork does not work the way it does in C++. It does not copy the current state of the process; it starts a new process. So in this case variables are not shared. Every line of code runs in every process, but only the master process has the cluster.isMaster flag set to true. You need to load your data in every worker process. Be careful if your data is really huge, because every process will have its own copy. I think you should either query parts of the data as you need them, or accept the loading wait if you really need it all in memory.
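A minimal sketch of the per-worker loading described above, assuming a hypothetical `loadTable` function in place of the real file loader. Each forked worker re-runs the script from the top and loads its own copy, so memory use grows with the number of workers.

```javascript
const cluster = require('cluster');

// Hypothetical loader; a two-row array stands in for the large table.
function loadTable() {
  return [[1, 'x'], [2, 'y']];
}

// Hypothetical entry point: forks `numWorkers` workers from the master.
function run(numWorkers) {
  if (cluster.isMaster) {
    // The master only forks; it does not need the table itself.
    for (let i = 0; i < numWorkers; i++) {
      cluster.fork();
    }
  } else {
    // Each worker is a fresh process, so it loads its own private copy.
    const table = loadTable();
    console.log(`worker ${process.pid} loaded ${table.length} rows`);
  }
}
```

Calling `run(require('os').cpus().length)` from the entry script would start one worker per CPU, each paying the load time and memory cost separately.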

Answer 4

Answered by Allen Luce

If read-only access is fine for your application, try out my own shared memory module. It uses mmap under the covers, so data is loaded as it's accessed and not all at once. The memory is shared among all processes on the machine. Using it is super easy:

const Shared = require('mmap-object')

const shared_object = new Shared.Open('table_file')

console.log(shared_object.property)

It gives you a regular object interface to a key-value store of strings or numbers. It's super fast in my applications.

There is also an experimental read-write version of the module available for testing.

Answer 5

Answered by Reza Roshan

You can use Redis.

Redis is an open source, BSD licensed, advanced key-value cache and store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets, sorted sets, bitmaps and hyperloglogs.

redis.io
