Justin Weiss · 2019-06-13 · node, javascript

Improve reliability with a Node.js library

A Node.js process runs your JavaScript on a single thread. That isn't a problem if you run short pieces of code and let Node do other work in between. It can even be a huge benefit! One process can handle a lot of "simultaneous" work, and you don't have to worry about thread safety. Most Node libraries take enough breaks that things seem asynchronous, when it's really just a bunch of tiny synchronous tasks happening one after another.

But one thread for one process has its limitations.

First, if you have code that runs without giving up control, that code will block other work from happening. It's not always obvious whether code will run without taking a break, so this can be a real problem.

Second, if your code crashes, you will lose the entire process. That includes anything that may have been running "asynchronously." When everything is running in a single process, that could be a lot of work to lose.
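To make the blocking problem concrete, here is a minimal sketch of a timer being starved by synchronous code. The busyWait helper is hypothetical, made up for this illustration; it just spins the CPU the way a long calculation would.

```javascript
// busyWait is a hypothetical stand-in for any long synchronous calculation.
// It spins until `ms` milliseconds have passed and never yields to the event loop.
function busyWait(ms) {
  const start = Date.now();
  while (Date.now() - start < ms) {
    // Deliberately empty: nothing else in this process can run while we loop.
  }
  return Date.now() - start;
}

const scheduledAt = Date.now();

// Ask for a callback as soon as possible...
setTimeout(() => {
  console.log(`Timer requested for 0ms fired after ${Date.now() - scheduledAt}ms`);
}, 0);

// ...then hog the thread. The timer can't fire until this call returns.
busyWait(200);
```

The timer reports a delay of at least 200ms instead of something close to zero. Every other callback in the process, including incoming requests, would wait the same way.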

It would be better to send long-running or high-risk work off to another, less important process. That process could do the work and let your process know when the work is done. It turns out that there's a simple, JavaScript-like way to do just that. And you don't even need to take on a dependency to do it.

Farm that work out!

As in most other server-side languages, you can start new processes from Node. With Node, you manage these processes with child_process, a simple, event-driven API for communicating between your process and the child you created.

You start a child process with fork:

// parent.js
const { fork } = require("child_process");

const child = fork("./child.js");

// child.js
process.on("message", message => {
  // ...
});

When parent.js runs, it will start a new Node process. That new process will run child.js, and the parent will get a ChildProcess object back. child.js will happily sit around waiting for messages from its parent.

The parent sends a message like this:

// parent.js
const { fork } = require("child_process");

const child = fork("./child.js");

// Handle replies from the child process
child.on("message", message => {
  console.log(`Got message from child: ${message}`);
});

child.send("Hello, child!");

And from the child, you can send messages back to the parent with process.send:

// child.js
process.on("message", (message) => {
  console.log(`Got message from parent: ${message}`)
  process.send("Hello, yourself!")
})

$ node parent.js
Got message from parent: Hello, child!
Got message from child: Hello, yourself!
^C
$
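One thing to keep in mind about these messages: they are serialized on the way across, so the other process gets a copy, not the original object. As a rough sketch, and assuming the default JSON-based serialization, you can approximate what arrives on the other side with a JSON round trip. simulateIpcRoundTrip is a hypothetical helper for illustration, not part of child_process.

```javascript
// Hypothetical helper: approximates what a message looks like after
// crossing the parent/child channel, assuming JSON serialization.
function simulateIpcRoundTrip(message) {
  return JSON.parse(JSON.stringify(message));
}

const sent = {
  command: "calculate",
  args: [1, 2, 3],
  requestedAt: new Date(0),
  onDone: () => console.log("done")
};

const received = simulateIpcRoundTrip(sent);

// Strings, numbers, arrays, and plain objects come through unchanged.
console.log(received.command, received.args);
// Dates arrive as ISO strings rather than Date objects.
console.log(typeof received.requestedAt);
// Functions are dropped entirely, so callbacks have to stay in the sender.
console.log("onDone" in received);
```

This is why the examples here send plain data, such as a command name and an array of arguments, and keep any callbacks on the sending side.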

You can stop the child with kill:

// parent.js
const { fork } = require("child_process");

const child = fork("./child.js");
child.on("message", message => {
  console.log(`Got message from child: ${message}`);
  child.kill();
});

child.send("Hello, child!");

And if the child exits or crashes, the parent can be notified:

child.on("exit", () => {
  console.log("The child went away.");
});

$ node parent.js
Got message from parent: Hello, child!
Got message from child: Hello, yourself!
The child went away.
$
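The exit event also tells you how the child went away. Its listener receives an exit code and a signal name, and one of the two is null depending on whether the child exited by itself or was terminated by a signal. Here's a small sketch; describeExit is a hypothetical helper, and how you label each case is up to you.

```javascript
// Hypothetical helper: turn the (code, signal) pair from an "exit" event
// into a human-readable description.
function describeExit(code, signal) {
  if (signal !== null) return `The child was killed by ${signal}.`;
  if (code === 0) return "The child exited cleanly.";
  return `The child failed with exit code ${code}.`;
}

// Wiring it up, assuming `child` came from fork("./child.js"):
//
//   child.on("exit", (code, signal) => {
//     console.log(describeExit(code, signal));
//   });

console.log(describeExit(0, null));
console.log(describeExit(1, null));
console.log(describeExit(null, "SIGTERM"));
```

child.kill() sends SIGTERM unless you pass a different signal, so a parent that wants to tell "I stopped it" apart from "it crashed" can check the signal first and the code second.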

With just these pieces, your parent process can send requests to a child process and receive responses. You can treat the child like a service. For example, you could perform a long, synchronous calculation:

// parent.js
const { fork } = require("child_process");

const child = fork("./child.js");
child.on("message", message => {
  console.log(`Child produced result: ${message}`);
});

child.send({ command: "calculate", args: [] });

// child.js
process.on("message", ({ command, args }) => {
  switch (command) {
  case "calculate":
    console.log(`Calculating...`);
    const result = ... // do a looooooooong calculation
    process.send(result);
    break;
  default:
    console.error(`Child received an unknown message: ${command}`);
    break;
  }

  process.exit();
});

The parent starts a child process and sends the work to the child. The child does the work synchronously and pings the parent back when it's done:

$ node parent.js
Calculating...
Child produced result: 4
$

With a few lines of code, a naturally synchronous operation becomes an asynchronous one from the perspective of the parent. And it still feels natural and JavaScript-y.
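If the rest of your code is promise-based, one way to make that feel even more natural is to wrap the request and the reply in a Promise. This is only a sketch under a few assumptions: runInChild is a hypothetical name, it expects exactly one reply per message, and it treats the child exiting before it replies as a failure.

```javascript
// Send one message to a child and settle when it replies or goes away.
// `child` is anything with ChildProcess-style once/removeListener/send methods.
function runInChild(child, message) {
  return new Promise((resolve, reject) => {
    const cleanup = () => {
      child.removeListener("message", onMessage);
      child.removeListener("exit", onExit);
    };
    const onMessage = result => {
      cleanup();
      resolve(result);
    };
    const onExit = (code, signal) => {
      cleanup();
      reject(
        new Error(`Child exited before replying (code: ${code}, signal: ${signal})`)
      );
    };

    child.once("message", onMessage);
    child.once("exit", onExit);
    child.send(message);
  });
}

// Usage, assuming the parent.js / child.js pair from above:
//
//   const { fork } = require("child_process");
//   const child = fork("./child.js");
//   runInChild(child, { command: "calculate", args: [] })
//     .then(result => console.log(`Child produced result: ${result}`))
//     .catch(error => console.error(error.message));
```

The caller no longer needs to know there's a process boundary at all. It awaits a result, and a crashed child shows up as an ordinary rejected promise.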

Everyone into the worker pool

If you hand off a lot of small jobs to a child process, you'll have another problem. The amount of time it takes to do the work will be completely dominated by the amount of time it takes to fork the child. It might take 300ms to start a worker in order to do a 50ms calculation.

It would be nice if you could pay that startup cost once. Once a process finishes a job, you can reuse it and send it another, instead of starting a new one.

You can do this with two arrays -- one to keep track of available workers and another to keep track of incoming jobs:

// scheduler.js
const { fork } = require("child_process");

const readyWorkers = [];
const workQueue = [];

// Match up the next queued job with the next free worker, if possible.
function work() {
  console.log(
    `work left: ${workQueue.length}, workers ready: ${readyWorkers.length}`
  );

  if (readyWorkers.length === 0) return;
  if (workQueue.length === 0) return;

  const worker = readyWorkers.shift();
  const job = workQueue.shift();

  const { command, args, callback } = job;

  worker.once("message", response => {
    readyWorkers.push(worker);
    callback(response);

    // A worker became ready, so try to match it up with work
    work();
  });

  worker.send({ command, args });
}

function scheduleJob(command, args, callback) {
  workQueue.push({ command, args, callback });

  // We have new work, so try to find a worker for it
  work();
}

// Start 2 workers
for (let i = 0; i < 2; i++) {
  console.log("Forking a worker...");
  readyWorkers.push(fork("./worker.js"));
}

// Schedule 5 jobs
for (let i = 0; i < 5; i++) {
  scheduleJob("square", [i], result => {
    console.log(`${i}^2 = ${result}`);
  });
}

// worker.js
process.on("message", ({ command, args }) => {
  switch (command) {
  case "square":
    process.send(args[0] ** 2);
    break;
  default:
    console.error(`Child received an unknown message: ${command}`);
    break;
  }
});

$ node scheduler.js
Forking a worker...
Forking a worker...
work left: 1, workers ready: 2
work left: 1, workers ready: 1
work left: 1, workers ready: 0
work left: 2, workers ready: 0
work left: 3, workers ready: 0
1^2 = 1
work left: 3, workers ready: 1
2^2 = 4
work left: 2, workers ready: 1
3^2 = 9
work left: 1, workers ready: 1
4^2 = 16
work left: 0, workers ready: 1
0^2 = 0
work left: 0, workers ready: 2

And now you have the building blocks for an in-process queueing system! There's a little bit more to do -- handling errors, restarting crashed child processes, killing stuck workers, and so on -- but this is a good start.
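As one possible next step, here is a sketch of the same scheduler that notices when a worker dies in the middle of a job. It is not a complete solution. createPool and spawnWorker are hypothetical names, callbacks take a Node-style (error, result) pair instead of the single argument used above, and a worker that dies while idle is not handled.

```javascript
// spawnWorker is any function that returns a ChildProcess-like object,
// for example () => fork("./worker.js").
function createPool(spawnWorker, size) {
  const readyWorkers = [];
  const workQueue = [];

  // Match up the next queued job with the next free worker, if possible.
  function work() {
    if (readyWorkers.length === 0 || workQueue.length === 0) return;

    const worker = readyWorkers.shift();
    const { command, args, callback } = workQueue.shift();

    const onMessage = response => {
      worker.removeListener("exit", onExit);
      readyWorkers.push(worker);
      callback(null, response);
      work();
    };
    const onExit = () => {
      // The worker died mid-job: fail this job and start a replacement.
      worker.removeListener("message", onMessage);
      callback(new Error("Worker exited before replying"));
      addWorker();
    };

    worker.once("message", onMessage);
    worker.once("exit", onExit);
    worker.send({ command, args });
  }

  function addWorker() {
    readyWorkers.push(spawnWorker());
    work();
  }

  for (let i = 0; i < size; i++) addWorker();

  return {
    scheduleJob(command, args, callback) {
      workQueue.push({ command, args, callback });
      work();
    }
  };
}

// Usage, assuming the worker.js from above:
//
//   const { fork } = require("child_process");
//   const pool = createPool(() => fork("./worker.js"), 2);
//   pool.scheduleJob("square", [3], (error, result) => {
//     if (error) return console.error(error.message);
//     console.log(`3^2 = ${result}`);
//   });
```

Passing in spawnWorker rather than calling fork directly is a design choice for this sketch. It lets you exercise the scheduling logic with fake workers before involving real processes.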

Connecting directly to the child

send, the function you use to send messages to a child process, has a second parameter: sendHandle. What is sendHandle for?

Let's say you have a browser, and it makes a WebSocket connection to your Node process by sending an upgrade request:

// parent.js
server.on('upgrade', function(request, socket, head) {
  // ...
});

The upgrade handler is given a socket parameter. You can pass that socket to your child process, which finishes the WebSocket handshake:

// parent.js
server.on("upgrade", function(request, socket, head) {
  child.send(
    {
      type: "connection",
      request: { headers: request.headers, method: request.method, url: request.url },
      head
    },
    socket
  );
});

// child.js
const WebSocket = require("ws");

const wss = new WebSocket.Server({ noServer: true });

process.on("message", ({ request, head }, sendHandle) => {
  wss.handleUpgrade(request, sendHandle, head, ws => {
    // ... add WebSocket listeners...
  });
});

Once that connection is made, any data sent over that socket will go to the child process instead of the parent process. This means you won't have to forward any other messages from the parent to the child process, and the child process can handle the rest of the communication on its own.
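One detail worth checking in a setup like this: head is a Buffer in the parent, but messages are serialized on their way to the child. Assuming the default JSON serialization, it arrives as a plain object of the form { type: "Buffer", data: [...] }, so the child may need to turn it back into a Buffer before handing it to handleUpgrade. A minimal sketch, with reviveHead as a hypothetical helper name:

```javascript
// Child side: rebuild a Buffer from its JSON form.
// If `head` is already a Buffer, it is returned as-is.
function reviveHead(head) {
  return Buffer.isBuffer(head) ? head : Buffer.from(head.data);
}

// Simulate the trip from parent to child without forking anything.
const wire = JSON.parse(JSON.stringify({ head: Buffer.from("abc") }));

// After the round trip, `head` is a plain object, not a Buffer.
console.log(Buffer.isBuffer(wire.head), wire.head.type);

// reviveHead turns it back into a real Buffer with the same bytes.
const head = reviveHead(wire.head);
console.log(Buffer.isBuffer(head), head.toString());
```

In the child's message handler, that would mean passing reviveHead(head) instead of head to wss.handleUpgrade. The socket itself needs no such treatment, because it travels as the handle rather than as part of the serialized message.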

Theoretically, you can also do this with regular HTTP requests, but I haven't found it as useful. You're only handed a socket at the very beginning of the connection, before the request has been parsed. That's too early to route it to the right child based on path or parameters.

What are the alternatives?

If you've already gone full microservice, you'd probably split this kind of code into completely separate apps. But there's something nice about only having a single app to think about, having a JavaScript-ish API to send work and handle results, and not having to worry as much about deployment and inter-app communication.

child_process isn't the only way to pass messages between a cluster of Node processes. The built-in cluster API will create pools of workers that all share a single HTTP server and distribute connections. PM2 is a larger and more advanced library with a lot of extra process management features.

But if you just have a little bit of synchronous or dangerous work, and you want to keep your main Node server up and responsive, try child_process. It's easy to start with and easy to grow into more.