2019-07-30

How do you test multiprocess code in Node?

A little while ago, I wrote about using Node's child_process module. child_process creates other processes to do work instead of tying up a single process. When you do work in child processes, you get some big benefits: you can avoid hangs when a computation takes too long, and you won't lose an important Node process when a single task crashes.

The parent process needs to communicate with the child to give it work to do. The parent does this by sending arbitrary data with stringly-typed keys, and the child processes need to know how to read that data.

This is just asking for trouble.

If you mistype a key or you expect to receive data in a different format than it's sent, things go bad very quickly. So you have a challenge: How do you make sure this code is correct and continues to work?
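
To make that failure mode concrete, here is a minimal sketch. The `handle` function and the `results` and `errors` arrays are hypothetical stand-ins for a child's message handler, not code from the earlier article. Notice that nothing throws: the bad messages are just reported and dropped.

```javascript
// Hypothetical stand-in for a child's message handler.
const results = [];
const errors = [];

function handle({ command, args }) {
  switch (command) {
    case "square":
      results.push(args[0] ** 2);
      break;
    default:
      // No exception here: the message is reported and then ignored.
      errors.push(`unknown message: ${command}`);
      break;
  }
}

handle({ command: "square", args: [5] }); // handled
handle({ command: "sqaure", args: [5] }); // typo in the value: dropped
handle({ cmd: "square", args: [5] });     // typo in the key: command is undefined

console.log(results); // only the first message produced a result
console.log(errors);  // the other two were quietly rejected
```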

Testing is the answer, but how do you test across process boundaries? That sounds really hard. But with a few dozen lines of code and a mock or two, it can be done. And it can be done in a way that makes your tests look great at the end.

First, find or create seams

If you have code that's inherently hard to test, you have a few options. For quick, simple tests, you can mock out the parts that are hard to test. But if you mock too much, your tests can lie to you. When you mock both sides and the thing you're mocking changes, your tests still pass, even though your real code fails. That's a bad, bad place to be.

For an end-to-end test to be accurate, you need to leave most of the system alone. When there are parts you can't test, you need to find places where you can work around just those parts.

I think of these places where easy-to-test meets hard-to-test as "seams" — a term I learned from one of my favorite books on testing, Working Effectively with Legacy Code.

Some seams you can find easily. Here's an updated parent process from the last article, turned into a full Calculator class to make things a little cleaner:

// Calculator.js
const { fork } = require("child_process");

class Calculator {
  constructor() {
    this.child = fork("./child.js");
    this.child.on("message", this._handleMessage.bind(this));
    this.lastResult = null;
  }

  square(value) {
    this.child.send({ command: "square", args: [value] });
  }

  _handleMessage(message) {
    switch (message.type) {
      case "result":
        this.lastResult = message.result;
        break;
      default:
        console.error(`Calculator received an unknown message: ${message.type}`);
        break;
    }
  }
}

module.exports = Calculator;

child_process.fork() is a great seam. Mock out fork to return an object you control, and you now have full control over how your "child process" acts. Your code won't realize it's not talking to a real process. You've replaced the hard-to-test fork to make it return something easy to test.
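
If you aren't using a mocking library, you can reach the same seam with plain dependency injection. This is a sketch of a different technique than module mocking: it assumes a hypothetical `forkFn` constructor parameter that the Calculator above doesn't have, and a bare EventEmitter standing in for the child.

```javascript
const EventEmitter = require("events");
const childProcess = require("child_process");

// Variant of Calculator with a hypothetical `forkFn` parameter.
// It defaults to the real fork, so production callers don't change.
class Calculator {
  constructor(forkFn = childProcess.fork) {
    this.lastResult = null;
    this.child = forkFn("./child.js");
    this.child.on("message", message => {
      if (message.type === "result") this.lastResult = message.result;
    });
  }

  square(value) {
    this.child.send({ command: "square", args: [value] });
  }
}

// A stand-in child: records what it is sent and emits events on demand.
const fakeChild = new EventEmitter();
fakeChild.sent = [];
fakeChild.send = message => fakeChild.sent.push(message);

const forkedFiles = [];
const calculator = new Calculator(file => {
  forkedFiles.push(file);
  return fakeChild;
});

calculator.square(5);
// Pretend the child answered.
fakeChild.emit("message", { type: "result", result: 25 });

console.log(forkedFiles);           // the file the parent tried to fork
console.log(fakeChild.sent);        // the square command the parent sent
console.log(calculator.lastResult); // 25
```

The trade-off is that injection changes the constructor's signature, while mocking the module leaves the production code untouched.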

Some seams are trickier to work with. In the previous article, the child was a JavaScript file run by fork. That code also calls the built-in Node process API to communicate with the parent. Both would be a hassle to stub out.

In those situations, I'll create a new seam. I'll extract all the interesting stuff into a method that takes the hard stuff as arguments:

Old code:

// child.js
process.on("message", ({ command, args }) => {
  switch (command) {
  case "square":
    process.send(args[0] ** 2);
    break;
  default:
    console.error(`Child received an unknown message: ${command}`);
    break;
  }
});

New code:

// child.js
const handleMessages = require('./handleMessages');
handleMessages(process);

// handleMessages.js
function handleMessages(parentProcess) {
  parentProcess.on("message", ({ command, args }) => {
    switch (command) {
      case "square":
        parentProcess.send({ type: "result", result: args[0] ** 2 });
        break;
      default:
        console.error(`Child received an unknown message: ${command}`);
        break;
    }
  });
}

module.exports = handleMessages;

Now the child has a very clear seam. The process is now an argument that can be any kind of object, as long as it implements send and emits events. You still can't easily test child.js, but it does so little that you can skip it without much risk.
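
As a rough sketch of what that seam buys you, the child logic can now run against any object that emits events and has a send method. Here `stubProcess` and `replies` are hypothetical names, and handleMessages is inlined only so the example runs on its own.

```javascript
const EventEmitter = require("events");

// Same logic as handleMessages.js, inlined to keep the sketch self-contained.
function handleMessages(parentProcess) {
  parentProcess.on("message", ({ command, args }) => {
    switch (command) {
      case "square":
        parentProcess.send({ type: "result", result: args[0] ** 2 });
        break;
      default:
        console.error(`Child received an unknown message: ${command}`);
        break;
    }
  });
}

// Stand-in for Node's `process`: an emitter with a recording send().
const stubProcess = new EventEmitter();
const replies = [];
stubProcess.send = message => replies.push(message);

handleMessages(stubProcess);

// Deliver a message the way the parent would.
stubProcess.emit("message", { command: "square", args: [7] });

console.log(replies); // one reply: type "result", result 49
```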

Next, fake out the complicated code

Once you have your seams — in this case child_process.fork() and parentProcess — you're ready to start testing. But how do you connect your parent and child code together?

Fake objects are my favorite way to test complicated code. A fake object has the same API as the thing it's faking, but it has slightly different and usually much simpler behavior. For example, in your tests, child_process.fork() could return a fake object that has the same send method as a real child process object, so your parent process's code still thinks it's talking to the real thing:

// FakeChildProcess.js
class FakeChildProcess {
  constructor() {
    this.receivedMessages = [];
  }

  send(message, handle) {
    this.receivedMessages.push({ message, handle });
  }
}
module.exports = FakeChildProcess;

// Calculator.test.js
const Calculator = require("./Calculator");
const FakeChildProcess = require("./FakeChildProcess");
const { fork } = require("child_process");
jest.mock("child_process");

describe("Calculator", () => {
  let childProcess;

  beforeEach(() => {
    jest.resetAllMocks();

    fork.mockImplementation(file => {
      childProcess = new FakeChildProcess();
      return childProcess;
    });
  });
});

Then, your fake could provide some insight into your test code, like keeping track of the messages sent to it:

it("sends a square message", () => {
  const calculator = new Calculator();
  calculator.square(5);

  expect(childProcess.receivedMessages).toContainEqual({
    message: {
      command: "square",
      args: [5]
    },
    handle: undefined
  });
});

If you have a fake on both sides — a fake pretending to be a child process and a fake pretending to be the parent process — you can even connect them together. Through those fakes, the parent code can send messages to the child code. Then, the child process can do the work and send a response back to the parent, all within a single test.

But to do all that, you can't just write methods. You have to go a little bit further.

EventEmitter: A faker's best friend

Processes send messages through methods, but they receive messages through events. So you also need to fake those out.

In JavaScript, objects that send events are pretty common. And that means there's an easy way to build that functionality into your own objects and fakes — EventEmitter.

Once a class extends Node's EventEmitter, its instances can emit events and let other code subscribe to them. They gain methods like on, removeListener, and emit. When your fake object can emit the same events as the real thing, anyone using it won't know the difference. And because you control the fake object, you can control when those events are sent.
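
If you haven't used EventEmitter directly, this small sketch shows those three methods together. The `Doorbell` class and the "ring" event are made up for illustration.

```javascript
const EventEmitter = require("events");

// Hypothetical class: extending EventEmitter is all it takes.
class Doorbell extends EventEmitter {}

const bell = new Doorbell();
const heard = [];
const listener = who => heard.push(who);

bell.on("ring", listener);
bell.emit("ring", "first visitor"); // listeners run synchronously, in order

bell.removeListener("ring", listener);
// emit() returns false when nobody is subscribed to the event.
const hadListeners = bell.emit("ring", "second visitor");

console.log(heard);        // only the first visitor was heard
console.log(hadListeners); // false
```

The synchronous delivery is worth keeping in mind: a fake built on EventEmitter delivers messages immediately, while a real child process delivers them asynchronously.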

Here's what your FakeChildProcess might look like after it implements event handling:

// FakeChildProcess.js
const EventEmitter = require("events");

class FakeChildProcess extends EventEmitter {
  constructor() {
    super();
    this.receivedMessages = [];
    this.sentMessages = [];
    this.on("message", message => {
      this.sentMessages.push(message);
    });
  }

  send(message, handle) {
    this.receivedMessages.push({ message, handle });
  }

  // Call to pretend a message was received from the child, in the
  // parent's process
  receive(message) {
    this.emit("message", message);
  }
}

module.exports = FakeChildProcess;
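
Here's a rough sketch of how a test could use both halves of this fake. The class is repeated so the example runs on its own, and the `lastResult` variable and its listener are hypothetical stand-ins for the parent's real message handler.

```javascript
const EventEmitter = require("events");

// Same fake as above, repeated to keep the sketch self-contained.
class FakeChildProcess extends EventEmitter {
  constructor() {
    super();
    this.receivedMessages = [];
    this.sentMessages = [];
    this.on("message", message => {
      this.sentMessages.push(message);
    });
  }

  send(message, handle) {
    this.receivedMessages.push({ message, handle });
  }

  receive(message) {
    this.emit("message", message);
  }
}

const child = new FakeChildProcess();

// Hypothetical stand-in for the parent's message handler.
let lastResult = null;
child.on("message", message => {
  if (message.type === "result") lastResult = message.result;
});

child.send({ command: "square", args: [3] }); // parent -> child
child.receive({ type: "result", result: 9 }); // child -> parent, on demand

console.log(child.receivedMessages); // what the parent sent to the child
console.log(child.sentMessages);     // what the "child" sent back
console.log(lastResult);             // 9
```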

Finally, write the tests

Between seams and fake objects, you have all the parts you need to test parent process and child process communication. Let's think about how you could use these parts to hook everything together.

The parent process calls child_process.fork() to start up a child process. You would mock fork() to return a FakeChildProcess instead:

beforeEach(() => {
  jest.resetAllMocks();

  fork.mockImplementation(file => {
    childProcess = new FakeChildProcess();
    return childProcess;
  });
});

Remember, we changed the child process to do most of its work in a handleMessages function, which takes a parent process as an argument. For the child to send responses back to the parent, you'll need a fake parent process. It looks a lot like the fake child from earlier:

// FakeParentProcess.js
const EventEmitter = require("events");

class FakeParentProcess extends EventEmitter {
  constructor() {
    super();
    this.messages = [];
    this.connectedProcess = null;
  }

  connect(process) {
    this.connectedProcess = process;
  }

  send(message) {
    this.messages.push(message);
    if (this.connectedProcess) {
      this.connectedProcess.receive(message);
    }
  }

  receive(message, socket) {
    this.emit("message", message, socket);
  }
}

module.exports = FakeParentProcess;

When send is called on the FakeChildProcess, it will call receive on the FakeParentProcess, which will trigger its message event and vice versa. That's how these fakes communicate with each other.

When setting up the fake processes, this fake parent process is passed into handleMessages in your tests, so all the child events are hooked up correctly:

parentProcess = new FakeParentProcess();
childProcess = new FakeChildProcess();
parentProcess.connect(childProcess);
handleMessages(parentProcess);

The FakeChildProcess also needs to connect to a FakeParentProcess and relay messages to it:

// FakeChildProcess.js
const EventEmitter = require("events");

class FakeChildProcess extends EventEmitter {
  constructor() {
    super();
    this.receivedMessages = [];
    this.sentMessages = [];
    this.connectedProcess = null;
    this.on("message", message => {
      this.sentMessages.push(message);
    });
  }

  connect(process) {
    this.connectedProcess = process;
  }

  send(message, handle) {
    this.receivedMessages.push({ message, handle });
    if (this.connectedProcess) {
      this.connectedProcess.receive(message, handle);
    }
  }

  receive(message) {
    this.emit("message", message);
  }
}

module.exports = FakeChildProcess;

Finally, hook the FakeParentProcess and FakeChildProcess together, so the parent and child code can communicate in both directions:

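// Calculator.test.js
const Calculator = require("./Calculator");
const FakeChildProcess = require("./FakeChildProcess");
const FakeParentProcess = require("./FakeParentProcess");
const handleMessages = require("./handleMessages");
const { fork } = require("child_process");
jest.mock("child_process");

describe("Calculator", () => {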
  let childProcess;
  let parentProcess;

  beforeEach(() => {
    jest.resetAllMocks();

    fork.mockImplementation(file => {
      parentProcess = new FakeParentProcess();
      childProcess = new FakeChildProcess();
      parentProcess.connect(childProcess);
      childProcess.connect(parentProcess);
      handleMessages(parentProcess);
      return childProcess;
    });
  });
});

And now the regular code in the child process's handleMessages function, the code you barely had to change, can run within your test process. And because it's all just methods and events, you can monitor the messages being sent, simulate process crashes and disconnects, and exercise all kinds of complicated edge cases.

What would an actual test look like?

it("squares a value using the child process", () => {
  const calculator = new Calculator();
  calculator.square(5);
  expect(calculator.lastResult).toBe(25);
});

Easy, right? When square is called on calculator, it calls send. In the real world, that will send a message to the child process using Node's API, and receive messages in the same way. In the test world, this is what happens instead:

  1. calculator calls send on a FakeChildProcess.
  2. That FakeChildProcess calls receive on a FakeParentProcess.
  3. That FakeParentProcess emits a message event, which the child code listens to.
  4. The child does the work and calls send on the FakeParentProcess with the result.
  5. The FakeParentProcess calls receive on the FakeChildProcess.
  6. The FakeChildProcess emits a message event, which the calculator listens to, and sets the result.
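
The six steps above can be traced in plain Node, without Jest. This is a simplified sketch, not the article's exact classes: it collapses both fakes into one hypothetical `FakeEndpoint` class, passes fork in through a hypothetical `forkFn` parameter, and records each hop in a `trace` array.

```javascript
const EventEmitter = require("events");

const trace = []; // records each hop so the order can be inspected

// Hypothetical class that plays both fake roles in this sketch.
class FakeEndpoint extends EventEmitter {
  constructor(name) {
    super();
    this.name = name;
    this.connectedProcess = null;
  }
  connect(other) {
    this.connectedProcess = other;
  }
  send(message) {
    trace.push(`${this.name}.send`);
    if (this.connectedProcess) this.connectedProcess.receive(message);
  }
  receive(message) {
    trace.push(`${this.name}.receive`);
    this.emit("message", message);
  }
}

// Child-side logic, reduced to the square command.
function handleMessages(parentProcess) {
  parentProcess.on("message", ({ command, args }) => {
    if (command === "square") {
      parentProcess.send({ type: "result", result: args[0] ** 2 });
    }
  });
}

// Parent-side logic, with fork injected instead of mocked.
class Calculator {
  constructor(forkFn) {
    this.lastResult = null;
    this.child = forkFn("./child.js");
    this.child.on("message", message => {
      if (message.type === "result") this.lastResult = message.result;
    });
  }
  square(value) {
    this.child.send({ command: "square", args: [value] });
  }
}

const calculator = new Calculator(() => {
  const parentProcess = new FakeEndpoint("FakeParentProcess");
  const childProcess = new FakeEndpoint("FakeChildProcess");
  parentProcess.connect(childProcess);
  childProcess.connect(parentProcess);
  handleMessages(parentProcess);
  return childProcess;
});

calculator.square(5);

console.log(trace);                 // child send, parent receive, parent send, child receive
console.log(calculator.lastResult); // 25
```

The result is available on the very next line because EventEmitter delivers events synchronously. Real child processes answer asynchronously, so code that runs against a real fork can't rely on that timing.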

Most of your code is completely unaware that anything different is happening. That means your tests can look exactly like your code looks, while still testing everything from one end to the other.

When should you go this far?

As nice as this kind of end-to-end testing can be, it's still best to do as much testing as you can in isolation. Test just the parent code, test just the child code. Test around the communication, not through it. These tests will be faster and less complicated to set up.

But when you're sending strings or arbitrary keys as messages, when you have a protocol that could break, or when you just want to sprinkle in some end-to-end tests to check that everything keeps working, these kinds of fakes can be so helpful. Things like process crashes that previously seemed untestable become easy to test. And even the most complicated tests will look totally natural.
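
As one example of a crash test, here is a minimal sketch. It assumes a hypothetical `SupervisedCalculator` that restarts its child after an abnormal exit, and a hypothetical `crash` helper on the fake. Neither appears in the code above. The "exit" event itself is real: Node's child process objects emit it with an exit code and a signal.

```javascript
const EventEmitter = require("events");

class FakeChildProcess extends EventEmitter {
  send() {}

  // Hypothetical helper: pretend the child died with a non-zero exit code.
  crash(code = 1, signal = null) {
    this.emit("exit", code, signal);
  }
}

// Hypothetical parent that forks a replacement when its child crashes.
class SupervisedCalculator {
  constructor(forkFn) {
    this.forkFn = forkFn;
    this.restarts = 0;
    this._start();
  }

  _start() {
    this.child = this.forkFn("./child.js");
    this.child.on("exit", code => {
      if (code !== 0) {
        this.restarts += 1;
        this._start();
      }
    });
  }
}

const children = [];
const calculator = new SupervisedCalculator(() => {
  const child = new FakeChildProcess();
  children.push(child);
  return child;
});

children[0].crash();

console.log(children.length);                  // 2: a replacement was forked
console.log(calculator.restarts);              // 1
console.log(calculator.child === children[1]); // true
```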

Justin Weiss

About Justin Weiss

Justin is a long-time Ruby on Rails developer, software writer, and open-source contributor. He is a Principal Software Engineer at Aha! — the world’s #1 roadmap software. Previously, he led the research and development team at Avvo, where he helped people find the legal help they need.

© 2020 Aha! Labs Inc. All rights reserved