Monitoring the performance of a Node.js web application

Tl;dr

Building a tool to monitor how an application performs is not very difficult anymore.

Two recent additions to Node.js, the Async Hooks API and the Performance Hooks API, allow anyone to closely monitor the performance of their app with just a little code.

This article explains the key elements to building such a tool, from instrumenting code to reporting collected data in a clear visualisation.

The final project is available on Github and features:

  • A simple performance monitoring agent

  • A test application based on Express and MongoDB

Bar charts showing Node performance

Introduction

When running a web application in production, performance is important. Slow web servers provide degraded user experience and can threaten a whole company’s business.

In order to fully understand how a web application runs in production, load testing is not enough. Even if tools like ab can give measures regarding how fast a server answers under a specific load, they can’t tell you where the bottlenecks are.

In this article, we will build a tool that will monitor how much time is spent in MongoDB when a simple Node.js application answers an HTTP request.

Let’s build a user story

As a developer, I want to be able to know how much time my server spends in external services when answering an HTTP request.

For this article, let’s look at the simple Express/Mongoose application here:

https://github.com/sqreen/funAPM/blob/master/testApp/server.js.

For this application, we want to monitor the time spent in MongoDB for each HTTP request.

Also, in this article, we will only use the async/await syntax.

First solution

The obvious solution is to just add time samples around the database requests and log it. That would probably do the trick, but you would have to change your code everywhere to add a process.hrtime or a new Date() before and after the execution of every method you want to time.

Obviously, this won’t scale, and therefore isn’t a viable solution.

Let’s override some methods

If we don’t want to change the application code, we need to change the code of its dependencies.

If we focus on a smaller version of our application:

We can monitor the time spent in MongoDB by simply overriding the Cat.find method:

In this code:

  • We extract the prototype of the Cat object. That is where the find method is defined.

  • We keep a reference to the original version of find.

  • We replace the find method on the prototype of Cat with our custom method:

  • Using the console.time/console.timeEnd methods we log the execution time of the original method.

  • We call the original method by doing find.apply(this, arguments) (apply is introduced here and arguments here)

When we start the process and curl on http://localhost:9090/cats , the console displays:

server running on port 9090
cat.find: 1.246ms

However, this patch has several issues:

  • We do not know which HTTP request was responsible for this call.

  • Execution times only show in the console, we aren’t storing them so we can manipulate them later.

Performance Hooks API

To save the duration of a call to an external service, we will use the brand new (and experimental) Performance Hooks API. It has been added recently to Node.js by James Snell. This API is compliant with the W3C specification and therefore is the same as the one available in modern browsers.

Let’s write a wrapper that times the execution of a function returning a promise:

Each time a method is called, we create a unique id for each performance measurement. This will ensure no collision happens between two timing operations.

Instead of overriding directly each method, we can directly do:

Async Hooks API

The Async Hooks API is still experimental but should be stabilized by Node.js 10 (expected April 2018). This API allows us to place hooks around asynchronous operations. Irena Shestak wrote a very good article introducing Async Hooks simply.

For our purposes, we will only need this API to keep track of the HTTP request responsible for the code execution. Some packages (like continuation-local-storage or various implementations of Zones) provide similar functionality. However, as these modules are only based on userland code, some asynchronous operations might be ignored by them and the context would be lost (see example here).

Our hook will be pretty simple:

  • When an asynchronous resource is created, if its parent has a context, this context will be propagated to the new resource.

  • When the destroy hook is called, we remove the link between the resource and its context.

Then we will place this in a new Async Hook:

Now we need to create a new context for each HTTP request and to provide a way to access the current context from everywhere. To track HTTP requests, we will override the emit method on the class Http.Server from Node.js core:

Now, for all instances of Http.Server, when the emit method is called with the request event, a new context is created and associated with the current executionAsyncId.

Since our Async Hook will propagate this context to the children resources, at each time, calling AsyncHooks.executionAsyncId() should return us a valid key to the context map.

Let’s write a simple method to wrap this:

Building an Agent

Now, we have all the tools we need to build a proper agent to inject into a Node.js application to monitor it.

Our agent will need to be the first module required by the app in order to place instrumentation hooks (as explained a bit later). It can be started by calling:

require(‘<path_to_agent>’).start();

Before giving you the sources, I will share two of my last secrets:

  • To override a module, we can change the behavior of require by changing a private method in the core. This is a not a good solution, but at the moment I don’t really know any better way to achieve this. (see sources). The new Loader Hooks API only hooks into ES modules.

  • There is a very nice option in node that allows us to load a module before the main module. To take advantage of this option, our agent will need to call its start method if loaded that way. (see sources)

So our whole agent is available on github. Feel free to browse the sources, the central concepts have been explained in this article.

If you run the testApp provided in the repository. A file named apm_logs.json will be created in the directory. Its content looks like:

As you can see, for each request that went through the server, the agent recorded its duration and the duration of MongoDB operations. Unique identifiers have been saved for debug purpose and can be ignored.

Bonus: Viewer

Since our agent is producing output as JSON, we should be able to display the timing data in a more user-friendly way.

Using d3.js and a nice timeline plugin, I generated a webpage to display the measures made by the agent in a more visual way. After the end of the Node.js process, a file named viewer.html is created.

et voila!

Conclusion

In this article, we have seen that building an Application Performance Monitoring tool for modern Node.js applications is not complicated anymore, using two new Node features, the Async Hooks API, and the Performance Hooks API. Of course, these new APIs are great for building all kinds of cool goodies, and it’s worth your time as a Node developer to get to know them in more detail.

Coda

Commercial tools exist and provide many more features than our simple proof of concept including:

  • Event loop monitoring

  • Memory monitoring

  • Historical statistics

  • Garbage collection monitoring

  • Server load monitoring

If you run a Node.js application in production, you might want to look what they have to offer.