Behind the Scenes: Building a Dynamic Instrumentation Agent for Node.js
Building a dynamic instrumentation agent for Node.js is a complex challenge. At Sqreen, we provide a powerful security tool for development teams using Node.js. You will be able to spot user accounts attacking your app and improve your code by fixing vulnerabilities with the stack traces that we provide.
This protection is based on an agent, which is nothing more than a regular Node.js package that can be installed from the npm registry.
There are many advantages to dynamic instrumentation:
- Extremely fast to set up since there is no source code modification
- At runtime, the program is fully loaded and can be fully observed, including all third party libraries
- Whatever the code looks like, its functions can be hooked once instrumentation is in place. Changing the source code, or running different versions of it (dev, staging, prod), is transparent.
To build such an agent for Node.js, the following topics need to be assessed:
- Code instrumentation: how is it possible to monitor an application’s behavior and to prevent exploitation of security issues?
- Data transmission: how to accurately report what happened in the application to a server without impacting performance?
This article describes how such instrumentation can be built in Node.js.
Instrumenting Node.js code
`class` keyword, but the `Reflect` library is not available.
The permissive module lifecycle makes dynamic instrumentation tricky. If a consumer keeps a reference to a method itself, that reference cannot be updated afterwards. However, if the reference points to the object holding the method, the members of that object can still be replaced.
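The difference between the two kinds of references can be shown with a few lines. This is only an illustrative sketch: `lib`, `greetRef` and `libRef` are hypothetical names standing in for a real module and its consumers.

```javascript
// Hypothetical module exports, standing in for any imported library.
const lib = {
  greet() { return 'original'; }
};

const greetRef = lib.greet; // one consumer keeps a reference to the method itself
const libRef = lib;         // another consumer keeps a reference to the object

// Instrumentation replaces the member on the shared object.
lib.greet = function patchedGreet() { return 'patched'; };

console.log(greetRef());     // the old function: this reference was not updated
console.log(libRef.greet()); // the patched member, found through the object
```

Only the second consumer observes the patch, which is why the agent works on the module objects rather than on individual functions.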
The overall dynamic instrumentation process can be summarized as follows:
- When a module is imported (using `require`), a reference to this module is kept in a private store.
- When the instrumentation instructions are available:
  - The module to be instrumented is retrieved from the private store.
  - The methods that have to be instrumented within the module are overridden with a general purpose wrapper.
  - Instrumentation hooks are placed on the wrapper.
Let’s detail the aspects of this process:
Keeping track of imported modules
Each time a module is imported in Node.js (using the `require` method), the `Module._load` method is called. To keep track of the modules that are imported into a Node.js application, this method can be overridden.
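A minimal sketch of such an override could look as follows. The `importedModules` store is a hypothetical name, and using the internal `Module._resolveFilename` helper to compute the module identity is an assumption of this sketch, not necessarily what the agent does.

```javascript
'use strict';
const Module = require('module');

// Private store: resolved module identity -> exported object.
const importedModules = new Map();

function keepTrackOfImport(request, parent, exported) {
  // Assumption: Module._resolveFilename (an internal Node.js helper) is used
  // to turn the request string and its parent into a unique identity.
  const identity = Module._resolveFilename(request, parent);
  importedModules.set(identity, exported);
}

const load = Module._load;
Module._load = function (request, parent) {
  // Call the original loader exactly as it would have been called.
  const exported = load.apply(this, arguments);
  try {
    keepTrackOfImport(request, parent, exported);
  } catch (err) {
    // Tracking must never break the application's own require call.
  }
  return exported;
};

const os = require('os'); // goes through the overridden loader
console.log(importedModules.get('os') === os);
```

For built-in modules the identity is the module name itself; for files it is the resolved absolute path.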
The `keepTrackOfImport` method places the imported module in a private object. It needs the `request` string (what is passed to `require`) and the `parent` (the module from which the import is done) in order to resolve the unique identity of the imported module.
Overriding Module._load this way has little impact on the application:
- The original method is called as it would have been without the override thanks to `load.apply(this, arguments)`
- Since what happens during a `require` is synchronous, the performance loss is small compared to the original operation.
Building a general purpose wrapper
The hooking logic above keeps a pointer on each imported module, which allows overriding methods inside the module at runtime if needed. The point is to override only the methods whose behavior must be controlled. In modern JavaScript, such a method can be:
- A normal function (declared using the keyword `function`, the `Function` constructor or any other historic way to do so).
- An arrow function
- A class
Historically, function wrapping could be achieved using a simple piece of code:
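A sketch of this historic approach, with hypothetical names (`wrap`, `calculator`, `seen`) chosen for the illustration:

```javascript
// Historic wrapping: replace a method with a function that delegates via apply.
function wrap(holder, name, before) {
  const original = holder[name];
  holder[name] = function () {
    before(arguments);                      // run some code before the original
    return original.apply(this, arguments); // then call it with the same context
  };
}

const calculator = { add(a, b) { return a + b; } };
const seen = [];
wrap(calculator, 'add', (args) => seen.push(Array.from(args)));

console.log(calculator.add(1, 2)); // 3
```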
This method no longer works since classes were introduced in Node.js. Calling a constructor defined with the `class` keyword through `apply` throws an error:
TypeError: Class constructor X cannot be invoked without 'new'
At the same time, an arrow function cannot be called with `new`.
A modern generic function wrapper must be aware of how it has been called to call the wrapped method in the same way. Thankfully, ES2015 provides reflection tools to build such a wrapper.
An example of such a wrapper is available in the Node.js `util` library:
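The following is a minimal sketch of a wrapper built on `new.target` and `Reflect`. It is an illustration written for this article's argument, not a copy of the `util` code, and `wrapCallable` and `Point` are hypothetical names.

```javascript
function wrapCallable(original) {
  function wrapper(...args) {
    if (new.target) {
      // Called with `new`: construct the original, forwarding new.target.
      return Reflect.construct(original, args, new.target);
    }
    // Called as a plain function or method: forward `this` and the arguments.
    return Reflect.apply(original, this, args);
  }
  // Share the prototype so instances built through the wrapper keep their methods.
  if (original.prototype) {
    wrapper.prototype = original.prototype;
  }
  return wrapper;
}

class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  sum() { return this.x + this.y; }
}

const WrappedPoint = wrapCallable(Point);
const increment = wrapCallable((n) => n + 1);

console.log(new WrappedPoint(1, 2).sum()); // 3
console.log(increment(41));                // 42
```

The sketch does not copy static members of the class onto the wrapper; a complete implementation would have to take care of them as well.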
The sad part here is that tools like `new.target` are not available in Node.js 4.x whereas `class` and arrow functions are.
Placing execution hooks
Once a method is overridden, the wrapper can register callback functions (hooks) to be executed before or after the original one. This allows modifying the behavior of a method dynamically at runtime.
Such a wrapper would look like this:
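Here is a hedged sketch of a wrapper with three hook points. The names `wrapWithHooks`, `pre`, `fail` and `post` are assumptions made for this example, and the fail-safe execution of the hooks themselves is deliberately left out to keep it short.

```javascript
function wrapWithHooks(original, hooks) {
  function wrapper(...args) {
    // 1. Hooks executed before the original method.
    hooks.pre.forEach((hook) => hook(args));
    let result;
    try {
      result = new.target
        ? Reflect.construct(original, args, new.target)
        : Reflect.apply(original, this, args);
    } catch (err) {
      // 2. Hooks executed if the original method throws; the error is rethrown.
      hooks.fail.forEach((hook) => hook(args, err));
      throw err;
    }
    // 3. Hooks executed after the original method returned.
    hooks.post.forEach((hook) => hook(args, result));
    return result;
  }
  if (original.prototype) {
    wrapper.prototype = original.prototype;
  }
  return wrapper;
}

const events = [];
const hooks = {
  pre: [() => events.push('pre')],
  fail: [() => events.push('fail')],
  post: [() => events.push('post')]
};
const divide = wrapWithHooks((a, b) => {
  if (b === 0) throw new Error('division by zero');
  return a / b;
}, hooks);
```

Calling `divide(4, 2)` runs the `pre` and `post` hooks around the original function, while `divide(1, 0)` runs `pre` and `fail` before the error reaches the caller.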
Three hooks have been placed in this wrapper:
- One before the execution of the original method.
- One if the original method throws an error.
- One after the execution of the original method.
This wrapper can be made more elaborate so that the result of a hook triggers further actions. For instance, the preHooks could prevent the execution of the original method.
The execution of the hooks must happen in a fail-safe environment, i.e. within a `try-catch` statement. Also, all promises used within a hook must have a `catch` handler. This prevents the hooks from throwing uncaught exceptions that could crash the process.
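One way to get this fail-safe behavior is to run every hook through a small guard. This is a sketch under assumptions: `runHookSafely` and `onError` are hypothetical names, and `onError` is assumed never to throw itself.

```javascript
function runHookSafely(hook, args, onError) {
  try {
    const outcome = hook(...args);
    // If the hook returned a promise (or any thenable), handle its rejection too.
    if (outcome && typeof outcome.then === 'function') {
      outcome.then(undefined, onError);
    }
  } catch (err) {
    // Synchronous failures of the hook never reach the instrumented method.
    onError(err);
  }
}

const errors = [];
const report = (err) => errors.push(err.message);

runHookSafely(() => { throw new Error('sync failure'); }, [], report);
runHookSafely(async () => { throw new Error('async failure'); }, [], report);

console.log(errors); // the synchronous failure is already recorded here
```

The asynchronous failure is recorded slightly later, once the rejected promise has been processed, and it never surfaces as an unhandled rejection.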
The cost of the described methods is not very high in terms of performance:
- The import of a module in Node.js is a synchronous task, so adding a small operation to keep track of imported modules is negligible.
- Since the instrumentation is dynamic, only a few methods are patched: the impact on the application is tiny.
- Hook execution time slows down each call to an instrumented method; this is unavoidable. Therefore the code placed there must be carefully tested and optimized in order to have the smallest impact possible. Some advanced performance tricks should be used here. Some of them have been described in my last article on RisingStack’s blog.
What’s different from other instrumentation libraries?
Instrumentation libraries such as NewRelic or Opbeat practice static patching. It means that the list of modules and methods to instrument is known at the startup of the application.
The methods to instrument are overridden when the module they belong to is loaded.
The main difference with the approach presented in this article is that if the instructions regarding the instrumentation of a module are not available in the current version of the instrumentation library, it cannot be instrumented. That is why those libraries have a pre-defined list of supported modules. With dynamic instrumentation, it is possible to patch modules during runtime even if they did not even exist when the instrumentation library was published.
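To illustrate patching driven by instructions that arrive at runtime, here is a simplified sketch. Everything in it is hypothetical: the `store` is filled by hand instead of at import time, `fake-driver` stands in for a real module, and the plain `apply` wrapper is used only for brevity.

```javascript
// Private store, normally filled when modules are imported.
const store = new Map();
const fakeDriver = { query(sql) { return `ran: ${sql}`; } };
store.set('fake-driver', fakeDriver);

// Instructions are plain data, so they can be received long after startup.
function applyInstructions(instructions, onCall) {
  for (const { moduleName, methodName } of instructions) {
    const target = store.get(moduleName);
    if (!target || typeof target[methodName] !== 'function') {
      continue; // module not loaded or method unknown: nothing to patch
    }
    const original = target[methodName];
    target[methodName] = function (...args) {
      onCall(moduleName, methodName, args);
      return original.apply(this, args);
    };
  }
}

const calls = [];
applyInstructions(
  [{ moduleName: 'fake-driver', methodName: 'query' }],
  (moduleName, methodName, args) => calls.push(`${moduleName}.${methodName}(${args.join(', ')})`)
);

console.log(fakeDriver.query('SELECT 1'));
console.log(calls);
```

Because the instructions only name a module and a method, nothing about `fake-driver` has to be known when the agent itself is written.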
Node.js is a single-threaded platform: it can only do one thing at a time. When building an instrumentation agent, one needs to take that into account when designing the reporting chain.
Spending too much time manipulating data introduces large synchronous chunks of work that degrade the server’s performance.
The use of timer methods such as `setImmediate` can be a good idea: the reporting chain is divided into a set of asynchronous operations until the data is ready to be placed into a reporting queue.
setImmediate allows us to introduce asynchronicity in a reporting chain. Between each operation, the server will handle other tasks.
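A reporting chain split with `setImmediate` might be sketched like this. The `runChain` helper, the three steps and the `queue` array are assumptions made for the example.

```javascript
// Run each step in its own turn of the event loop.
function runChain(steps, input, done) {
  let index = 0;
  let value = input;
  function next() {
    if (index === steps.length) {
      return done(null, value);
    }
    try {
      value = steps[index++](value);
    } catch (err) {
      return done(err);
    }
    setImmediate(next); // yield to the event loop before the next step
  }
  setImmediate(next);
}

const queue = [];
const steps = [
  (event) => ({ type: event.type, path: event.path }), // keep only the needed fields
  (event) => JSON.stringify(event),                    // serialize
  (payload) => { queue.push(payload); return payload; } // place in the reporting queue
];

runChain(steps, { type: 'request', path: '/login', body: 'ignored' }, (err) => {
  if (err) console.error('reporting failed:', err.message);
});
```

Nothing is processed synchronously when `runChain` is called; each step waits for the event loop to come back to it, so pending requests can be served in between.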
It goes without saying that the action of reporting some data to a remote server through an HTTP POST request is an asynchronous operation. However, it is not recommended to directly send a lot of requests when there is data to report. Using a reporting queue can be a good idea here.
The reporting queue is nothing but a set of data that needs to be reported to a remote server. It is a FIFO (First In First Out) queue. The benefit of such an object is that one can decide to report data in batches.
For instance, if 50 metrics reports are waiting in the queue, rather than transmitting each of them in an individual HTTP POST request, a batch could be built with a subset of the queue and sent to the server.
This method reduces the impact of the reporting chain on the performance of the server. Reports happen less often. Therefore, they consume less memory and fewer network resources.
During a period of heavy load for the server, the number of items pushed to the queue can rise very quickly. In order to prevent unbounded memory growth, the queue length can be limited and the excess items can be dropped.
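A bounded FIFO queue with batching can be sketched as follows. The `ReportingQueue` class, its parameters and the choice to drop the newest items when the queue is full are assumptions of this example; the actual HTTP POST is left out.

```javascript
class ReportingQueue {
  constructor(maxLength, batchSize) {
    this.items = [];
    this.maxLength = maxLength;
    this.batchSize = batchSize;
    this.dropped = 0;
  }

  // Returns false when the queue is full and the item had to be dropped.
  push(item) {
    if (this.items.length >= this.maxLength) {
      this.dropped += 1;
      return false;
    }
    this.items.push(item);
    return true;
  }

  // Removes and returns the oldest items, at most batchSize of them.
  takeBatch() {
    return this.items.splice(0, this.batchSize);
  }
}

const reports = new ReportingQueue(3, 2);
[1, 2, 3, 4, 5].forEach((n) => reports.push({ id: n }));

console.log(reports.dropped);     // 2
console.log(reports.takeBatch()); // [ { id: 1 }, { id: 2 } ]
```

Each batch returned by `takeBatch` would then be sent in a single request, for example on a timer, instead of one request per report.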
In this article, we saw how to build an agent to perform dynamic instrumentation in Node.js.
The key parts of such an agent are deeply tied to the nature of the Node.js platform:
- The reporting of data should not impact the performance of the monitored apps. In a single-threaded environment, the execution of synchronous computing tasks should be spread through the event queue.
Feel free to ask your questions about Node.js instrumentation or Node.js Security. I’m always happy to help. Check out other articles I wrote on this blog about Node.js and keep your app safe by using Sqreen!