Using the Sqreen Agent without PII

Sqreen automatically tracks certain kinds of user behavior in your web application, to provide context and actionable insights into how attackers are abusing your app. By default, Sqreen tracks unique users by their email address.

However, in tightly regulated industries, personally identifying information (PII) requires special care; user data must be anonymized using unique identifiers independent of the user’s name, email address, Social Security number, or other sensitive information.

The Sqreen agent offers functionality to help you anonymize the data sent to us, so that you aren’t transmitting PII to our servers, while still getting the behavioral analysis that your security team depends on.

This functionality is also useful if you want to track user behavior, but are not using one of the supported frameworks, such as Devise for Ruby on Rails.

This tutorial will walk you through using this functionality in Ruby.

Enable Advanced User Context

The first step is to disable Automatic User Context, the default setting, and enable Advanced User Context.

Navigate to your app’s global settings page from my.sqreen.io.

About halfway down, click on the switch to move User Context from Automatic to Advanced.

Don’t forget to hit the big green Save button at the bottom of the page!

Add user authentication tracking to your code

At the moment, the only user behavior that Sqreen tracks is authentication attempts (as opposed to network and browser data that Sqreen also tracks, like IP addresses and user-agents). Sqreen takes note of each successful and unsuccessful login attempt. Normally this happens automatically, provided Sqreen is familiar with the authentication module you are using.

To perform user authentication tracking without sending PII, you’ll need to call Sqreen.auth_track yourself after each login attempt, whether it was successful or not. Let’s assume you have an object called user that has some fields to identify who is attempting to log in.

require 'sqreen'
...
# Report the outcome of the attempt (true or false), keyed by the user's email.
Sqreen.auth_track(user.loggedIn?, email: user.email)

The first parameter to Sqreen.auth_track is a boolean that indicates whether the authentication attempt was successful. The remaining parameters form a hash whose entries, taken together, uniquely identify the user to Sqreen. In this example, we are relying on the user’s email address, which is exactly what we wanted to avoid.

It’s important to note that this identifying hash is meant to represent a composite key, that is, any and all elements of the hash are taken to collectively and uniquely identify the user. Do not add anything to the hash representing transient or volatile state, such as the timestamp or the IP address used to log in.
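As a rough illustration of the composite-key idea, the sketch below filters volatile fields out of a set of attributes before they are used as the identifying hash. The helper identity_keys, the VOLATILE_KEYS list, and the field names tenant, id, and ip are all hypothetical; none of them are part of the Sqreen API.

```ruby
# Hypothetical list of fields that change from one login to the next
# and therefore must never be part of the identifying hash.
VOLATILE_KEYS = %i[ip timestamp user_agent].freeze

# Keep only the stable attributes; together they act as the composite key.
def identity_keys(attrs)
  attrs.reject { |key, _value| VOLATILE_KEYS.include?(key) }
end

stable = identity_keys(tenant: 'acme', id: 42, ip: '203.0.113.7')
# stable now holds only :tenant and :id; it could be passed as the
# second argument to Sqreen.auth_track on every attempt by this user.
```

Because the same user always yields the same hash, successive attempts can be attributed to one identity no matter where or when they were made.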

You can send anything you like in this hash; let’s suppose that you have a unique user ID associated with the account. That’s a better choice for anonymizing what is sent to Sqreen, since Sqreen does not have your user ID-to-email mapping.

Sqreen.auth_track(user.loggedIn?, id: user.id)

And that’s it. Just be sure to make this call each and every time a user attempts to authenticate, and Sqreen will work its magic from there.
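One way to make sure no attempt goes unreported is to route every login through a single helper. The following is a minimal sketch under stated assumptions: report_login, FakeTracker, and DemoUser are hypothetical names invented for this example, and FakeTracker only stands in for the Sqreen module so that the code runs without the agent installed.

```ruby
# Report one login attempt and hand the outcome back to the caller.
# `tracker` is anything that responds to auth_track; in an app with the
# agent loaded, you would pass the Sqreen module itself.
def report_login(tracker, user, success)
  tracker.auth_track(success, id: user.id)
  success
end

# Stand-in for Sqreen (an assumption for this sketch): it simply records
# the calls it receives.
class FakeTracker
  attr_reader :calls

  def initialize
    @calls = []
  end

  def auth_track(success, **keys)
    @calls << [success, keys]
  end
end

DemoUser = Struct.new(:id)

tracker = FakeTracker.new
report_login(tracker, DemoUser.new(42), true)   # successful attempt
report_login(tracker, DemoUser.new(42), false)  # failed attempt, still reported
```

Calling the helper from both the success and failure branches of your login code keeps the two outcomes from drifting apart over time.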

Advanced Topic: Tracking email domains for good

If the user identification hash contains a field called email (which it does by default in automatic mode), Sqreen can perform some additional analysis to help your security team. At the moment, Sqreen examines the email domain to see whether there is anything suspicious about the mail host, in particular whether the host is a known email anonymizing service. Such services are rarely used by legitimate users, but are often used by malicious attackers.

To use this feature, you will need to determine whether the domain of the email host by itself constitutes PII; in some contexts it certainly might. In many others, however, the email domain is not considered identifying, and it may therefore be fair game for security analysis by Sqreen.

In these cases, you can avail yourself of this analysis while maintaining the anonymity of your users with a clever hack. Instead of sending the actual email, you can replace the local-part of the email address with the user ID (plus a prefix, so that neither you nor Sqreen mistakes it for an actual email address):

id_and_domain = "anon-#{user.id}@#{user.email.split('@').last}"
Sqreen.auth_track(user.loggedIn?, email: id_and_domain)

What you send to Sqreen will remain stripped of PII, while at the same time providing enough information to Sqreen to allow it to track whether this email account originates from a suspicious email host.
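The substitution can also be wrapped in a small helper that copes with malformed addresses. This is only a sketch: the name anonymized_email, the prefix keyword, and the choice to return nil for an address without a domain are assumptions of this example, not Sqreen behavior.

```ruby
# Replace the local-part of an email address with a prefixed user ID,
# keeping only the domain. Returns nil when no domain can be extracted.
def anonymized_email(user_id, email, prefix: 'anon')
  # rpartition splits on the last '@'; the separator is empty if none exists.
  _local, separator, domain = email.to_s.rpartition('@')
  return nil if separator.empty? || domain.empty?

  # Domains are case-insensitive, so normalize to lowercase.
  "#{prefix}-#{user_id}@#{domain.downcase}"
end

anonymized_email(42, 'Jane.Doe@Example.com') # => "anon-42@example.com"
anonymized_email(42, 'not-an-email')         # => nil
```

When the helper returns nil, you could fall back to identifying the user by ID alone, as in the earlier Sqreen.auth_track(user.loggedIn?, id: user.id) call.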