
John McGovern is a Principal Systems Engineer at ExtraHop, and a wizard at building metrics.
So how many times have you heard this one? "This application is something we bought from vendor XYZ, and they seem to not really know how it works at all." Or even better, this one? "So we have this multi-tier web application that an employee wrote a few years ago, and none of us really know how it works."
My name is John McGovern and I am one of the Principal Systems Engineers here at ExtraHop. I, like you, hear statements like these a lot. I was recently working with one of our prospective customers on this very tricky problem. The root customer scenario usually revolves around an Operations team that wants to enable their customers to be successful, but the team just has no idea where to start when things go wrong.
Well, enter ExtraHop and Application Inspection Triggers, or AI Triggers as they are usually known.
In today's blog post I'm going to walk you through a few advanced AI Triggers for cases like this, so that by the end of this post you feel comfortable enough to try them out on your own. In addition, I've included some handy code samples that should help you along the way.
Now, because I feel these problems are always a bit easier if you can break them into smaller chunks, let's do that first.
Breakdown
All right, so we have a multi-tier web application but what does that mean in this example? Good question!
- Web tier (think Apache/IIS)
- Application tier (think SOAP/WebServices)
- External Service calls (think SOAP/WebServices not in-house)
- Database tier (your DB2/Oracle/Informix goodness)
Why
Before we even get into the thick of this, let's regroup and focus on the most important part of everything we do at ExtraHop: "The Why." So why go through all this work? Well, for starters, customers are unhappy with the application's performance, but the Operations team has no real baseline to know whether things are going swimmingly or poorly. Plus, their business relies on this application being used. This example customer relationship management (CRM) application is how they interact with their customers and how they make sure their bills are paid. That's pretty strong motivation to make sure it is supportable by their Operations team.
Tools
ExtraHop is a platform, and as we like to say, "If you need a tool, build a tool." In this case, the ExtraHop AI Trigger API can help you build the tools you need. One big part of the solution is the Session table.
Session table vs. Flow table
Now, for some of you veterans out there, or folks reading ahead, I want to clarify one point here. The Flow table allows you to send information from, say, the request side of a conversation to the response side of the same Flow. Say, for instance, you want to see the headers on a web request and compare them to the processing time for that request. To do that, you would need to store those headers on the Flow and wait for the response in order to know what the processing time is (there's a quick sketch of that pattern right after the list below). The next question you might have is, "Well, John, then what does the Session table do?" Great question, hypothetical person, I'm glad I had you ask that. The Session table allows you to compare data between different Flows and to start creating some more discrete and actionable metrics. In this case we will use the Session table because we have multiple Flows in play:
- Client to Web server
- Web Server to App Server
- App server to External Services
- Web/App to Databases
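To make the Flow table idea concrete, here is a minimal sketch of that headers-versus-processing-time example. It assumes a single trigger assigned to both the HTTP_REQUEST and HTTP_RESPONSE events; the X-Request-Source header and the debug() output are purely illustrative, not part of the application we are working with here.

//Minimal Flow table sketch: stash something on the request, use it on the response.
if (event === "HTTP_REQUEST") {
    //The header name here is just an example.
    Flow.store.requestSource = HTTP.headers['X-Request-Source'];
} else if (event === "HTTP_RESPONSE") {
    //Now that we know the processing time, pair it with what we stashed.
    debug("Source: " + Flow.store.requestSource +
          " processed in " + HTTP.tprocess + " ms");
}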
The Session Table Is Not Magic, or Maybe It Is
If anyone ever walks up to you and tells you that they can stitch together every transaction in your environment, end-to-end, automatically, and without impacting the performance of that application, I have a bridge in Brooklyn I'd like to sell you. Transaction stitching is a hard problem, usually solved by deploying numerous agents on all of your servers, and then deploying a fleet of Professional Services Engineers for however long it takes to build out the stitching for one application. Watch out if you ever change that application, by the way, because you then get to do it all again.
So how is using the Session table different from the pain of that approach? Well, we can pick anything we want off the wire to use as our glue for the transaction. The main caveat is that something must exist on the wire, and almost always there is something there. In our example, we have the username, and that's really it. Not a problem. Now let's get to work!
Part 1: Find The Keys
All right, we know we have a problem we want to solve, and we know basically how the application is laid out, thanks to the awesomeness that is the ExtraHop Autodiscovery Process, or some Dynamic Groups that we configured. Assume, at this point, that we have groups defined for both the web tier and the app tier. Let's start there. Remember why we are doing this: we want to know which transactions are slowing down performance for users. So we need the usernames, the transactions, and, if a GUID exists, we should grab that too. Our keys have made themselves known!
So how do we grab them? Well, they exist on the request side of each tier, so we will use some AI Trigger magic to find the username and a GUID, if one exists. Below is an example Application Inspection Trigger that gets this information for the web tier; the app tier will be similar.
//Replace with an expression that will properly match the username.
var re_user = /UserName=([A-Z].+?)\&/;
//Replace with an expression that will properly match the GUID.
var re_guid = /guid=([^\W].+?)\&/;
var matches;
var payload = HTTP.payload;
if ((matches = re_user.exec(payload)) !== null) {
    Flow.store.webUser = matches[1];
}
if ((matches = re_guid.exec(payload)) !== null) {
    Flow.store.webGUID = matches[1];
}
So, what's happening here, and why is it important? First we build the username expression (re_user), then the GUID expression (re_guid). The magic starts when we grab HTTP.payload: AI Triggers give us direct access to the HTTP payload, so we can look for anything we want in it. We find the username and throw it into the Flow store for later, and then we do the same with the GUID. That's it. The expressions may change from tier to tier, or app to app, but that is all you would shift. Here is an example from the app tier that looks really similar:
//In this case both the user and GUID are in the same line.
var re_user = /userId\ xmlns\=\"\"\>([A-Z].+?)\#(.+?)\</;
//Sometimes it is missing so check for the GUID in another field.
var re_guid = /guid\ xmlns\=\"\"\>(.+?)\</;
var matches;
var payload = HTTP.payload;
if ((matches = re_user.exec(payload)) !== null) {
    Flow.store.appUser = matches[1];
    Flow.store.appGUID = matches[2];
    if ((matches = re_guid.exec(payload)) !== null) {
        Flow.store.appGUID = matches[1];
    }
}
The only thing that changes here is that in this tier we sometimes get the username and the GUID together, and sometimes we get just the GUID. Mainly, we just have to modify the regular expressions and we are off to the races. We have the username and the GUID; now we need the other piece of the puzzle, the transaction. Oddly enough, this is pretty easy to obtain. On the web tier we want the HTTP path, and on the app tier we want the SOAP Action that was executed, which lives in a header. So how does that work in triggers? Well, for the web tier:
var path = HTTP.path;
Yeah, that's it, and it is available on the response side of the Flow, so there is no need to use the store. The last piece we need is the SOAP Action header on the app tier. This one is a little more work, but the API helps a lot here:
if (HTTP.method == "POST") {
    Flow.store.appSOAPAction = HTTP.headers['SOAPAction'];
}
In this case we use the Flow store because this header only exists on the request side, and we want to be able to do timing per SOAP Action. So we do a quick sanity check on the method and grab the header.
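To give you a sense of where this is headed, here is a minimal sketch of the response side for the app tier. It assumes the same trigger also fires on HTTP_RESPONSE and that the server processing time is available as HTTP.tprocess; the debug() line is just a stand-in for the Session table work coming up in Part 3.

//Response side: pair the stashed SOAP Action with the processing time.
if (event === "HTTP_RESPONSE" && Flow.store.appSOAPAction) {
    var soapAction = Flow.store.appSOAPAction;
    var processingTime = HTTP.tprocess;
    debug("SOAP Action " + soapAction + " took " + processingTime + " ms");
}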
Part 1 Recap
Now we have a few things in our Flow store and have identified our keys based upon our initial needs. All of these items exist either natively on the response side or in the Flow store:
- We have the username per tier.
- We have the GUID per tier.
- We have the HTTP path and SOAP Actions for each tier.
Part 2: Organize Your Data
We now have everything we need to keep track of all the transactions by username, so we can figure out how to keep our internal (co-workers) and external (customers) clients happy. Before we go crazy, we should figure out how we want to store this data so we can consume it properly. In this case, we want to track all users by username, and have a record of all their transactions on each tier and how long those took to process.
We feel the most useful way to keep all this data straight is to build a container. Since AI Triggers are based on JavaScript, we have the ability to construct objects pretty easily. Below you will see the construction of our Object for this use case:
var newObject = {
    'web': [],
    'app': []
};
So what is happening here? We created an Object with a pair of arrays, named web and app, that will hold our transactional data. Say we have a transaction we want to put into our web array; how do we go about that? Pretty easily, again thanks to the full JavaScript engine we have at our disposal.
newObject.web.push({
    'path': path,
    'tprocess': processingTime,
    'guid': GUID
});
That's it; we've now added an HTTP path, the processing time for that path, and the GUID, if we have it. The same can be done for the app array, as you see below:
newObject.app.push({
    'soapAction': soapAction,
    'tprocess': processingTime,
    'guid': GUID
});
So now we have a handy dandy way to store all of our transactions, be they HTTP or SOAP, but what about the username, you ask? Well, that's the key…
Part 2 Recap
We now have a place to keep all of our web and app tier data: a JavaScript Object with two arrays defined within it.
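To make that concrete, here is roughly what the container might look like for one user after a few requests; the paths, SOAP Action, timings, and GUID below are made up purely for illustration:

{
    'web': [
        { 'path': '/crm/account/summary', 'tprocess': 212, 'guid': 'a1b2-c3d4' },
        { 'path': '/crm/invoice/list', 'tprocess': 1845, 'guid': 'a1b2-c3d4' }
    ],
    'app': [
        { 'soapAction': 'GetInvoiceList', 'tprocess': 1640, 'guid': 'a1b2-c3d4' }
    ]
}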
Now, how do we get that data together for each user?
Part 3: Enter The Session Table
So, we have talked a bit about the session table and really haven't done anything with it, yet. The reason for that is that we need to have the foundation built before we can really leverage the power the table provides.
Why are we doing this again? We have an application here that is poorly understood, randomly performs poorly, and is critical to the happiness of customers (both internal and external).
We have all our data in a JavaScript Object per tier right now, but we really want them together, keyed off the username. Let's do that then, first on the web tier:
var opts = {
    expire: 30,
    priority: Session.PRIORITY_NORMAL,
    notify: true
};
Session.add(user, JSON.stringify(newObject), opts);
Remember that username we found on both the web and app tiers? That will be our Session table key. Above, we set a few options: how long an entry should live in the table, what its priority is, and whether an event should fire when keys age out of the table. Then we add an entry to the table, using the user as the key, with those options. Now remember our Object? You can't just shove that onto the table and call it a day; you need to convert it to a string that can be stored in the Session table, which is what JSON.stringify() is doing for us here. Conversely, this means that when we pull data off the table we will need to convert it back using JSON.parse(), but more on that in a bit.
So, that's it. We put something on the table. What if we want to add to an already existing entry, you ask? Good catch! The add command is something you would most likely execute after you have already tried looking up a username in the table and failed. Here is the complete logic with the lookup, appending the data, and creating the Object from scratch with an initial table entry.
//Find the user in the Session table.
var lookup = Session.lookup(user);
//If we find it, grab the object and add on the new data to it.
if (lookup) {
    var appObject = JSON.parse(lookup);
    appObject.app.push({
        'soapAction': soapAction,
        'tprocess': processingTime,
        'guid': GUID
    });
    Session.modify(user, JSON.stringify(appObject));
} else {
    //If this is our first hit, build the Object that will hold our stats
    //and add our first entry to the table.
    var newObject = {
        'web': [],
        'app': []
    };
    newObject.app.push({
        'soapAction': soapAction,
        'tprocess': processingTime,
        'guid': GUID
    });
    var opts = {
        expire: 30,
        priority: Session.PRIORITY_NORMAL,
        notify: true
    };
    Session.add(user, JSON.stringify(newObject), opts);
}
This is the code from the app tier, but the web tier is basically identical, except that data is placed into the web array and we store the path instead of the SOAP Action. First we do the lookup; if we hit paydirt, we convert the string back into an Object, add the new data, and put it back on the table with Session.modify(). If this is the first time we have seen the username, we build out our container, set up the Session options, and add a fresh entry. That's it, magic.
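One more piece worth sketching before we wrap up: we set notify: true in our options, which asks the system to tell us when entries age out of the table. A hypothetical consumer of those expirations might look like the sketch below. It assumes the trigger is also assigned to the SESSION_EXPIRE event and that expired entries come back through Session.expiredKeys with name and value properties; the debug() call is just a stand-in for real metric work.

//Hypothetical sketch: consume entries as they age out of the Session table.
//Assumes this trigger also fires on the SESSION_EXPIRE event and that
//expired entries are exposed through Session.expiredKeys.
if (event === "SESSION_EXPIRE") {
    var expired = Session.expiredKeys;
    for (var i = 0; i < expired.length; i++) {
        var userStats = JSON.parse(expired[i].value);
        debug("User " + expired[i].name + " aged out with " +
              userStats.web.length + " web and " +
              userStats.app.length + " app transactions");
    }
}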
Part 3 Recap
We have all the pieces to the puzzle to start figuring out why some users have bad experiences sometimes. Is it some transactions on the web or app tiers? Is it an External Web Services request from the App tier? The next step is to turn the raw data into something meaningful.
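As a small preview of that next step (purely a sketch; user is the same Session key we have been using, and debug() again stands in for real metric output), you could pull a user's entry back off the table and find, say, their slowest web transaction:

//Purely illustrative: look up one user's entry and find the slowest web hit.
var entry = Session.lookup(user);
if (entry) {
    var stats = JSON.parse(entry);
    var slowestWeb = null;
    for (var i = 0; i < stats.web.length; i++) {
        if (slowestWeb === null || stats.web[i].tprocess > slowestWeb.tprocess) {
            slowestWeb = stats.web[i];
        }
    }
    if (slowestWeb !== null) {
        debug("Slowest web path for " + user + ": " + slowestWeb.path +
              " (" + slowestWeb.tprocess + " ms)");
    }
}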