Engineering ExtraHop Addy

Machine learning comes to IT with our anomaly detection service.

Yesterday, ExtraHop announced Addy, our new cloud-based machine learning service. With real-time anomaly detection, Addy helps IT teams take a proactive, data-driven approach to supporting and securing the digital experience. The engineering around Addy is the culmination of years of research, applying data science and machine-learning techniques to network and application behaviors. Simply put, Addy is the next evolution in IT analytics. The future is here.

The Dark Ages of Monitoring and Alerting

Performance and security monitoring have come a long way in the past few years. Finally, we're emerging from the dark days of acronym-soup monitoring (APM, NPM, SIEM, ITOA), and IT teams are increasingly adopting an IT analytics approach. But the challenge of scaling applications and infrastructure remains.

Manually eyeballing NOC-style dashboards is prone to human error and provides limited coverage of the infrastructure. The tried-and-true methods of receiving notifications through threshold alerts or historical trends have their own challenges, too. Those methods assume that you fully understand the changes in your environment and that you are fluent enough in advanced statistical functions to model abnormalities.

The reality is that traffic patterns change, applications and devices in your network come and go, and people don't have time to configure or tune manual thresholds. More importantly, these approaches miss the fact that you cannot fix what you don't see or fully understand.

The Age of Enlightenment

Addy applies machine learning techniques to automatically detect anomalies in your IT environment by analyzing metrics collected on your ExtraHop system. The service requires zero configuration and zero tuning to operationalize. It sifts through thousands of devices and metrics to surface anomalies that were previously difficult or impossible to find manually.

Example of detected anomalies displayed in the ExtraHop Web UI

Addy integrates with the ExtraHop system and provides a powerful way to analyze anomalies through the existing Web UI and familiar workflows. Each anomaly includes detailed context and actionable information that will assist administrators in identifying root causes quickly. With Addy, you can automatically discover previously unknown performance issues or infrastructure quirks to gain deeper insight into your infrastructure.

Under the Hood

The ExtraHop wire data platform provides in-depth visibility into all aspects of modern IT infrastructure through turnkey analysis of more than 40 enterprise network protocols, such as HTTP, SMB, Citrix ICA, SSH, FTP, VoIP, Oracle, MSSQL, NFS, MySQL, and more. For each of these protocols, we measure and store transaction timing, error rates, transaction rates, user load times, and more than 4,000 other metrics. The ExtraHop platform is one of the most advanced platforms for extracting time series metrics on network and application behavior from wire data.

Dashboard of HTTP processing times displayed in the ExtraHop Web UI

Network traffic and application performance are highly cyclical; past behavior is a strong predictor of future behavior. As part of our research, we isolated a set of features in wire data that have the highest probability of correlation with relevant IT operation and security anomalies. Addy extracts metrics for these features to train a model with a custom machine-learning algorithm. The service continuously checks device and network behaviors through metrics collected by the ExtraHop systems and compares that data against the model it built. It then generates an alert when it finds anomalous behaviors that might affect IT operations or security.
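
To make the idea of "past behavior predicts future behavior" concrete, here is a minimal Python sketch of a seasonal baseline. It is an illustration only and not Addy's algorithm, which is not described in this post. The function names, the robust median/MAD scoring, and the parameters `period`, `k`, and `min_scale` are all assumptions chosen for the example. Each new value is compared with historical values from the same position in the cycle (for example, the same hour of the day).

```python
from collections import defaultdict
from statistics import median


def build_baseline(samples, period):
    """Group historical values by their position in the cycle.

    `samples` is an evenly spaced series; `period` is the cycle length
    in samples (24 for hourly data with a daily cycle). Both names are
    hypothetical, chosen for this sketch.
    """
    slots = defaultdict(list)
    for i, value in enumerate(samples):
        slots[i % period].append(value)
    return slots


def is_anomalous(slots, index, value, period, k=5.0, min_scale=1.0):
    """Flag `value` if it is far from what this slot usually looks like.

    Uses the median and the median absolute deviation (MAD) of the
    slot's history, so a few past outliers do not distort the baseline.
    """
    history = slots[index % period]
    if len(history) < 3:
        return False  # not enough history to judge
    center = median(history)
    mad = median([abs(v - center) for v in history])
    scale = max(mad, min_scale)  # floor avoids division by ~0
    return abs(value - center) / scale > k


# Synthetic week of hourly data: busier from 09:00 to 17:00.
history = [
    100 + (50 if 9 <= hour <= 17 else 0) + day % 3
    for day in range(7)
    for hour in range(24)
]
baseline = build_baseline(history, period=24)

# The same value is normal at 10:00 but unusual at 03:00.
normal_at_10 = is_anomalous(baseline, 7 * 24 + 10, 150, period=24)
odd_at_03 = is_anomalous(baseline, 7 * 24 + 3, 150, period=24)
```

In this toy data, a value of 150 is ordinary during business hours and suspicious in the middle of the night, which is exactly the kind of context a single static threshold cannot express.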

We focused on a few key areas during the development of Addy:

Accuracy - Reducing duplicate, irrelevant, and false positive anomalies was an important goal from the beginning. To separate mathematical anomalies from anomalies our customers actually care about, Addy includes an expert system in its anomaly processing pipeline that reduces the number of irrelevant notifications you receive. This expert system was developed from more than 10 years of experience monitoring customer network and application behaviors.

Adaptability - As the saying goes, "the only thing constant is change." Addy is built on the assumption that the algorithmic model must continually adapt to changing environments. Using an online learning method, we update the model in real time as new metrics arrive from the ExtraHop system.

Scalability - We have customers that monitor anywhere from tens of devices to hundreds of thousands of devices with their ExtraHop systems. Addy was built with scale in mind, with the goal of supporting all of these customer environments. By leveraging the elasticity of the cloud, we build separate models for each device and for the applications and protocols running on it.
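
The adaptability and scalability points above can be sketched in a few lines of Python. This is a hedged illustration, not the model Addy uses: it swaps in an exponentially weighted mean and variance as a simple stand-in for an online learning method, and it keeps one small model per (device, metric) pair in a dictionary. The class names, the `alpha` smoothing factor, the warm-up count, and the threshold are all assumptions made for the example.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class OnlineStat:
    """Exponentially weighted mean/variance, updated one sample at a time."""
    alpha: float = 0.1   # assumed smoothing factor; higher adapts faster
    mean: float = 0.0
    var: float = 0.0
    count: int = 0

    def score(self, x):
        """Distance of x from the current mean, in standard deviations."""
        if self.count < 5:
            return 0.0  # still warming up
        std = math.sqrt(self.var)
        return abs(x - self.mean) / max(std, 1e-6)

    def update(self, x):
        """Fold x into the model so it keeps tracking the environment."""
        if self.count == 0:
            self.mean = x
        else:
            delta = x - self.mean
            self.mean += self.alpha * delta
            self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)
        self.count += 1


class Detector:
    """Keeps a separate online model for every (device, metric) pair."""

    def __init__(self, threshold=4.0):
        self.models = defaultdict(OnlineStat)
        self.threshold = threshold

    def observe(self, device, metric, value):
        model = self.models[(device, metric)]
        anomalous = model.score(value) > self.threshold
        model.update(value)  # learn from every sample, anomalous or not
        return anomalous


detector = Detector()
for t in range(20):
    detector.observe("db1", "response_ms", 10 + t % 2)
    detector.observe("web1", "response_ms", 100 + t % 2)

# 100 ms is a spike for db1 but business as usual for web1.
db_spike = detector.observe("db1", "response_ms", 100.0)
web_normal = detector.observe("web1", "response_ms", 100.0)
```

Because each device carries its own model, the same reading can be anomalous for one device and normal for another. And because every sample updates the model, a sustained shift stops being flagged once the model has adapted to it.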

ExtraHop Addy is just the first step in what's possible when machine learning is applied to digital interactions. Stay tuned... the network has so much more to offer. And if you're passionate about data and want to work on building the future with us, we're always looking for exceptional candidates to join our top-notch Engineering team!
