In my last post, I explained the basic architecture of our cloud-scale ML as well as the key benefits this type of architecture provides when compared to locally hosted machine learning. Now I'll focus on ExtraHop Reveal(x) specifically, so you can see how our industry-leading network detection & response (NDR) product empowers hybrid security teams through advanced ML technologies.
Modern enterprises are moving at a breakneck pace, and security teams struggle to keep up with all the new vulnerabilities, TTPs, security logs, and event data being generated every day. While Reveal(x) provides industry-leading network visibility (north-south and east-west) into customers' hybrid enterprise environments, we at ExtraHop believe it is just as important to perform automated, high-quality analysis and detection as part of a complete NDR solution. At the end of the day, security logs and event data that are never analyzed or inspected do little to protect the business.
Reveal(x) uses a cloud-scale ML architecture to deliver scalable threat detection. ExtraHop has invested heavily in machine learning and combined our years of experience in threat research, network analytics, and data science to build ML technology into the core of Reveal(x). It incorporates a collection of sophisticated, patented ML components designed to extract insights, detect threats, and gather context in order to deliver automated and accurate analysis for modern security teams.
The ML components are broken down into three function categories: Perception, Detection, and Investigation. Components across function categories work in unison to provide automated analysis and detections to our customers.
Perception: Observing and Inferring Context
The first function category is Perception, which contains a suite of ML components focused on understanding each customer's unique environment. With over a decade of experience in analyzing network data, we at ExtraHop know that each customer's environment is different and each customer has its own IT and security policies and practices. In order to produce the best analytical results for each customer, we built a suite of ML components that automatically infer customer-specific contextual information, such as peer groups, security policies, and device and user roles, based on the observed behaviors of every entity on the network.
These contextual pieces of information are later ingested by other ML components to improve accuracy and reduce noise. Some of the ML components in the Perception category are:
- Device Clustering: Identify groups of devices that exhibit similar behaviors on the network, such as a small cluster of databases, a large peer group of VoIP phones, or a set of developer workstations.
- Network Security Policy Inference: Infer which network segments are expected to be exposed to the public internet and which external services are approved for access.
- Asset/Device Importance Ranking: Analyze how different devices interact with each other on the network and identify devices that are more important to the business, such as file shares containing critical financial data, bastion hosts for administrators, and databases that back customer-facing web apps.
- Behavioral Profile Inference: Infer the behavioral profile of different devices (such as domain controllers, mobile phones, and file servers) based on observed communication patterns.
- Privileged User Identification: Reveal(x) uses patented analytics on a variety of network protocols (such as LDAP, Kerberos, and CIFS) to track user behaviors across the network. This ML component continuously analyzes the authentication and access patterns of different users and identifies privileged users on the network, including IT admins, Domain Admins, and DB admins.
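To make the idea of device clustering more concrete, here is a minimal sketch in Python. It is not how Reveal(x) implements the component. It assumes each device can be summarized as a small vector of traffic shares (the feature choice, the device names, the `cluster_devices` helper, and the 0.9 threshold are all hypothetical) and groups devices with a simple greedy cosine-similarity rule.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def cluster_devices(profiles, threshold=0.9):
    """Greedy grouping: a device joins the first group whose seed it resembles."""
    groups = []  # each entry is (seed_vector, [device names])
    for name, vec in profiles.items():
        for seed, members in groups:
            if cosine(seed, vec) >= threshold:
                members.append(name)
                break
        else:
            # No existing group is similar enough, so this device seeds a new one.
            groups.append((vec, [name]))
    return [members for _, members in groups]

# Hypothetical features: share of each device's traffic that is [SIP, SQL, HTTP].
profiles = {
    "phone-1": [0.90, 0.00, 0.10],
    "phone-2": [0.85, 0.00, 0.15],
    "db-1": [0.00, 0.95, 0.05],
    "db-2": [0.00, 0.90, 0.10],
    "laptop-1": [0.05, 0.05, 0.90],
}

peer_groups = cluster_devices(profiles)
print(peer_groups)
# [['phone-1', 'phone-2'], ['db-1', 'db-2'], ['laptop-1']]
```

A production system would likely use far richer features and a more robust clustering algorithm, but the output has the same shape: peer groups that downstream components can use as context.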
Detection: Observing and Predicting Behavior in Context
The second function category is Detection, which consists of a collection of ML components that build self-adapting predictive models for every single entity (device and user) on the network. We then feed the output of these models, along with other metadata (such as knowledge inferred by the Perception ML components), into a large set of specialized detectors. The models are continuously updated so that they reflect current behavior patterns.
Because modern cyber threats take so many forms and use so many techniques, no one model is general enough to identify all attacks. Instead, Reveal(x) relies on an ensemble of predictive model classes, each covering a specific aspect of an entity's behavior. In some cases, Reveal(x) can create over a hundred models for a single device or user, depending on its activity, importance, and attributes. To detect potential threats, we also developed hundreds of purpose-built detectors (see more in our whitepapers on detection) that analyze observed behavior in real time.
Every ML-based detector is custom-built to detect a specific suspicious behavior or attack technique identified by our threat research team, and it combines their security expertise with sophisticated predictive models. In addition, the detectors are constantly refined based on what we learn from the field team and customers. Here are a few ML components in this category:
- Time-Series Analysis: Predict the expected type and volume of behavior based on historically observed behavior. Because attacks vary so widely, we have developed a suite of time-series analysis algorithms, each optimized for a specific scenario and temporal granularity.
- Behavior Graph Analytics: Network behaviors can be represented as layers of very large graphs where the nodes are the entities on the network, and edges contain relational information or behaviors. By analyzing these large graphs with various proprietary ML algorithms, this component can accurately model behavioral patterns and detect subtle but suspicious changes, such as when an attacker is already inside the network and is interacting with high-importance data servers from a low-privilege account.
- Peer Group Behavior Modeling: Model, from different perspectives, how devices that belong to the same special-purpose peer group behave, and identify when an attacker has compromised a device and caused it to start acting differently from its peers.
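As a rough illustration of the baseline-and-deviation idea behind the time-series and peer group components, the sketch below scores a new observation against a reference set using the median and the median absolute deviation (MAD). This is a generic textbook technique, not the algorithm Reveal(x) uses. The function names, the sample numbers, and the 3.5 threshold are assumptions made for the example.

```python
from statistics import median

def robust_z(value, baseline):
    """Distance of `value` from `baseline`, in MAD-based standard-deviation units."""
    center = median(baseline)
    mad = median([abs(x - center) for x in baseline])
    # 1.4826 rescales MAD to match a standard deviation for normal data.
    scale = 1.4826 * mad
    if scale == 0:
        scale = 1.0  # flat baseline: fall back to raw distance from the median
    return (value - center) / scale

def is_anomalous(value, baseline, threshold=3.5):
    """Flag values that sit far outside the baseline in either direction."""
    return abs(robust_z(value, baseline)) > threshold

# Time-series view: compare a device with its own history
# (hypothetical outbound megabytes per hour).
history = [100, 110, 95, 105, 102, 98, 107]
print(is_anomalous(400, history))  # True: far above the device's normal range
print(is_anomalous(104, history))  # False: within the normal range

# Peer group view: compare a device with its peers for the same hour
# (hypothetical DNS queries per hour across a group of similar devices).
peers = [20, 22, 19, 21, 23]
print(is_anomalous(21, peers))     # False: behaves like its peers
print(is_anomalous(900, peers))    # True: behaves very differently
```

The same scoring function serves both views. Only the choice of baseline changes: the entity's own past for time-series analysis, or the other members of its group for peer modeling.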
Investigation: Automating Analysis for Accelerated Triage & Response
The third function category is Investigation, which is in charge of providing automated analysis for potential threats detected and assisting in the triage, investigation and remediation processes. We spent significant effort building and refining ML components in the Investigation category because we believe detecting potential threats is only half of the battle. Being able to easily triage, investigate and remediate threats is equally important to our customers.
Along with optimized investigation workflows, components of this function category are a key part of ExtraHop's differentiation. Two examples of ML components we have in this category are:
- Autonomous Root Cause Analysis: Reveal(x) has an industry-leading capability to record every single activity and behavior on the network. However, similar to solving a crime in meatspace, going through the evidence and gathering relevant context can be very labor-intensive. This ML component performs autonomous context gathering for every detection by simulating how a human analyst would go through different related pieces of information of a detection. More specifically, by leveraging the predictive models that are built by the Detection ML components, the component is able to collect most of the relevant information without any manual guidance. For example, when a detection is triggered on abnormally large file accesses on a highly sensitive database, this component can automatically identify the corresponding suspicious users, clients, database tables and SQL queries that fall outside of the normal range of behavior for those entities.
- Detector-Specific Root Cause Analysis: For certain detectors where human intervention is the best first response, we drew on our threat research expertise to build auto-generated analytical playbooks. Each playbook gives customers the attack-specific information and concrete, practical next steps at a glance.
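The database example under Autonomous Root Cause Analysis can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the Reveal(x) implementation: it assumes a per-entity history of some metric is available (here, hypothetical rows read per hour by each user) and ranks the entities whose current activity deviates most from their own baseline. The `rank_contributors` helper, the user names, and the threshold are invented for the example.

```python
from statistics import mean, pstdev

def rank_contributors(observed, baselines, min_score=3.0):
    """Return (entity, score) pairs for entities far above their own baseline."""
    scored = []
    for entity, value in observed.items():
        history = baselines.get(entity, [])
        if len(history) < 2:
            continue  # not enough history to judge this entity
        spread = pstdev(history) or 1.0  # avoid dividing by zero on flat history
        score = (value - mean(history)) / spread
        if score >= min_score:
            scored.append((entity, round(score, 1)))
    # Largest deviation first, so an analyst sees the likeliest cause at the top.
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical rows read per hour from a sensitive database, per user.
baselines = {
    "alice": [100, 110, 130, 115],
    "bob": [40, 60, 50, 55],
    "svc-report": [1000, 950, 1020, 990],
}
observed = {"alice": 120, "bob": 5000, "svc-report": 980}

suspects = rank_contributors(observed, baselines)
print([entity for entity, _ in suspects])
# ['bob']
```

Here only `bob` is surfaced, because the other two users are reading roughly what they always read. The same ranking could be repeated for clients, tables, or queries to assemble the context around a single detection.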
Machine learning is not magic, and not all machine learning is created equal. Understanding the inner workings of the machine learning systems you rely on is important, especially for mission-critical tasks like cybersecurity detection and response. When you are considering buying a product that touts machine learning as a central mechanism for its functionality, it is worth thinking critically and digging into the important details, including:
- What data sources do these machine learning systems leverage?
- How are these ML models built, managed, and updated to ensure they keep up with the rapid pace of change in our dynamic environment?
- Is ML the right way to solve this problem? Will ML just create more friction for my business and people, or will it actually augment my staff's ability to succeed in their work?
If you're curious about how these categories work in practice, give our interactive demo a try or reach out for a conversation with an engineer. We're happy to answer questions or walk through specific use cases.