[4/16/2018 - Editor's update: ExtraHop has now released a 100 Gbps appliance. Read more in our blog post: How ExtraHop Analyzes 1 PB+ Per Day Without Breaking a Sweat]
The story of why I joined ExtraHop goes back over 15 years to when I was part of a team that developed a network analysis tool for a large Fortune 500 company. At the time, although I had an instinct that there was a better approach to analyzing data on the wire, I accepted the consensus explanation for our approach. Little did I know that, many years later, I would find myself back where I left off. This time, however, it would be with a team of people who challenged the consensus and developed a platform that expanded my understanding, and the possibilities, of wire data analytics.
The Past: Great Analytics, Limited Data
During my first foray into network analysis, I was part of a team that developed a "black box" network analysis tool. It collected data for analysis with little customer involvement, packaged it up for transport, and sent it to us via an encrypted channel. By mapping the data collected from the customer's network to our "backend intellectual property," we offered proactive services for compatibility, risk analysis, defect-tracking reporting, and more.
There were three major components of the solution: 1) collect the data to be analyzed, 2) transport it to a backend system, and 3) analyze and report the findings. The second part of the offering was the most trivial; it required simple compression and encryption of the data and secure transport to our backend systems. The third part, analysis and reporting, was our value-added proposition. We mapped massive amounts of intellectual property, available only within the confines of our data center, against a relatively small amount of data available on the customer network.
Three Methods of Data Collection
Collecting data for analysis was, and continues to be, the most challenging part of the offering. At the time, I surmised there were three types of data collection models a monitoring solution could employ:
- Polling – The polling model is simply to ask a device for some information and have it tell you what you wanted to know. Often this could be done through SNMP, but that data was somewhat limited in value. To gain access to the richest data available on request, logging into the device was required. Continued log-in access proved difficult to maintain and this approach necessitated more user involvement than was advertised. Furthermore, this approach was limited in scope because you had to know what you were looking for (response) in order to ask the right questions (request).
- Notification – This model is based on the assumption that devices would proactively send out notices regarding their events, state, and status. Examples of a notification approach are syslog, SNMP traps, and alerts. While useful, the main limitation in this model is that the system only tells you what it thinks you ought to know. If the system is not programmed or cannot be programmed to share relevant data (what you want to know), all it will alert you to is what you don't really care about (or worse, what you already know)!
- Passive Monitoring – This approach is what I thought to be the richest, yet most difficult, source of data to extract 15 years ago, and I still believe it to be the same today. Regardless of how systems are configured, or what notifications they are programmed to announce, they are undoubtedly going to communicate on the wire. Further, regardless of what these systems said they would do (configuration) or what they said they did (log messages), what they actually did occurred on the wire and cannot be disputed. While these L2-L7 communications are incredibly voluminous and sometimes unpredictable in structure, the ExtraHop wire data analytics platform is capable of processing them in real time, thanks to advances in multiprocessing and storage, as well as some wicked-smart programming. Read about the ExtraHop Context and Correlation Engine.
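To make the contrast between the three models concrete, here is a minimal Python sketch. It is a toy simulation, not how ExtraHop or any real monitoring product works: the `Device` class, its counters, and the event names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical device: exposes counters, emits alerts, sends traffic."""
    counters: dict = field(default_factory=lambda: {"cpu": 42, "if_errors": 3})
    alert_on: set = field(default_factory=lambda: {"link_down"})
    wire: list = field(default_factory=list)  # everything the device transmits

    def poll(self, key):
        # Polling: the collector must already know which question to ask.
        return self.counters.get(key)

    def do(self, event, payload):
        # Whatever the device does is transmitted on the wire...
        self.wire.append((event, payload))
        # ...but a notification is emitted only for pre-programmed events.
        return event if event in self.alert_on else None


device = Device()

# Notification: collect only what the device decides to announce.
results = [
    device.do("link_down", "eth0"),
    device.do("slow_query", "orders lookup took 9s"),
    device.do("http_500", "GET /checkout"),
]
notifications = [n for n in results if n is not None]

# Polling: a known counter answers; an unanticipated question returns nothing.
polled = device.poll("cpu")
unknown = device.poll("db_latency")

# Passive monitoring: an observer on the wire sees every transaction.
observed = [event for event, _ in device.wire]

print("notifications:", notifications)
print("polled cpu:", polled, "| polled db_latency:", unknown)
print("observed on the wire:", observed)
```

In this toy setup, polling answers only the question that was anticipated, the notification path reports only the single event the device was programmed to announce, and the passive observer sees all three events, including the two that neither of the other models surfaced.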
The Present: Wire Data Analytics Is Now a Reality
I joined ExtraHop for two simple reasons: 1) the ExtraHop vision for building an IT Operations Architecture and 2) the delivery of a high-performance platform that brings wire data analytics to life.
A concise framework to analyze the four sources of IT operations data (machine data, agent data, synthetic data, and wire data) is the underpinning of a good IT Operations Architecture. There is no one source of data that IT can rely on, but at the same time, wire data is the richest of the four sources. By leveraging wire data for security and compliance, dependency mapping, trending, proactive early-warning notifications, and capacity planning, IT can pay down the small bits of "technical debt" that organizations incur daily and avoid a costly "IT bankruptcy."