Trying to evaluate IT monitoring tools can be quite challenging. By our reckoning, more than a hundred vendors and open-source projects provide visibility into the IT environment. It can be difficult to sort out which product does what, especially when every vendor's messaging sounds the same.
We'd like to propose a simple conceptual framework that will help you understand what you need in order to monitor everything in your IT environment, and where there might be gaps. Fundamentally, there are four primary sources of intelligence and insight about your applications and infrastructure. To get a complete picture, you need some way of capturing each of these sources of IT visibility:
- Machine data (logs, WMI, SNMP, etc.) – This is information gathered by the system itself and helps IT teams identify overburdened machines, plan capacity, and perform forensic analysis of past events. New distributed log-file analysis solutions such as Splunk, Loggly, Log Insight, and Sumo Logic enable IT organizations to address a broader set of tasks with machine data, including answering business-related questions.
- Agent data (code-level instrumentation) – Code diagnostic tools including byte-code instrumentation and call-stack sampling have traditionally been the purview of Development and QA teams, helping to identify hotspots or errors in the software code. These tools are especially popular for custom-developed Java and .NET web applications. New SaaS vendors such as New Relic and AppDynamics have dramatically simplified the deployment of agent-based products.
- Synthetic data (service checks and synthetic transactions) – This data enables IT teams to test common transactions from locations around the globe or to determine if there is a failure in their network infrastructure. Although synthetic transactions are no replacement for real-user data, they can help IT teams discover errors or outages at 2 a.m. when users have not yet logged in. Tools that fit into this category include Nimsoft, Pingdom, Gomez, and Keynote.
- Wire data (L2-L7 communications, including full bi-directional payloads) – The information available off the wire, which has traditionally included HTTP traffic analysis and packet capture, historically has been used for measuring, mapping, and forensics. However, products from vendors such as Riverbed and NetScout barely scratch the surface of what's possible with this data source. For one thing, they do not analyze the full bi-directional payload and so cannot programmatically extract, measure, or visualize anything in the transaction payload. NetFlow and other flow records—although technically machine data generated by network devices—could be considered wire data because they provide information about which systems are talking to other systems, similar to a telephone bill.
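To make the machine-data category above a bit more concrete, here is a minimal sketch of the kind of roll-up a log-analysis tool performs at much larger scale: tallying HTTP status codes from web-server access-log lines. The log lines and the regular expression are illustrative only, not taken from any product mentioned above.

```python
import re
from collections import Counter

# Hypothetical access-log lines in the Apache common log format.
LOG_LINES = [
    '10.0.0.1 - - [12/Mar/2014:10:00:01 -0700] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.2 - - [12/Mar/2014:10:00:03 -0700] "POST /login HTTP/1.1" 500 512',
    '10.0.0.1 - - [12/Mar/2014:10:00:05 -0700] "GET /missing HTTP/1.1" 404 221',
]

# Capture the request method, path, and three-digit status code.
LOG_PATTERN = re.compile(r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3})')

def status_counts(lines):
    """Count responses by status class (2xx, 4xx, 5xx)."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match:
            # Bucket by the first digit of the status code, e.g. "5xx".
            counts[match.group('status')[0] + 'xx'] += 1
    return counts

print(status_counts(LOG_LINES))
```

A distributed log-analysis platform does the same basic extract-and-aggregate work, but across terabytes of machine data and with ad hoc queries rather than a hard-coded pattern.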
Every IT organization needs visibility into these four sources of data. Each data source serves an important purpose, although those purposes are evolving as new technologies open up new vistas in terms of what's possible. In the realm of wire data, ExtraHop unlocks the tremendous potential of this data source by making it available in real time and in a way that makes sense for all IT teams.
Building an IT Operational Intelligence Architecture
By starting with this conceptual framework, you can more easily evaluate your current IT monitoring toolset and even start to think about creating an IT operational intelligence architecture that combines complementary data sources. The ExtraHop and Splunk integration, which combines wire data with machine data, is one case where the whole is greater than the sum of the parts. To see an example of what's possible with that combination, check out John Smith's new wiredata.net.
Gartner Research VP Will Cappelli, who co-authors the Magic Quadrant for Application Performance Monitoring and is leading research in IT Operations Analytics (ITOA), also advocates for an architectural approach to IT monitoring. He outlines five essential sources of IT operations data (including wire data) in his September 2013 report, which is worth a read: "Data Growth Demands a Single, Architected IT Operations Analytics Platform." [Gartner subscription required]
Does this conceptual framework make sense? Let us know what you think.