[This is the second post in a multi-part series, last updated on April 11, 2016. Read the first post: The Big Idea Behind IT Operations Analytics (ITOA): IT Big Data]
In my previous article, I explained that IT Operations Analytics (ITOA) borrows from Big Data principles and that, in order to enable effective insights and data-driven decisions, you must first design a data-driven architecture. This brings us to the question of which data sets to use for ITOA.
Big Data Is Only as Good as the Inputs
The old cliché, "When it comes to information: garbage in, garbage out" is especially true when it comes to Big Data. With any Big Data initiative, including ITOA, the insight you expect to derive will be determined by the type of data sources that are used. The inputs for each data source will determine the degree of accuracy, signal-to-noise ratio, reliability and trustworthiness, and thoroughness of the data set. Data scientists would say that a data set, before it is committed to a Big Data initiative, must first be evaluated based on some if not all of these criteria.
Consider a traditional business intelligence solution for sales and marketing analysis: You would not expect to derive a complete perspective on customer purchasing behavior by only analyzing a financial system's record of customer order activity. Rather, you would want to correlate that data with other data sets from your CRM system, support call system, Net Promoter surveys, and web activity if you want to derive much deeper insight into when, how, and why customers do or don't make purchases. You would also evaluate these sources based on their trustworthiness, accuracy, and thoroughness as well as whether they provide value or create distraction.
The same holds true for IT Big Data—you would not evaluate end-user experience just by analyzing scripted transactions from a remote location. What about server processing and network transfer time? What about the behavior of the router, switch, firewall, authentication, application code, database, and storage system? These also contribute to the overall user experience. No user session or transaction is an island and neither are the elements of an application delivery chain. They are all interdependent and they all play a role in end-user experience.
To move away from the tool-centric approach I wrote about in last week's blog post, we believe an ITOA architecture and practice should be based on a data set taxonomy, which we refer to as the four sources of IT visibility and insight. The sources and their resulting data sets are wire, machine, agent, and synthetic, all of which I describe in detail below. I should mention that the inspiration for this taxonomy and the practice comes from simply observing what ExtraHop customers such as ShareBuilder, Concur, and T-Mobile have done organically in transforming their own operations to data-driven practices. Our customers are much smarter than me so I am indebted to their insight and guidance. It is also derived from some seminal research led by Will Cappelli, Gartner Research VP, and Colin Fletcher, Gartner Research Director in Gartner's Enterprise Management Practice.
Each ITOA source provides a unique and complementary perspective with some data sets being more significant than others depending upon the roles, questions, and requirements of an organization. It's extremely important to understand the differences between data sets when you assess your operational stance and engage with an IT Operations Management vendor. You will be able to better understand the type and degree of visibility a product can and cannot provide. Remember: the type of visibility and insight you can achieve will be dependent upon the source of data.
Wire Data – Observed and Unbiased Transaction Behavior
Wire Data Inputs
Wire data's inputs are derived from the real-time stream analysis of any wire protocol that transacts across a network or networks. This includes all data or payload. All application communication between clients, applications, machines, systems, or sensors makes up the universe of wire data, which is every observed transaction. Because everything transacts on the wire, it is the largest and perhaps most valuable data source within an ITOA practice and is also one of the largest sources for Big Data in general. It is also seen as an essential and foundational source of analytics for the Internet of Things (IoT) because it encompasses all the communication occurring between those interconnected devices and systems. If an organization has the ability to consume and analyze all technical, business, and operational activity on the wire, it can derive tremendous real-time IT and business insight.
Wire data's building block input is the network data packet. However, packets by themselves are not equivalent to wire data. Just as flour does not equal bread, a packet does not equal wire data. In order for packets to be transformed into wire data, they must be reassembled (whether out of order or fragmented) into a per-client session or full transaction stream. They must then be decoded with an understanding of each wire and application protocol's boundaries. This enables time-stamped measurement and data extraction, with the results indexed and stored in real time. The raw data packets are discarded and only the metadata or result data set is kept (unless you set a policy to record some of the packet stream). The resulting data set is not only well formatted, or "structured," but also information rich. In terms of signal-to-noise ratio, once the raw packets have been transformed, the wire data set is almost all signal.
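To make the reassemble-then-decode idea concrete, here is a minimal sketch. It orders out-of-order segments by sequence number, drops retransmitted bytes, and decodes the first line of an HTTP-style request into a structured record. The `Segment` type, the field names, and the record layout are assumptions invented for this illustration; they do not describe how any particular wire data product is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Simplified stand-in for a captured TCP segment (hypothetical type)."""
    seq: int          # byte offset of the payload within the stream
    timestamp: float  # capture time in seconds
    payload: bytes


def reassemble(segments):
    """Join segments into one contiguous byte stream.

    Segments are ordered by sequence number, bytes that were already seen
    (retransmissions or overlaps) are skipped, and reassembly stops at the
    first gap because the missing bytes have not been observed.
    """
    ordered = sorted(segments, key=lambda s: s.seq)
    stream = bytearray()
    if not ordered:
        return bytes(stream)
    base = ordered[0].seq
    for seg in ordered:
        offset = seg.seq - base
        if offset > len(stream):
            break  # gap in the stream
        stream.extend(seg.payload[len(stream) - offset:])
    return bytes(stream)


def decode_request(segments):
    """Decode a reassembled stream into a small structured record.

    Assumes the stream starts with a well-formed "METHOD PATH VERSION" line;
    a malformed line raises ValueError in this sketch.
    """
    stream = reassemble(segments)
    request_line = stream.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    method, path, _version = request_line.split(" ", 2)
    times = [s.timestamp for s in segments]
    return {
        "method": method,
        "path": path,
        "bytes": len(stream),
        "transfer_ms": round((max(times) - min(times)) * 1000, 3),
    }
```

Only the small record returned by `decode_request` would be kept; the raw payloads can be dropped once it has been extracted, which is the sense in which the result is "almost all signal."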
Wire Data Characteristics
The list below is not exhaustive but meant to provide some key ideas on characteristics you should expect from a wire data platform.
- An outside-looking-in perspective for IT.
- The unbiased observation of all activity and behavior across all tiers in an IT ecosystem.
- Always on, observing and reporting all activity.
- Spans on-premises and cloud-based workloads because all workloads will communicate on the wire at some point.
- Largest source of visibility data (e.g. 100 Gbps of continuous analysis amounts to more than 1 PB of analyzed data per day).
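The petabyte figure in the last bullet follows from simple unit conversion. The short sketch below works it through, assuming decimal units (1 PB = 10^15 bytes) and a fully sustained line rate; the function name is made up for the example.

```python
def petabytes_per_day(gbps: float) -> float:
    """Convert a sustained rate in gigabits per second to decimal petabytes per day."""
    bytes_per_second = gbps * 1e9 / 8   # 8 bits per byte
    seconds_per_day = 24 * 60 * 60      # 86,400 seconds
    return bytes_per_second * seconds_per_day / 1e15


# 100 Gbps is 12.5 GB per second, which is 1.08 PB over 86,400 seconds.
print(petabytes_per_day(100))
```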
How a Wire Data Platform Works
- Performs real-time extraction and transformation of raw packets into structured wire data.
- Architecture is based on a high-performing and scalable real-time stream processor.
- Deploys passively, mitigating risk to or disturbance of an environment.
- Auto-discovers, categorizes, and maps all relationships and application communication between applications, systems (machines) and clients.
- Indexes and stores contextual analysis for real-time visualization, alerting, and trending.
- Offers rapid programmability so users can customize, extract, measure, analyze, and visualize nearly any payload information transacting within a live stream.
- Supports the simple construction of solution-oriented "apps" that each comprise a bundle of custom configuration objects such as data extraction scripts, session table storage, visualization definitions, alert thresholds, and trend math.
- Ingests third-party data for real-time correlation and analysis via a high-performance session table.
- Scales to speeds greater than 100 Gbps, equivalent to millions of transactions per second, and decrypts SSL at line rate.
- Enables the streaming of wire data in real time to proprietary and open-source external data stores, enabling cross-data set correlation.
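One way to picture the session table mentioned in the list above is as a key-value store whose entries expire, which extraction logic can consult to attach outside context to a live transaction. The sketch below is a generic illustration under that assumption; the `SessionTable` class, its methods, and the `enrich` helper are hypothetical and are not taken from any vendor's API.

```python
import time


class SessionTable:
    """Tiny in-memory key-value table whose entries expire (hypothetical sketch)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def add(self, key, value, ttl_seconds):
        """Store a value that stays visible for ttl_seconds."""
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def lookup(self, key):
        """Return the stored value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # evict lazily on read
            return None
        return value


def enrich(record, table):
    """Attach third-party context (here, an owner name) to a wire data record."""
    owner = table.lookup(record["client_ip"])
    return {**record, "owner": owner if owner is not None else "unknown"}
```

Passing the clock in as a parameter is a design choice for the example: it keeps expiry testable without sleeping.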
Machine Data Inputs
Machine data includes all event logs, CDR, CMR, SNMP, and WMI information, among others: anything that a machine records about its own activity. Machine data is primarily derived from what a developer has determined in advance was important to log as a key event or metric for the system or application they have built.
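As a small illustration of what "self-reported" looks like in practice, the sketch below parses one log line into a structured event. The line layout (timestamp, host, application, process ID, level, message) is an assumed format chosen for the example, not a standard that every system follows.

```python
import re

# Assumed layout: "<timestamp> <host> <app>[<pid>]: <LEVEL> <message>"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\S+)\s+(?P<host>\S+)\s+(?P<app>[\w.-]+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<level>[A-Z]+)\s+(?P<message>.*)"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    event["pid"] = int(event["pid"])
    return event


event = parse_log_line(
    "2016-04-11T09:15:02Z web01 checkout[4312]: ERROR payment gateway timeout after 30s"
)
print(event["host"], event["level"], event["message"])
```

Note that the parser can only recover what the developer chose to write: if the application never logs the gateway's address, no amount of parsing will reveal it.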
Machine Data Characteristics
The list below is not exhaustive but meant to provide some key ideas on characteristics you should expect from a machine data platform.
- An inside-looking-out perspective.
- A host-based perspective of self-reported events from machines across all tiers.
- Contains pre-programmed event data from elements, systems, operating systems, applications, and other components in an application delivery chain.
- Provides visibility for both on-premises and cloud-based components.
- May add overhead to a system when logging is enabled.
- Second-largest source of IT visibility (e.g. indexing and analyzing 1 TB a day is considered large).
How a Machine Data Platform Works
- Extracts, forwards, stores, and then indexes machine data at run-time for powerful and flexible search.
- Architecture is based on a distributed architecture of lightweight forwarders, indexers, search heads, and a scalable data store.
- Provides native reporting and analysis, including search and visualization, with the flexibility to customize these using a query language.
- Programmability of the platform is focused on the building of applications that sit on top of collected machine data.
- Ingests third-party data, such as wire and agent data, if the third-party platform supports the machine data platform's data format and/or common information model.
- Supports the exportation of machine data to external data stores enabling correlation across data sets.
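The forward, store, index, and search flow described above can be sketched in a few lines. This is a toy, single-process version built on an inverted index; the `Indexer` class and `forward` function are names invented for the illustration, and a real platform distributes these roles across many machines.

```python
from collections import defaultdict


class Indexer:
    """Stores raw events and keeps an inverted index from term to event IDs."""

    def __init__(self):
        self._events = []                # the data store, in arrival order
        self._index = defaultdict(set)   # term -> IDs of events containing it

    def ingest(self, raw_line):
        """Store one event and index each of its whitespace-separated terms."""
        event_id = len(self._events)
        self._events.append(raw_line)
        for token in raw_line.lower().split():
            term = token.strip(".,:;[]()\"'")
            if term:
                self._index[term].add(event_id)
        return event_id

    def search(self, *terms):
        """Return the events that contain every given term, in arrival order."""
        if not terms:
            return []
        id_sets = [self._index.get(term.lower(), set()) for term in terms]
        matching = set.intersection(*id_sets)
        return [self._events[i] for i in sorted(matching)]


def forward(lines, indexer):
    """Play the forwarder role: ship each non-empty line to the indexer."""
    for line in lines:
        line = line.strip()
        if line:
            indexer.ingest(line)


indexer = Indexer()
forward(["web01 ERROR payment timeout", "web02 INFO login ok"], indexer)
print(indexer.search("error"))
```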
Agent Data Inputs
Agent data includes all instrumented and observed behavior by a software agent residing on the host, whether on bare metal, in an application, the O/S, or the hypervisor. This data includes system resource usage, host-based ingress and egress transaction data, as well as code-level stack trace inputs.
Agent Data Characteristics
The list below is not exhaustive but meant to provide some key ideas on characteristics you should expect from an agent data platform.
- Host-based observed and instrumented behavior.
- Measures, collects, and reports pre-determined and customized host-based metrics.
- Often goes beyond machine data metrics in regard to system performance, hypervisor activity, operating systems, and application code-level analysis.
- Offers the flexibility to extract additional data from transactions.
- Spans both on-premises and cloud-based applications as long as the agent is compatible with the hypervisor, application, and O/S version.
- Third-largest source of IT visibility. The amount depends on the number of hosts instrumented with agents.
How an Agent Data Platform Works
- Architecture is based on agents (for application and database servers) that run analysis on the host and send collected data back to a central reporting server.
- Agents are lightweight to deploy on each instrumented host.
- Some agent-based tools have a SaaS model where agents send data back to the vendor's cloud for scalable analysis of that data.
- Programmability of the platform involves configuring the agents to collect specified data.
- Ingests third-party data, like wire data and machine data, if the third-party platform supports the agent platform's data format.
- Exports agent data to external data stores enabling cross-data set correlation.
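A bare-bones version of the agent pattern described above might look like the following. It samples a few host-level values using only the Python standard library and hands each sample to a `send` callable, which stands in for the connection to a central reporting server. The metric names and the `send` interface are assumptions made for this sketch; commercial agents collect far more, including code-level traces.

```python
import json
import os
import shutil
import socket
import time


def collect_sample(path=os.sep):
    """Gather one snapshot of host-level metrics."""
    usage = shutil.disk_usage(path)
    return {
        "host": socket.gethostname(),
        "pid": os.getpid(),
        "timestamp": time.time(),
        "cpu_seconds": time.process_time(),  # CPU time used by this process
        "disk_used_fraction": usage.used / usage.total if usage.total else 0.0,
    }


def run_agent(send, samples=3, interval_seconds=1.0):
    """Collect a fixed number of samples and report each one as JSON.

    `send` is any callable that accepts a string, for example a function
    that posts to a reporting server (hypothetical; not shown here).
    """
    for _ in range(samples):
        send(json.dumps(collect_sample()))
        time.sleep(interval_seconds)


if __name__ == "__main__":
    run_agent(print, samples=2, interval_seconds=0.5)
```

Because the agent runs on the host it measures, every sample it takes consumes a little of the resource it is reporting on, which is why the sampling interval is worth tuning.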
Synthetic Data Inputs
Synthetic data is generated from synthetic transactions and service checks. This data enables IT teams to test common transactions from locations around the globe or within the datacenter.
Synthetic Data Characteristics
The list below is not exhaustive but meant to provide some key ideas on characteristics you should expect from a synthetic data platform.
- Provides an outside-in, observational perspective similar to wire data.
- Tests the availability and responsiveness of the application.
- Able to replicate user experience from around the globe.
- Excellent at identifying hard failures that are observable from outside the application environment.
- Visibility is limited to user experience as defined by prebuilt tests and scripts and will only cover a portion of the application delivery chain.
How a Synthetic Data Platform Works
- Service checks fire on a predetermined schedule and can be anything from simple ICMP pings to fully scripted transactions that run through the application flow.
- To accurately mimic customer geolocation, probes can be fired from around the globe, representing many points of presence.
- Deployment is easy, either as a hosted service or with lightweight on-premises options.
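The scheduled service check described above can be sketched as a probe plus a small runner that records availability and response time. Here the probe is a plain TCP connection attempt, which sits near the "simple" end of the range; the function names, the result fields, and the `example.test` host are assumptions for illustration only.

```python
import socket
import time


def tcp_probe(host, port, timeout_seconds=3.0):
    """Build a probe that succeeds if a TCP connection can be opened."""
    def probe():
        with socket.create_connection((host, port), timeout=timeout_seconds):
            pass
    return probe


def run_check(name, probe):
    """Run one service check and record whether it passed and how long it took."""
    started = time.perf_counter()
    try:
        probe()
        error = None
    except Exception as exc:  # any failure counts as a failed check
        error = f"{type(exc).__name__}: {exc}"
    elapsed_ms = (time.perf_counter() - started) * 1000
    return {"check": name, "ok": error is None, "elapsed_ms": elapsed_ms, "error": error}


def run_schedule(checks, rounds, interval_seconds):
    """Fire every check once per round, pausing between rounds."""
    results = []
    for round_number in range(rounds):
        for name, probe in checks.items():
            results.append(run_check(name, probe))
        if round_number < rounds - 1:
            time.sleep(interval_seconds)
    return results


if __name__ == "__main__":
    # example.test is a placeholder host, so this check is expected to fail.
    checks = {"web-frontend": tcp_probe("example.test", 443)}
    for result in run_schedule(checks, rounds=2, interval_seconds=5):
        print(result)
```

A fully scripted transaction would replace `tcp_probe` with a probe that logs in, adds an item to a cart, and so on. Either way, the result describes only the path the script exercises.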
Stay Tuned
In the next article of this blog series, I will discuss how these different data sets can come together, as well as the purpose, use cases, and expected outcomes of an ITOA practice. The wonderful aspect is that this type of effort doesn't have to take months or years to achieve. It simply takes making ITOA a priority, and I think you'll find the effort will be easy to justify.