That's what we always say about monitoring and data analysis, because that's our wheelhouse, but it's true in every single part of the IT stack. If your architecture doesn't match your goals, or scale well along the same axes as your business and your tech stack, then you're going to hit a wall.
There's a big architectural shift going on in how enterprises structure their applications. Lightweight containers and microservices are hitting the mainstream, and that's going to change the enterprise applications landscape for the better.
Goldman Sachs recently reported that they were going to "shift 90% of the company's computing to containers" and start operating their applications via microservices rather than single monolithic chunks of code on individual VMs. This is a huge deal for the infrastructure landscape, and for network and application performance monitoring and management.
In an article published on Medium, Sudip Chakrabarti of Lightspeed Ventures asserts that Goldman's announcement signals "a radical departure" from how mainstream enterprises have typically delivered their applications.
Chakrabarti then asks a question that is so important it bears repeating:
"The question to ask is whether we have the right infrastructure tools to monitor, manage, and secure these truly distributed applications. If not, what would it take to build those tools?"
Chakrabarti goes on to explain why managing and monitoring applications composed of distributed microservices is more complex than doing the same for monolithic applications, citing factors such as distributed application logic, diverse technology stacks, and the difficulty of testing and debugging distributed code. This additional complexity has been dubbed "the microservices tax." It is the cost you pay to reap the substantial benefits of distributed application architecture.
There are ways to reduce this cost, and it all starts with having a good answer to the question posed above. What would it take to build the right infrastructure tools for managing, monitoring, and securing truly distributed applications?
The Simple Answer
Chakrabarti's answer is that the network is what will enable us to deal with the proliferation of the microservices architecture. Since all the disparate parts of a distributed application still communicate over the network, that data source is the foundation of effective monitoring and management. Chakrabarti writes:
"Well, fortunately, we do have a knight in shining armor: the good, old network — the network that handles the east-west traffic flowing across the different services in a microservices architecture!"
But … (there's always a but) he acknowledges that even though data derived from the network provides the necessary coverage for monitoring, there are still difficulties at scale:
"...inspecting and storing billions of API calls on a daily basis will require significant computing and storage resources. In addition, deep packet inspection could be challenging at line rates; so, sampling, at the expense of full visibility, might be an alternative. Finally, network traffic analysis must be combined with service-level telemetry data (that we already collect today) in order to get a comprehensive and in-depth picture of the distributed application."
This is where the architecture of your monitoring and management solutions really starts to make a difference in how much of the microservices tax you have to pay.
Say It with Me: "Architecture Matters!"
Network data analytics is definitely the key to maintaining visibility and security in the age of microservices, but how you do it matters. Just like there's a shift in how applications are architected, there's a similar shift in how monitoring platforms are architected. In the past, data volumes were low enough that it was OK to dump all your network data into storage and analyze it post-hoc whenever you had a problem. In the age of massive, distributed application architectures, that's not going to cut it.
Gartner recognized this fact in a recent research note, aptly titled "APM Needs to Prepare for the Future." The report discussed the potential complications of using agents to monitor containerized applications (i.e., microservices). Not only does installing an agent in a containerized application go against the entire microservices ethos of using the minimal amount of code to achieve a desired function, but any kind of monitoring that relies on transaction sampling could miss events in containerized applications that spin up, execute their functions, and shut down in a matter of seconds.
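To make the sampling problem concrete, here is a toy simulation with hypothetical numbers (the lifetimes and intervals are illustrative, not drawn from Gartner's report): containers that each live about 2 seconds, and a sampler that polls every 10 seconds. The sampler only ever "sees" a container if one of its poll ticks lands inside that container's short lifetime.

```python
import random

random.seed(42)

SAMPLE_INTERVAL = 10.0    # sampler polls every 10 seconds
CONTAINER_LIFETIME = 2.0  # each container runs for ~2 seconds
NUM_CONTAINERS = 1000

observed = 0
for _ in range(NUM_CONTAINERS):
    start = random.uniform(0, 3600)  # container spins up at a random moment
    end = start + CONTAINER_LIFETIME
    # The sampler observes this container only if a poll tick
    # falls inside the [start, end) window.
    next_tick = (int(start // SAMPLE_INTERVAL) + 1) * SAMPLE_INTERVAL
    if next_tick < end:
        observed += 1

print(f"Sampler saw {observed} of {NUM_CONTAINERS} containers "
      f"({100 * observed / NUM_CONTAINERS:.0f}%)")
```

With these numbers the sampler catches only about a fifth of the container lifetimes (lifetime divided by poll interval), which is the kind of blind spot the Gartner note warns about.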
Stream analytics is the only architectural approach that will be able to handle the volume and complexity of distributed application architectures. If you're using any other method, such as continuous packet capture, sampling, agent instrumentation, or synthetic transactions, to monitor the performance of your applications, then your monitoring will not keep up with the rest of your tech and you'll run into problems you just can't figure out or fix. That's just how it is.
Why Stream Analytics Is the Only Way
Continuous packet capture requires writing data (a lot of it) to disk before you can analyze it. Chakrabarti was exactly right when he said that kind of analysis would require significant storage resources, and even the fastest storage is slow compared to the speeds stream analytics can reach with modern RAM and bus capacities.
If you're not analyzing the stream of data as it goes by, the data in flight, then your monitoring solution is not going to be able to follow you into the modern, distributed, microservice-ified, increasingly fragmented and complex place that IT is heading.
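The core idea can be sketched in a few lines. This is a deliberately minimal illustration, not ExtraHop's implementation: the service names and latency figures are made up, and `packet_stream()` is a stand-in for a live feed off a SPAN or tap. The point is that the analysis keeps only running aggregates per service and never buffers the raw observations to disk.

```python
from collections import defaultdict

def packet_stream():
    """Stand-in for a live feed of (service, latency_ms) observations."""
    samples = [("auth", 12), ("auth", 340), ("cart", 25),
               ("auth", 15), ("cart", 31), ("cart", 900)]
    yield from samples

# Analyze in flight: update running aggregates as each observation
# arrives, instead of storing everything for post-hoc analysis.
stats = defaultdict(lambda: {"count": 0, "total_ms": 0, "max_ms": 0})
for service, latency in packet_stream():
    s = stats[service]
    s["count"] += 1
    s["total_ms"] += latency
    s["max_ms"] = max(s["max_ms"], latency)

for service, s in sorted(stats.items()):
    avg = s["total_ms"] / s["count"]
    print(f"{service}: n={s['count']} avg={avg:.1f}ms max={s['max_ms']}ms")
```

Memory use here grows with the number of services, not the number of packets, which is why the in-flight approach scales where write-everything-to-disk does not.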
Don't you want a monitoring solution that can keep up with what you're doing?
You knew this was coming, right? The reason I'm so excited to answer the question Chakrabarti posed is … well … I have a really good answer.
ExtraHop's stream analytics platform automatically discovers and categorizes everything communicating across your network, so every server, application, and piece of infrastructure that participates in a distributed system can be easily monitored. ExtraHop can do this at up to 40 Gbps, and the platform ingests data passively through a SPAN or network tap, meaning it doesn't contend with your applications for computing resources and isn't subject to the bottleneck of writing data to disk before analyzing it.
ExtraHop was architected from the ground up to solve the very problems that create "The Microservices Tax," and since we've been working on these problems for 8+ years, we're getting pretty good at it.
Read more about how it works in this post from ExtraHop's CEO: How ExtraHop Takes Advantage of Multi-Core Processing
I also recommend checking out Gartner's research note, which you can read for free here: "Use Data- and Analytics-Centric Processes With a Focus on Wire Data to Future-Proof Availability and Performance Management" Vivek Bhalla & Will Cappelli, March 10, 2016
Want to see the ExtraHop platform in action for yourself? Try our online interactive demo.