Storage teams need visibility across tiers. Just seeing that the storage system is getting hammered is not enough—they need to understand which applications are causing the problem and how. ExtraHop helped the storage team at one organization identify—within minutes—a rogue application that was aggressively reading from their Tier 2 storage, thereby crippling backups.
This organization used NetApp as their primary storage and DataDomain as secondary storage. Typically, a primary storage system copies its data to a secondary system, which acts as a backup target and is configured for writes. During some backups, storage performance would slow to a crawl. The storage team could see constrained I/O and TCP connection stalls, but did not know what was causing the problem. Moreover, some backups performed normally, leaving the storage team guessing as to when and where to investigate.
Most IT organizations monitor their applications and infrastructure using specialist tools that provide visibility into a specific technology silo. This disjointed approach to monitoring makes it difficult to pinpoint the root cause of interrelated problems.
Teams responsible for storage architecture are especially disadvantaged because tools that show I/O and resource utilization provide little insight into how applications are using (or abusing) storage. On the other hand, agent-based APM tools bury storage latency within database metrics, leaving application teams without a clear understanding of how their applications use storage resources.
For the storage team trying to troubleshoot their slow backups, there seemed to be few viable options for gaining complete visibility across tiers.
This organization decided to trial the ExtraHop platform, which enabled them to inspect all CIFS, NFS, and iSCSI transactions passing over the network in real time, in addition to web, database, and memcache transactions. Drilling into the time periods where they saw slowdowns on the Tier 2 storage, the storage team saw an extraordinarily high number of reads. The Tier 2 storage was optimized for writes, not reads, so the read load was causing the slowdown. But which systems were responsible?
Because ExtraHop shows read and write metrics per client, the storage team could see that a single application server was aggressively reading from the Tier 2 storage. By correlating information across tiers, the ExtraHop platform was able to identify the true source of the problem: not the storage tier, but a rogue application process.
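To make the per-client idea concrete, here is a minimal sketch of the kind of breakdown described above: storage transactions are grouped by client, and the heaviest reader stands out. It is not ExtraHop's API or data model. The record layout, the client names, and the byte counts are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical storage transactions seen on the wire: (client, operation, bytes).
# Client names and sizes are invented for this example.
transactions = [
    ("backup-primary", "write", 800_000),
    ("app-server-17", "read", 2_500_000),
    ("backup-primary", "write", 750_000),
    ("app-server-17", "read", 3_100_000),
    ("reporting-01", "read", 40_000),
]


def bytes_by_client(records):
    """Sum read and write bytes separately for each client."""
    totals = defaultdict(lambda: {"read": 0, "write": 0})
    for client, operation, size in records:
        totals[client][operation] += size
    return totals


def top_reader(records):
    """Return the client that read the most bytes."""
    totals = bytes_by_client(records)
    return max(totals, key=lambda client: totals[client]["read"])


totals = bytes_by_client(transactions)
heaviest = top_reader(transactions)
print(heaviest, totals[heaviest])
```

An aggregate view would only show that reads were high on a write-optimized tier. Splitting the same numbers by client is what points to one server rather than to the storage system itself.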
Cross-tier visibility helps IT organizations to solve problems faster, thereby avoiding fruitless, prolonged troubleshooting sessions, improving user experience, and freeing IT staff to spend time on other projects that generate revenue for the business. For example, if a single storage network engineer can save ten hours they would normally spend troubleshooting an issue, your company gets back on average $590—not to mention the benefit of better performance.
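The arithmetic behind that example can be sketched in a few lines. The figures above imply an average cost of about $59 per engineer hour ($590 over ten hours). That rate is only what the numbers imply, since the basis for the average is not stated here, and the function name and the 25-hour scenario are hypothetical.

```python
def troubleshooting_savings(hours_saved, hourly_cost):
    """Estimated cost recovered when troubleshooting time is avoided."""
    return hours_saved * hourly_cost


# Hourly cost implied by the example figures: $590 for ten hours.
implied_hourly_cost = 590 / 10

# The ten-hour case from the text, plus a hypothetical 25-hour case.
ten_hour_savings = troubleshooting_savings(10, implied_hourly_cost)
longer_savings = troubleshooting_savings(25, implied_hourly_cost)
print(implied_hourly_cost, ten_hour_savings, longer_savings)
```

Substituting your own organization's hourly cost gives a more realistic estimate than the average used in the example.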
There will always be niche tools for specialist teams, but ExtraHop offers a much-needed new approach that provides holistic visibility across the entire application delivery chain. This cross-tier view enables IT teams to easily understand how applications are impacting the database, network, and storage tiers. With shared operational intelligence, IT teams can collaborate to solve interrelated problems faster.