Independent analyst and ZDNet blogger Dan Kusnetzky recently published a paper titled "Solving Database Performance Problems with Better Storage Performance." Then, last week, he included an interesting excerpt from the paper on his ZDNet blog, Virtually Speaking. The post, "Database performance problems can be solved by fast storage," discusses various ways to address database performance management and application performance management issues, principally pointing to fast storage as a solution.
While I agree with Kusnetzky that database management systems (DBMSs) are among the most complex pieces of software in a company's data center, and that database performance issues can cause application slowdowns and failures that ultimately cost revenue and customers, I disagree that throwing more hardware into the mix is the answer. Fast storage is indeed a crucial component in supporting the exponential growth of data, but more hardware also adds yet another layer of complexity to the environment. More nodes to manage means more potential points of failure, and as the amount of data continues to increase, this problem will only intensify.
In fact, Network Computing recently published an article, "EMC Sees Big Opportunity in Big Data," showcasing just how enormous the data problem has become. The article discusses the findings of an IDC report on the "big data" phenomenon, which predicts that the amount of data worldwide will grow as much as 44-fold, reaching a massive 35.2 zettabytes by 2020. To put this into perspective, 1.2 zettabytes is equivalent to
"75 billion fully-loaded 16 GB Apple iPads, which would fill the entire area of Wembley Stadium to the brim 41 times, the Mont Blanc Tunnel 84 times, CERN's Large Hadron Collider tunnel 151 times, Beijing National Stadium 15.5 times, or the Taipei 101 Tower 23 times."
… Now multiply by 30. It's difficult to fathom.

Consequently, these vast quantities of data are increasing the burden on the database programs that manage and make sense of it all. The database market—estimated by Gartner at around $21 billion in 2009—is now so large as a result of this data demand that new (and somewhat unexpected) players like Salesforce.com are entering the fray (see "Salesforce.com takes on Oracle in database market").
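As a quick back-of-the-envelope check on the figures quoted above, here is a short Python sketch that reproduces the arithmetic (decimal units are assumed; the 16 GB iPad capacity and the zettabyte totals are taken from the report as quoted):

```python
# Sanity-check the "big data" figures quoted above (decimal/SI units assumed).

ZB = 10**21  # bytes per zettabyte
GB = 10**9   # bytes per gigabyte

digital_universe_2010 = 1.2 * ZB  # estimated size of the digital universe in 2010
ipad_capacity = 16 * GB           # one fully loaded 16 GB iPad

ipads_needed = digital_universe_2010 / ipad_capacity
print(f"16 GB iPads needed to hold 1.2 ZB: {ipads_needed:,.0f}")  # 75,000,000,000

projected_2020 = 35.2 * ZB        # IDC's projection for 2020
print(f"Growth from 1.2 ZB: ~{projected_2020 / digital_universe_2010:.0f}x")  # ~29x, i.e. roughly 30
```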
And because, as database reviewer and journalist Ian Murphy details in his recent article "Will 2011 improve on 2010?", the database must interface with every business process and layer of the network, gaining visibility into the database is critical. The application, network, and storage layers interact constantly with the database to support business-critical transactions, from online sales to regulatory compliance to data backup and privacy.
As Kusnetzky details in his ZDNet article, a failure in any one of these transactions can "cause the organization to lose revenues, lose customers, or fail to meet its goals in other ways. This, by the way, is considered 'bad' by most IT folks I've spoken with." And this is precisely why it's crucial for application performance management (APM) solutions to provide visibility across all tiers—especially the database and storage tiers combined—to ensure that accessing data does not become a performance bottleneck.
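To make the cross-tier point concrete, here is a minimal, hypothetical sketch in Python (the transaction names and timings are invented, and this is not the API or output of any particular APM product): given the time each transaction spends in the application, database, and storage tiers, it flags the transactions where the data tiers dominate end-to-end latency.

```python
# Hypothetical illustration: attribute end-to-end latency to tiers and flag transactions
# where the database + storage tiers dominate. Sample data is invented for this sketch.

from dataclasses import dataclass

@dataclass
class TransactionTiming:
    name: str
    app_ms: float      # time spent in application code
    db_ms: float       # time spent waiting on database queries
    storage_ms: float  # time the database spent waiting on storage (e.g., NFS reads)

    @property
    def total_ms(self) -> float:
        return self.app_ms + self.db_ms + self.storage_ms

def data_tier_bottlenecks(timings, threshold=0.5):
    """Return (name, share) for transactions where DB + storage exceed `threshold` of total time."""
    flagged = []
    for t in timings:
        data_share = (t.db_ms + t.storage_ms) / t.total_ms
        if data_share > threshold:
            flagged.append((t.name, round(data_share, 2)))
    return flagged

sample = [
    TransactionTiming("checkout", app_ms=40, db_ms=120, storage_ms=310),
    TransactionTiming("product_page", app_ms=80, db_ms=30, storage_ms=10),
]
print(data_tier_bottlenecks(sample))  # [('checkout', 0.91)]
```

The attribution itself is trivial; the hard part, and the reason cross-tier visibility matters, is getting trustworthy per-tier timings for the same transaction in the first place.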
The ExtraHop Application Delivery Assurance system enables you to pinpoint database issues in minutes. Beyond troubleshooting, ExtraHop customers are also using the ExtraHop system to tune database performance, provide service assurance, and facilitate capacity planning.
For example, at a recent customer engagement we profiled database performance for a key B2B application. The application relied on an IBM Informix database whose storage resided on an NFS-mounted file system. The ExtraHop system showed that several of the stored procedures supporting the search functionality were running particularly slowly, but the DBAs could not figure out why: all the tables were indexed according to best practices. The ExtraHop system automatically detected that the database servers were also NFS clients and showed high access latency to the remote files backing the tables being searched. Upon closer examination, it turned out that these files all lived on the same physical file server, whose disk was thrashing under load whenever search activity spiked. Splitting the tables across multiple file servers balanced the traffic, dramatically reducing contention and improving remote storage access latency, which in turn solved the database performance problem. Note that the fix required a deep understanding of where the bottleneck was; adding more hardware without knowing the source of the contention would not have resolved the problem.
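For anyone who wants a crude, do-it-yourself version of that storage-side check, the Python sketch below times reads of the files backing each table to see whether one NFS-mounted back end is dramatically slower than the others. The file paths are hypothetical, results will be skewed by the client page cache unless the files are cold, and this is not how the ExtraHop system measures NFS latency (it observes the transactions passively on the wire); it is only meant to illustrate the kind of evidence that pointed to the overloaded file server.

```python
# Rough spot check (hypothetical paths): compare read latency of the files backing each
# table across NFS mounts. Run against cold files, since the client page cache will
# otherwise hide the remote storage latency.

import time

TABLE_FILES = {
    "orders":    "/mnt/nfs_a/informix/orders.dat",
    "customers": "/mnt/nfs_a/informix/customers.dat",
    "catalog":   "/mnt/nfs_b/informix/catalog.dat",
}

def avg_read_latency_ms(path, chunk=1 << 20, max_reads=64):
    """Average wall-clock time per chunk-sized read, in milliseconds."""
    latencies = []
    with open(path, "rb") as f:
        for _ in range(max_reads):
            start = time.monotonic()
            if not f.read(chunk):
                break  # reached end of file
            latencies.append((time.monotonic() - start) * 1000)
    return sum(latencies) / len(latencies) if latencies else float("nan")

for table, path in TABLE_FILES.items():
    print(f"{table:<10} {avg_read_latency_ms(path):8.2f} ms per read")
```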
Data problems like these are only going to get worse. We'll be making a big announcement around expanded database support very soon. Stay tuned!
How is your organization ensuring that the database tier doesn't become a bottleneck, in light of the "big data" phenomenon? What sorts of DB failures have you experienced and how did you solve them? Let us know!