Guest author John Gentry is vice president of marketing and alliances at Virtual Instruments.
The tree in your front yard may be a lot like your data center. It has plenty of stories to tell, and if you cut it down, the rings in the stump may reveal periods of drought, stretches of good weather, an insect infestation or scars from a fire. That full record could cover years of history. But you have to cut the tree down to see it all.
Similarly, your data center’s equipment and software can reveal clues about your company’s approach to IT. But getting that visibility into your IT environment is a lot harder than chopping down a tree (though the frustration could have you reaching for an ax).
Over time, as aging data centers reach toward the cloud, their infrastructure has only grown more complex. Old mainframes from the 1970s or 1980s may still chug along today. Other rings in the “trunk” of your data center could reveal the shift to client/server architecture in the ’90s, distributed computing a decade or so ago, and today’s virtualized, cloud-oriented solutions.
Far from ancient history, all of those technologies may still be alive in your organization, struggling to work together so modern customers can have the always-available, mobile-friendly experiences they expect. Unfortunately, this rising complexity, combined with stagnant budgets and staffing levels, can hamper the transparency necessary for healthy performance.
A Brief History Of Data Center Performance Visibility
The gap between a dynamic IT infrastructure and the ability to effectively manage it goes back as far as the mainframe, which was really the first version of the cloud. A shared system with input/output (IO) latency requirements and varied workloads, the mainframe delivered phenomenal performance management because it was a closed environment. But it was expensive and required deep technical expertise to maintain.
When the industry moved to client/server systems, we saw the first wave of IT democratization and, with it, a wave of performance tools in the form of enterprise systems management (ESM). Another 10 years down the road, network performance management (NPM) became the tool of the day as data centers connected out to customers and partners and had to operate interdependently.
Today, data centers are interconnected and virtualized all the way down the stack, but the business logic often lives deep within layers of systems already deployed. ESM and NPM have been marginalized by myriad point tools that each deliver insight into one layer or another, and the comprehensive visibility gap has only widened with virtualization. The fact that most data centers are heterogeneous only exacerbates the problem, as staff struggle to juggle management tools from multiple vendors.
The quest for faster, better, cheaper (or at least cost-effective) performance management raises plenty of questions. What types of silos are you monitoring? Applications? Servers? Networks? Something else? And what about downtime?
In nearly every industry, the expectation for availability, which once factored in some measure of downtime, is now 24/7. In the financial sector, for example, high-frequency traders can lose big when systems falter for mere milliseconds. Slow-loading e-commerce sites will eventually tank. And health care organizations must avoid outages at all costs to ensure patient safety and consistent functionality throughout facilities. Not many industries can forgive poor performance, regardless of how difficult it is to address.
Nearly every company is a technology company now, because practically all deliver products or services via some kind of Internet or mobile interface. They must have a firm grasp on performance issues and how to fix them before they corrupt the customer experience.
The Challenge For Technology Companies
One of the biggest issues is that IT and staffing budgets haven’t grown on par with the increase in complexity. You can’t slack on performance, not if your company wants to attract new customers and retain existing ones. So your qualified staff, already spread thin, has to address problems immediately, not in days, weeks or months.
And yet, teams often wait until problems or even outages are reported before they start digging through individual components and system logs. This process-of-elimination troubleshooting not only prolongs the degraded experience, it can actually make problems worse. In pursuit of deeper, more real-time transparency, CIOs have tried a variety of approaches, with mixed results.
Some teams focus on application performance management (APM). They look primarily at the end-user experience, but when problems are red-flagged there, such tools lack the ability to dig deeper into the infrastructure to uncover and address the root causes. Other IT leaders emphasize device management or network operating centers above or beside APM.
Those tools are useful in their own silos, but ensuring holistic service quality and delivery requires IT to analyze the concentric rings in the data center’s “trunk”: through every layer of infrastructure abstraction and back across the legacy technology, with an emphasis on IO, the fastest-growing, most expensive and least understood layer in the stack, and the one with the greatest impact on performance. Bridging the visibility gap calls for vendor-neutral monitoring, predictive analysis tools, automated reporting mechanisms and centralized management.
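To make “vendor-neutral monitoring with an emphasis on IO” slightly more concrete, here is a minimal sketch in Python. It is purely illustrative and assumes a Linux host that exposes /proc/diskstats; the 20-millisecond threshold and five-second sampling window are arbitrary assumptions, not recommendations, and the script is not a description of any particular vendor’s product.

```python
#!/usr/bin/env python3
"""Minimal, vendor-neutral IO latency check (illustrative sketch).

Samples /proc/diskstats (Linux) twice and reports the average time each
completed IO spent in flight, similar in spirit to iostat's "await".
"""
import time

THRESHOLD_MS = 20.0  # assumed alert threshold, tune per workload
INTERVAL_S = 5       # assumed sampling window in seconds


def read_diskstats():
    """Return {device: (completed_ios, ms_spent_on_io)} from /proc/diskstats."""
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            dev = fields[2]
            reads, ms_reading = int(fields[3]), int(fields[6])
            writes, ms_writing = int(fields[7]), int(fields[10])
            stats[dev] = (reads + writes, ms_reading + ms_writing)
    return stats


def main():
    before = read_diskstats()
    time.sleep(INTERVAL_S)
    after = read_diskstats()
    for dev, (ios_after, ms_after) in after.items():
        ios_before, ms_before = before.get(dev, (0, 0))
        delta_ios = ios_after - ios_before
        if delta_ios <= 0:
            continue  # device was idle during the window
        avg_latency_ms = (ms_after - ms_before) / delta_ios
        flag = "  <-- above threshold" if avg_latency_ms > THRESHOLD_MS else ""
        print(f"{dev}: {avg_latency_ms:.1f} ms avg per IO{flag}")


if __name__ == "__main__":
    main()
```

In practice, a measurement like this would feed the centralized, automated reporting and trend analysis described above rather than printing to a terminal, and it would sit alongside application- and network-level telemetry rather than replace it.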
Helping Performance Management Take Root
The challenge of maintaining high service in the face of massive annual data growth can be particularly difficult for older businesses, many of which have been layering on technology over several years or even decades.
These organizations must now embrace cloud-based infrastructure while competing against younger, smaller and more agile rivals that aren’t burdened with legacy issues. It’s no longer sufficient to manage just one element, such as applications; every layer is critical to the total experience in its own right.
To start, companies must acknowledge that service-level agreements (SLAs) geared toward the server tier or the storage tier are no longer viable on their own. SLAs have to address the whole business if they are to optimize speed of delivery, agility and cost.
Above all, organizations must bear in mind that, wherever legacy and modern systems work together, total visibility is essential. It’s the sun, soil and water necessary for their data centers’ health—and growth.