The Hadoop market is on a tear, growing at a compound annual growth rate of roughly 60%, according to IDC. But why it’s growing, or rather, how it’s being used, might surprise you. Given all the media hype around Hadoop and its power to predict everything from the optimal number of raisins in your cereal (23) to the exact date of Armageddon (next Tuesday – call in sick), it’s perhaps surprising to learn that comparatively few organizations use Hadoop for analytics. Today most enterprises use Hadoop for the pedestrian uses of storage and ETL (Extract, Transform, Load).
Eventually enterprises will get to sexy analytics. But we’re not there yet. Not by a long shot.
‘Poor Man’s ETL’, ‘Unsupervised Digital Landfill’, Or Both?
While commonly billed as an analytics tool, Hadoop remains “a poor man’s ETL” for the vast majority of enterprises. Yes, there are enterprises running interesting analytical workloads on Hadoop, but these are the exception, not the rule. Hence, while Cloudera cites three common use cases for Hadoop (data transformation, archiving, and exploration), I’m hearing from analysts that 75% or more of actual Hadoop adoption falls into those first two use cases.
Which is not to suggest such adoption is valueless. Quite the contrary.
The Common Adoption Path For Hadoop
As 451 Research analyst Matt Aslett highlighted at Hadoop Summit, there is a natural progression from using Hadoop to store large quantities of data (i.e., Hadoop as an “unsupervised landfill”), to processing and transforming that data, and ultimately to analyzing that data. The fact that most enterprises have yet to get to analytics in any meaningful way is simply a description of where we are in the Hadoop market’s evolution.
Indeed, Aslett notes that “attempting to fast forward to analytics, missing out on the processing/integration stage, creates silos and will result in disillusionment.”
We’re still early in Hadoop’s technological and market evolution, in part due to the complexity of the technology: 26% of even the most sophisticated Hadoop users cite the time it takes to get into production as a gating factor to its widespread use. Gartner reports even lower rates of adoption for Big Data projects, which often involve Hadoop, at a mere 6%, as enterprises grapple with both identifying appropriate use cases and understanding the relevant technology.
Start With What You Know
Small wonder, then, that enterprises are starting with known use cases like storage or ETL before proceeding to more ambitious analytics projects, as Christos Kotsakis suggests. We’re still getting comfortable with Hadoop. Applying an unfamiliar technology to a familiar problem makes a lot of sense.
Some day, we’ll get to the point where mainstream adopters commonly use Hadoop for significant analytics. But we’re not there. Not yet. Just give it time.
Image courtesy of Shutterstock.