As the poster child for big data, Hadoop has been both a blessing and a curse for enterprise big data adoption. The technology is powerful but complex, and many enterprises have preferred to wait for something easier to come along before rolling out a big data project.
The wait is over. Hadoop has been progressing at such a torrid pace, with significant ease-of-use enhancements from vendors like Hortonworks and Cloudera, that its learning curve has been cut in half. As I’ve written before, enterprises are increasingly shedding their big data training wheels, moving from basic ETL (extract, transform, load) workloads to the advanced data analytics Hadoop was meant to tackle.
The trick to big data for enterprises using Hadoop, it turns out, is to start small.
Baby Elephants Everywhere
Small? That’s not a word typically associated with Hadoop. And yet it perfectly matches the reality of big data. For all the talk about petabytes and zettabytes, most enterprises don’t have petabyte-scale problems. At least, they don’t have problems of that scale that they know how to manage today.
Rather, as this NewVantage Partners survey suggests, enterprises are primarily concerned with mastering new types of unstructured data. Gartner confirms this, noting that “Many organizations find the variety dimension of big data a much bigger challenge than the volume or velocity dimensions.”
As such, smart Hadoop vendors have been tinkering with their messaging, helping enterprises start with smaller-scale deployments and grow from there. As Hortonworks vice president of strategy Shaun Connolly told me in an interview:
We’ve seen a repeatable pattern of adoption that starts with a focus on a new type of data and creating or enhancing a targeted application around that new type of data. These new applications are typically driven by a line of business and start with one of the following new types of data: social media, clickstream, server logs, sensor and machine data, geolocation data, and files (text, video, audio, etc.).
Ultimately, deploying more applications and new types of data leads to a broader modern data architecture. But the successful customers started their journey by unlocking value from specific types of data, then rinsing and repeating from there.
Starting with small, measurable Hadoop projects is a great way to demonstrate the technology’s value without forcing enterprises to swallow the entire elephant upfront. It’s a smart strategy for a powerful technology that can all too easily overwhelm would-be adopters with its massive capabilities.
Making Big Data Small
In so doing, Hadoop is becoming the elephant in the room that people actually want to talk about. It remains true that far more people are talking about big data than actually rolling out significant big data projects: Gartner highlights that only 8% of enterprises have deployed a big data project, despite 64% declaring their intention to do so. But the percentage of companies engaging in Hadoop-based big data projects should grow now that Hadoop’s primary proponents are selling substantive, achievable business value rather than Hadoop hype.
In fact, most big data projects today center on incremental advances to existing use cases: better understanding customer needs, making processes more efficient, further reducing costs, or better detecting risks. For all the talk about dramatically transforming one’s business, most big data deployments, and by extension most Hadoop deployments, are focused on incremental improvements, not change-the-world projects.
Which makes sense. Enterprises first take baby steps with Hadoop on achievable projects, then master the technology, then go big.
In 2014 we’re going to see Hadoop adoption accelerate. In conversations with Connolly at Hortonworks and Mike Olson at Cloudera, both told me their businesses boomed in 2013, with the pace speeding up even more in the final two quarters of the year. Such acceleration partly reflects improvements to their marketing messages, which have centered on how enterprises can more easily gain value from Hadoop, but it also suggests that the bar to getting value from Hadoop has been lowered.
My prediction? The more Hadoop is focused on smaller-scale deployments, the more it will ultimately get used for big deployments.