Everyone may be talking about Big Data, but its actual adoption may be grossly overstated. Jobs data certainly suggests that Big Data is in demand, though other sources indicate that enterprises are still feeling their way.
One thing that isn’t a matter of faith-based computing, however, is Puppet, an open-source tool for automating server configuration. It’s one of several such frameworks underpinning the “DevOps” phenomenon, in which developers assume more responsibility for managing IT infrastructure in order to push out and oversee their applications more effectively.
As Dice data suggests, Puppet adoption is booming, giving organizations an easy way to manage IT infrastructure at scale. That’s true whether the task they’re tackling is called “Big Data” or something dull like “running lots of servers.”
Puppet Pulls The Strings
Dice, focused on technology professionals, is a reasonable barometer for tracking the rise and fall of technologies. Rather than gauging popularity through Google searches or other soft factors, Dice tracks the roughly 80,000 jobs posted daily on Dice.com and then identifies the 10 top job-skills “big movers” on a year-over-year basis according to how frequently they’re mentioned in job postings.
According to the latest Dice report, Puppet is pulling the strings. Take a look at this Dice chart of the “fastest growing tech skills”:
Two things stand out for me in these numbers:
- As hot as Big Data and related technologies are, the old-school market of IT management remains really hot. Perhaps that’s because …
- Puppet makes Big Data real. Underlying all that data are servers, and servers need to be managed. Puppet makes it easy to manage IT infrastructure at scale, and already sits at the heart of Hadoop-related management tooling like Bigtop.
Whether companies overtly identify themselves as “Big Data” operations or not, they’re starting to realize that “they must automate or go extinct,” DevOps pro Sean Carolan told me on Twitter. “Shell scripts won’t cut it in the era of continuous [software] delivery.” Though Puppet has significant competition in Chef, Ansible and Salt, it’s currently the market leader.
I asked Puppet Labs CEO (and Puppet founder) Luke Kanies for his interpretation of the data, and he offered this:
The space Puppet is in—automation—is so different from most of those other spaces, it’s hard to compare. Companies have been doing databases for decades, so NoSQL’s adoption path is both helped by and stymied by that long history. They’re basically in a replacement business, whether they want to be or not.
With Puppet, we’re filling a gap for people. Most people who adopt Puppet are moving from doing things by hand or writing custom scripts to using industry-standard automation that has a broad community and a great ecosystem. They don’t have to stop using something to start using automation, and they don’t obsolete existing skills, so it’s culturally easier. And they’re getting swamped right now, so they know they have to solve this—it fills a fundamental need, rather than being a better way of doing something you’ve always done.
Kanies suggests some key reasons for thinking Puppet might be “hotter.” But is it really bigger than Big Data?
Stacking Puppet Against Big Data
One way to view Puppet’s outsized growth is to recognize that such growth is relative to more established markets, as consultant Kris Buytaert noted on Twitter: “[T]he [configuration management] market is in its infancy and growing fast with plenty of room left.” So the fact that Puppet tops Dice’s list may merely indicate that it’s growing from a smaller base.
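The small-base effect is easy to see with a little arithmetic. The sketch below uses posting counts that are entirely hypothetical (they are not Dice or Indeed figures) to show how a skill can post the highest growth rate while adding far fewer jobs than a larger rival:

```python
def yoy_growth(previous: int, current: int) -> float:
    """Year-over-year growth as a fraction of the previous year's count."""
    return (current - previous) / previous


# Hypothetical posting counts (last year, this year), invented for illustration.
postings = {
    "small-base skill": (500, 1_000),
    "large-base skill": (10_000, 14_000),
}

for skill, (previous, current) in postings.items():
    added = current - previous
    rate = yoy_growth(previous, current)
    print(f"{skill}: +{added} postings, {rate:.0%} growth")
```

With these made-up numbers, the small-base skill doubles while adding only 500 postings, and the large-base skill grows 40% while adding 4,000. A ranking by growth rate and a ranking by absolute demand would put them in opposite order.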
Also, Dice’s data doesn’t necessarily jibe with jobs data from Indeed.
Indeed tracks jobs across more than 1,000 job sites, including Dice, so it has a much larger data set to work from in analyzing job trends. According to Indeed, Big Data and Hadoop are much bigger than Puppet in terms of absolute job postings:
And even in terms of relative growth—exactly what Dice purports to measure—Indeed shows Hadoop and NoSQL leading the way:
Even this doesn’t tell the full story, however. Hadoop, for instance, isn’t a single thing. It’s an ecosystem of technologies that includes everything from Hive (which facilitates querying and managing large datasets residing in distributed storage) to HBase (a key-value data store) to Pig (a platform for analyzing large data sets) to a range of ever-evolving, expanding technologies.
The same is true of “NoSQL.” The differences between NoSQL databases are more pronounced than their similarities. A document database is very different from a key-value data store. Posting a generic “NoSQL” database job essentially means the enterprise doesn’t really know what it needs. The same is true for anyone requesting “Big Data” expertise.
Which might actually be the point.
I Still Haven’t Found What I’m Looking For
Enterprises working with Big Data aren’t exactly sure what they need to be successful. As I’ve written before, Gartner’s data on this seems pretty clear: everyone knows they need to be doing something with Big Data, but how to do it or what to do remain mysteries.
Not surprisingly, then, actual Big Data adoption lags media hype about it, as 451 Research analyst Michael Coté details in this chart of Big-Data-associated storage use:
Even if companies are still exploring Big Data territory, they increasingly see the need to manage their infrastructure more efficiently than in the past. Puppet is the “how” of serious infrastructure management—or, rather, a significant “how.”
I think it’s fairly easy to rationalize the apparent discrepancies between the Dice and Indeed data simply by acknowledging that nearly all of the job postings related to Big Data are somewhat noisy and aspirational in nature. Organizations know that they need to do something meaningful with Big Data and are trying to hire for this without always knowing precisely what they need.
When they’re looking for Puppet expertise, however, they know exactly what they need: something to help them configure and manage an army of servers. Those servers may ultimately mean “Big Data,” but whatever their trendy name, they need managing.
Lead image courtesy of Shutterstock