Last year, I was slated to attend the first O’Reilly Strata Conference, but the 2011 Snowpocalypse intervened and said “no flights for you, St. Louis.” Not only did I miss the inaugural Strata Conference, but it seems like I missed out on all the hype and irrational exuberance for big data as well.
The first day of the 2012 conference was dedicated to half-day tutorials and the all-day Strata Jumpstart. The Jumpstart sessions were geared for business leaders looking to see “how information can transform the enterprise.”
The overarching theme for the Jumpstart sessions? As Rob May said on Twitter, “be wary of the religion of data.” That’s not quite the message one might expect when attending Strata, but it’s a good one.
To be clear, nobody was saying “big data is over” or that it’s useless. But the message from most of the speakers was that it’s deeply important to know what data can do for you, and what it can’t, before you decide you’ve got to get you some Hadoop.
Marketers and Analysts
Avinash Kaushik, co-founder of Market Motive, says that we have enormous data, but very little insight.
Kaushik says that if you have a budget for data, spend 90% of it on people who can work with the tools and derive insight from the data, rather than spending the bulk of the budget on technology to gather data.
He also questions the need for real-time data. If you don’t have the ability to act on real-time data, then don’t try to gather it. Instead, Kaushik argues for “right-time” data: data that is available when decisions need to be made.
The exception? When “you can get rid of humans” in the decision-making process. If you can make decisions algorithmically based on real-time data, then it might be worth it. But, Kaushik says, “if humans are involved, you’re screwed.”
Ammo for the CFO
Continuing the theme of cautiously adopting big data, J.C. Herz spoke on “Ammunition for the CFO: How to be a Hard-Nosed Business Customer for Analytics.” Herz, in particular, played devil’s advocate to the question of whether companies need big data and analytics.
Herz is CEO of analytics company Batchtags. At the last Strata, says Herz, attendees were pumped up by sessions extolling the virtues of big data, and everyone came out saying “we’ve gotta get us some Hadoop” without really understanding what they wanted. “Hold on, cowboy,” she says, “let’s figure out what you want to accomplish before we ‘get us some Hadoop.’”
One step companies need to undertake is a data audit. Companies may think they have “big data” but “sometimes it’s not as big as you think it is.” One company Herz worked with had bought into infrastructure to support “massive flows of data” but after spending millions of dollars “they had something like 2TB of data.”
Companies need to know how much data they’re working with, how fast it’s being generated, and how many places the data is coming from.
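To make those three audit questions concrete, here is a minimal sketch of what a back-of-the-envelope data audit might look like in Python. It is not something Herz presented; the source names and every figure in it are made up purely for illustration, and the DataSource structure is a hypothetical stand-in for whatever inventory a company actually keeps.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """One place data comes from (hypothetical structure for illustration)."""
    name: str
    stored_gb: float        # how much data is held today
    daily_growth_gb: float  # how fast new data arrives


def audit(sources):
    """Answer the three audit questions: how much, how fast, how many places."""
    return {
        "total_gb": sum(s.stored_gb for s in sources),
        "daily_growth_gb": sum(s.daily_growth_gb for s in sources),
        "source_count": len(sources),
    }


# Made-up inventory; none of these figures come from the talk.
sources = [
    DataSource("web_logs", stored_gb=1200.0, daily_growth_gb=4.0),
    DataSource("crm_export", stored_gb=300.0, daily_growth_gb=0.5),
    DataSource("pos_transactions", stored_gb=500.0, daily_growth_gb=1.5),
]

summary = audit(sources)
print(summary)
```

Even a rough tally like this is enough to check whether the data is really as big as everyone assumes before any money goes to infrastructure.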
Next question? Who owns the data? Who’s taking responsibility for cleaning the data and making sure it’s accurate? Who’s in the position of saying “no, you can’t have it”? Herz described a few horror stories about companies that thought they had rich data sources, but when they really dug in, they found that human laziness meant the data was missing or inaccurate.
By the same token, companies need to ask who’s going to do analysis on data. When you’re deciding on a big data strategy, Herz says that companies need to decide exactly who is going to be doing the analysis, by name, and who they’ll be reporting to.
Another question: are you using the data to make a decision, or to avoid one? Herz says that analytics are good when management wants to make a decision, but a waste of money when companies are gathering data so that decisions can be put off.
Time is a resource, says Herz. One of the worst scenarios is when management gets the “big data religion,” throws “obscene” amounts of money at it, and wants results yesterday. It doesn’t work like that.
Companies also need to realize that data decisions “have consequences,” says Herz. If you’re embarking on a data strategy, she warns, you need to understand that it might piss off a few people when the results come in, and you need to be OK with that. As Kaushik says, when humans are involved…
That doesn’t mean that Herz is against companies embracing analytics, just that they need to be thoughtful when doing it.
When talking with vendors about big data and analytics platforms, Herz says that companies should factor in the “data iron triangle” of storage, cycles, and performance. Ask vendors to come up with three cost scenarios, each of which minimizes one corner of the triangle. Most of the time you can make sacrifices in one area and still get the results you want.
It’s interesting to see just how fast big data is moving past the hype. If you’re following along with Gartner’s hype cycle, we should be hitting the “trough of disillusionment” shortly. However, I think it’s likely we’re going to skip that, or see a very abbreviated trough. It seems that a lot of companies are hitting “enlightenment” already and moving toward productivity very quickly.