Strata+Hadoop World, Big Data’s big conference last week, was filled with sessions dedicated to the gospel of bigness: More data equals more good. From data lakes to enterprise data hubs, the industry has made a fetish of gathering ever more data.
Because, you know, insights are bound to occur. In a twist on open source’s “given enough eyeballs, all bugs are shallow,” Big Data proclaims, “Given enough data, all data will sprout correlations and consequent insights.”
Except, of course, that it doesn’t.
As much as we want to fetishize data volumes, the reality is that data is only as useful as the people interpreting it. Yes, machines can programmatically act on correlations they “see” in large data sets. But truly revolutionary change, though it may start with Big Data, ends with Big Insights from real people.
Signal, Meet Noise
Even T.S. Eliot, one of the great poets of the twentieth century, knew this. Writing in 1934, Eliot bemoans the insight we’ve lost in spite of a wealth of data:
Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?
At least some of the struggles we have with Big Data arise from not knowing what to do with all the data we now accumulate, a problem that also shows through in a Gartner survey.
More data, it turns out, doesn’t automagically turn into more insight, as noted statistician Nate Silver declares:
If the quantity of information is increasing by 2.5 quintillion bytes per day, the amount of useful information almost certainly isn’t. Most of it is just noise, and the noise is increasing faster than the signal. There are so many hypotheses to test, so many data sets to mine–but a relatively constant amount of objective truth.
Real insight begins when people apply domain expertise to a body of data and query it intelligently. As Silver continues, “The numbers have no way of speaking for themselves. We speak for them. We imbue them with meaning.” Hence, while we can introduce biases into the data, we can also draw real insight from it.
Visualizing Data
To enable individuals to make sense of the ever-increasing mountains of corporate data, next-generation business intelligence vendors like Tableau, Roambi, and Zoomdata have arisen. These companies make it easier for the rank-and-file within an enterprise to understand data.
As Zoomdata’s Justin Langseth told ReadWrite, the point is not to deliver more data to “high priest” data scientists, but rather
to provide a beautiful, simple, yet powerful interface and underlying tech stack to allow regular business people to access, visualize, and collaborate around data that is residing and streaming into a variety of big data backends, and do that efficiently at large data and user scale.
Or as Roambi recently noted in a blog post, “As you invest in big data and analytics solutions, make sure you invest just as much into the people who will use them.”
As the company explains, “It’s up to the business to invest in training end-users how to think about and use data and analytics as much as they invest in the actual infrastructure and product.” In other words, downloading Hadoop isn’t the answer. Not the final answer, anyway.
Which may be another way of saying that companies need to prioritize their people, not their data. As Roambi coaches, data-before-people is increasingly the norm, and it causes several related problems:
- Analysts aren’t sure which metrics to provide: They may know how to pick apart data to discover insights, but don’t know how to communicate these through dashboards that tell a story to a particular job function
- Metrics aren’t being segmented based on job roles: Different roles require different data
- End-users can’t transform information into knowledge: People need training to learn how to think about data effectively
- Businesses are collecting data without changing behaviors: Organizations should change in response to the data
The foundation for resolving these issues is to better visualize data for mere mortals. Small wonder, then, that Tableau, the market leader, has seen its stock hit all-time highs recently.
By all means, keep investing in Hadoop, NoSQL databases, and other Big Data infrastructure. Just don’t forget to also invest in the data visualization software that makes that data meaningful to your employees, who will ultimately be the ones to make sense of it.
Lead photo by Seongbin Im