Big data and sentiment analysis can do amazing things, whether it’s in the enterprise or in the quest to create compelling applications and experiences for consumers. But can technology trends such as these actually predict major real-world events?
As sci-fi as it may sound, that’s exactly what researcher Kalev Leetaru was able to accomplish with a little help from SGI’s Altix UV supercomputer packing 8.2 teraflops of processing power. Leetaru, a digital media analytics expert at the University of Illinois, wrote software that scans more than 100 million news articles and uses sentiment analysis, text geocoding and predictive analytics to determine when political upheaval will go from rowdy to revolutionary.
Editor’s note: This story is part of a series we call Redux, where we’re re-publishing some of our best posts of 2011. As we look back at the year – and ahead to what next year holds – we think these are the stories that deserve a second glance. It’s not just a best-of list, it’s also a collection of posts that examine the fundamental issues that continue to shape the Web. We hope you enjoy reading them again and we look forward to bringing you more Web products and trends analysis in 2012. Happy holidays from Team ReadWriteWeb!
Leetaru’s software was able to churn through all that data and visually demonstrate a sharp increase in negative tone preceding recent uprisings in Egypt, Tunisia and Libya. It analyzed thousands of international news articles from the last 30 years pertaining to those countries and algorithmically mined for certain phrases denoting both positive and negative tones. It then geocoded the text to tie these sentiments to specific geographic locations in the world.
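To make the idea concrete, here is a minimal Python sketch of phrase-based tone scoring combined with city-name geocoding. It is not Leetaru’s software: the word lists, the tiny city-to-country gazetteer and every function name below are hypothetical stand-ins chosen for illustration, and a real system would rely on far larger dictionaries and proper place-name disambiguation.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicons; a real tone dictionary holds thousands of terms.
POSITIVE = {"peace", "growth", "hope", "reform"}
NEGATIVE = {"protest", "crisis", "violence", "corruption"}

# Hypothetical gazetteer mapping city names to countries (illustrative only).
GAZETTEER = {
    "cairo": "Egypt",
    "alexandria": "Egypt",
    "tunis": "Tunisia",
    "tripoli": "Libya",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tone(tokens):
    """Return (positive hits - negative hits) / word count, or 0.0 if empty."""
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def geocode(tokens):
    """Return the set of countries whose cities are mentioned in the tokens."""
    return {GAZETTEER[t] for t in tokens if t in GAZETTEER}


def tone_by_country(articles):
    """Average the tone of every article that mentions a city in each country."""
    scores = defaultdict(list)
    for text in articles:
        tokens = tokenize(text)
        score = tone(tokens)
        for country in geocode(tokens):
            scores[country].append(score)
    return {country: sum(v) / len(v) for country, v in scores.items()}
```

In this sketch an article that names no known city contributes to no country’s score, so a piece that only mentions “Egypt” in passing is simply ignored.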
Well sure, you might say, wasn’t it obvious that the Egyptian revolution was coming to anybody following current events? In early 2011, perhaps it was. What this software was able to pinpoint was an increase in negative tone during the entire decade that preceded these revolutions.
Writes Peter Murray on the Singularity Hub:
Tone monitoring was performed on 52,438 articles worldwide between January 1979 and March 2011 that contained any mention of an Egyptian city. The software selected for Egyptian cities rather than the word “Egypt” to filter out articles that only casually mentioned Egypt the way a travel guide might do. Between January 1 and January 24 of 2011, global tone about Egypt dropped to an extent that had only been seen twice in the past 30 years.
This is Culturomics at work. One of its better-known applications is the Google Books Ngram Viewer, a Google Labs project that scans 15 million digitized books to reveal the frequency of certain words and phrases over time. By applying a similar methodology to news articles, researchers can gain insight into human society on an even bigger scale and in a more real-time fashion.
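For a rough sense of what tracking a phrase over time involves, the Python sketch below counts how often a phrase appears per year in a set of dated texts, relative to the number of same-length word sequences published that year. The function name and the (year, text) input format are assumptions made for this example; it does not reflect how the Ngram Viewer or Leetaru’s system is actually built.

```python
import re
from collections import Counter


def phrase_frequency_by_year(documents, phrase):
    """Return {year: share of n-grams equal to `phrase`} for dated texts.

    `documents` is an iterable of (year, text) pairs (an assumed format).
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    hits = Counter()
    totals = Counter()
    for year, text in documents:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Build every run of n consecutive words in this text.
        ngrams = list(zip(*(tokens[i:] for i in range(n))))
        totals[year] += len(ngrams)
        hits[year] += sum(1 for gram in ngrams if gram == target)
    # Skip years with no n-grams at all to avoid dividing by zero.
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}
```

Dividing by the yearly total matters because the volume of published text changes from year to year; raw counts alone would mostly reflect how much was written.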
“A growing body of work has shown that measuring the ‘tone’ of this real-time consciousness can accurately forecast many broad social behaviors, ranging from box office sales to the stock market itself,” Leetaru writes in the introduction to his recent study on how the tone of global news coverage can be used to predict events.
“Despite being hailed as a social media revolution, monitoring the tone of only mainstream media around the world would have been enough to suggest the potential for unrest in Egypt,” continues Leetaru.
The academic paper, which is well worth at least a skim for those interested in this topic, goes into detail about how this method can be applied to retroactively foresee turmoil in the Middle East and the Balkans and even allegedly narrow down the location of Osama Bin Laden’s hideout, at least within a range of 200 kilometers.
Of course, this is a relatively new area of study and the methodology has yet to be used to actually predict future events. Either way, there’s no doubt that we stand to gain substantial new insights when the real-time, Web-based dissemination of news meets large-scale sentiment analysis.