Beyond Twitter Search: Semantic Analysis of the Real-Time Web

Many of you had probably never heard of the Ellerdale project until this week, when Twitter announced it was one of the company’s new partners in receiving the “firehose” of Twitter data, the full stream of tweets that was, prior to Monday, available only to major players like Yahoo!, Google and Microsoft.

What Ellerdale is now doing with Twitter’s 50 million tweets per day is definitely interesting. The service uses an intelligent data-parsing engine to analyze the context of tweets and the links they contain, then combines that with other data sources like RSS feeds and Wikipedia. The result is a real-time search engine and trends tracker that provides more than just a list of tweets: it provides an understanding of the world’s conversations.

Launched in late 2009 and still in alpha testing, Ellerdale tracks data sources from around the web, primarily Twitter, and examines which topics are being discussed. It then organizes these conversations into categories like “people,” “sports,” “politics,” “music,” “television,” and more. Within each category are conversation topics and sub-topics. For example, in the “people” section, “Sarah Palin” is a topic of conversation right now and the sub-topics are “The Tonight Show” and “Jay Leno,” referencing her recent appearances on those TV programs.

You can click through on any of the topics or sub-topics to learn more about what’s being discussed. Although Ellerdale’s best feature is its ability to highlight these sorts of trends, you can also use it to search the real-time web for your own keywords. Here, unlike Twitter’s own search engine, Ellerdale won’t just return a simple list of tweets in response to a query.

Instead, any topic page on Ellerdale returns an incredible amount of data. There are summaries provided from sources like Wikipedia, Freebase (an online semantic database), The New York Times’ people search and more. Related topics, in the form of thumbnail images, are listed above the live-updating message stream on every topic’s main page. To the right, a graph charts that keyword’s popularity over time, and you can adjust it to show data from the past hour, day, week or month. Also to the right is a list of top articles from around the web, ranked by how many times they’ve been mentioned on Twitter. You can even subscribe to that article list via an included RSS feed.
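To make the two sidebar features concrete, here is a minimal Python sketch of how a "top articles" ranking and a popularity count over a time window could be computed from a batch of tweets. This is an illustration only, not Ellerdale's actual method, which the company has not published. The function names `rank_links` and `mentions_in_window`, the URL pattern, and the plain-string tweet format are all assumptions made for the example.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumption: links are plain http(s) URLs embedded in the tweet text.
URL_PATTERN = re.compile(r"https?://\S+")


def rank_links(tweets):
    """Rank links by the number of tweets that mention them.

    `tweets` is an iterable of tweet texts (hypothetical input format).
    Returns a list of (url, mention_count) pairs, most mentioned first.
    """
    counts = Counter()
    for text in tweets:
        # Strip trailing punctuation, then count each link once per tweet.
        urls = {url.rstrip(".,!?)") for url in URL_PATTERN.findall(text)}
        counts.update(urls)
    return counts.most_common()


def mentions_in_window(timestamps, now, window):
    """Count mentions whose timestamp falls within `window` before `now`."""
    start = now - window
    return sum(1 for ts in timestamps if start <= ts <= now)


if __name__ == "__main__":
    sample = [
        "Palin on Leno tonight http://example.com/a",
        "Worth a read: http://example.com/a",
        "Something else entirely http://example.com/b",
    ]
    for url, count in rank_links(sample):
        print(count, url)

    now = datetime(2010, 3, 5, 12, 0)
    seen = [now - timedelta(minutes=30), now - timedelta(hours=5)]
    print(mentions_in_window(seen, now, timedelta(hours=1)))
```

Switching the graph between hour, day, week and month would then amount to calling `mentions_in_window` with a different `timedelta`. A production system working on the full firehose would also need to expand shortened links and count incrementally rather than rescanning a batch, both of which this sketch leaves out.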

And let’s not forget the main dish: the live-updating stream of tweets. The message stream shows who tweeted what, when, and with which Twitter client, which is the same information you would see on Twitter.com. However, where Twitter’s own homepage and search results pages stay put until you refresh them, this message stream moves in real time as tweets come in. If it goes too fast for you (a real possibility when you watch a currently hot trend), you can pause the stream with the click of a button.

For data hounds, search results like these are tantalizing to say the least. And this engine is now just one of many that have access to Twitter’s entire stream of tweets. The other new Twitter partners are also search and discovery services, including Collecta, Kosmix, Scoopler, twazzup, CrowdEye, and Chainn Search, all of which parse the Twitter data feed in their own way. Is any one better than another? That’s hard to say. Each has its own niche, site design and unique features, which allow it to appeal to select groups of searchers. Ellerdale is interesting because of its semantic capabilities, but it’s not the only one to offer those. Kosmix, for example, has been developing its semantic-based news portal for the last three years.

The best part about all these new partnerships is that we’re about to see an entirely new way to search the web emerge. For quick real-time results, there will always be the major search engines and their more basic lists of tweets, but for true data analysis, we now have incredible new options like Ellerdale and all the others.
