2009 has seen a lot of Semantic Web and structured data activity. Much of it has been driven by Linked Data, a W3C project which gained momentum this year. According to Sir Tim Berners-Lee, the inventor of the Web, Linked Data is a sea change akin to the invention of the WWW itself. We’ve gone from a Web of documents to a Web of data.
The 10 products we’ve picked out for this end-of-year review are ones that have done interesting things with data: connecting to other data, building new applications with data, sharing data, and more. These 10 products may not be the type of Semantic Web apps that the W3C envisaged in the 90s, but that no longer seems to matter. What’s important is that the Web is becoming more meaningful – more semantic. See also our 2008 list.
Google Search Options and Rich Snippets
ReadWriteWeb’s Best Products of 2009:
In May, Google announced two significant additions to its search product: Search Options and Rich Snippets. The two features notably extended Google’s core search product, and Rich Snippets in particular is built around structured content.
Rich snippets extract and show useful information from web pages. Google is using structured data open standards such as microformats and RDFa to power the rich snippets feature. It is inviting publishers to mark up their HTML (webmasters can find more details here).
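To give a feel for what “marking up your HTML” means in practice, here is a minimal sketch of how a program could read RDFa-style annotations out of a page. The sample markup, its `v:` property names and the `extract_properties` helper are illustrative assumptions on our part, not Google’s parser or its official vocabulary, and the sketch only handles flat, un-nested elements.

```python
from html.parser import HTMLParser

# Hypothetical review markup; the property names are illustrative only.
SAMPLE_HTML = """
<div typeof="v:Review">
  <span property="v:itemreviewed">Example Cafe</span>
  <span property="v:reviewer">A. Reader</span>
  <span property="v:rating">4.5</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collects the text of elements that carry an RDFa `property` attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # name of the property currently being read, if any

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("property")
        if name:
            self._current = name
            self.properties[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: assumes annotated elements contain no nested tags.
        self._current = None


def extract_properties(html):
    """Return a dict mapping each `property` name to its stripped text content."""
    parser = PropertyCollector()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.properties.items()}


if __name__ == "__main__":
    for name, value in extract_properties(SAMPLE_HTML).items():
        print(f"{name}: {value}")
```

The point is simply that once a publisher labels the reviewer, the rating and the item reviewed, a search engine no longer has to guess which piece of text is which.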
Feedly
Feedly describes itself as a “magazine-like startpage.” When it launched in August 2008, we labeled it just “an alternative interface for Google Reader.” However, with the launch of Feedly Mini, a mini bar that hovers at the bottom of the screen as you surf through blogs on the web, the service has become a much more inclusive blog reading companion.
Feedly Mini integrates Twitter, FriendFeed, Google Search, Mozilla’s Ubiquity, and more. A number of our writers love this tool – Sarah Perez went so far as to call it “a must-have tool” for anyone who uses services like Twitter and FriendFeed.
Apture
In our February review, we came away impressed by Apture because of how much multimedia can be packed into such a small pop-up. The end-user experience is also sophisticated – readers on washingtonpost.com and other sites that use Apture can see rich, relevant, contextual content from the likes of Wikipedia, YouTube and Flickr without leaving the host site.
Zemanta
Zemanta is a real-time semantic analysis tool that plugs into your blogging software. As we explained in April, Zemanta offers bloggers relevant links, photos and other assets to include in their blog posts. Zemanta’s API is also being used by startups. Over 2009, the company has continued to iterate and impress. For example, in October Zemanta released a new engine and API.
Zemanta is open source and standards-based. It works well with the rest of the tech community and has some interesting tools for supporting non-profit organizations.
Note: We compared Zemanta to Apture in an August analysis post.
Open Calais 4.0
In January Thomson Reuters released their most significant update yet to the Calais web service and open API: Calais 4.0. Calais is a toolkit of products that enables publishers to incorporate semantic functionality within their properties – enabling them to categorize content as people, places, companies, facts, events, and more.
Calais 4.0 went beyond metatagging and enabled publishers to integrate their content with Linked Data assets from Wikipedia, GeoNames, the Internet Movie Database (IMDB), Shopping.com and others. Calais 4.0 also let publishers share semantic metadata about their content with “content consumers” such as search engines, news aggregators, related stories recommendation services and more.
BBC’s Semantic Music Project
The BBC has long been a major experimental force in the Web ecosystem, backed with large dollops of British taxpayer funding and an equally large measure of British geek talent. The BBC Music Beta project caught our eye earlier this year. It’s an ongoing effort by the BBC to build semantically linked and annotated web pages about artists who are played on BBC radio stations. Within these pages, collections of data are enhanced and interconnected with semantic metadata, letting music fans explore connections between artists that they may not have known existed.
The backend of this BBC project comes from the Linked Data world – specifically MusicBrainz, an open content music “metadatabase” that lists information for over 400,000 artists.
Glue
Once just a browser add-on that allowed users to surf smarter across several verticals, AdaptiveBlue’s Glue is now a site-centric product that acts as both a hub and a spoke of the social web.
As we wrote in October, Glue’s technology is based on a user’s browsing across common sites such as Amazon, Wikipedia and YouTube. Those visits and any interactions (comments, “likes,” etc.) feed back to Glue to automatically create a taste profile and a web of affinity with other users. The idea is to recommend items or content across categories including music, books and movies.
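AdaptiveBlue has not published how Glue computes its taste profiles, so the following is only a rough sketch of the general idea of a “web of affinity.” It swaps in a simple, well-known technique (Jaccard similarity over sets of liked items) and uses made-up users and items; the `jaccard` and `recommend` helpers are our own hypothetical names, not part of any Glue API.

```python
def jaccard(a, b):
    """Overlap between two sets of liked items, from 0.0 (none) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(target, profiles, top_n=3):
    """Suggest items liked by similar users that the target has not liked yet.

    `profiles` maps a user name to the set of items that user has liked.
    Each candidate item is scored by the summed similarity of the users who like it.
    """
    mine = profiles[target]
    scores = {}
    for user, likes in profiles.items():
        if user == target:
            continue
        weight = jaccard(mine, likes)
        for item in likes - mine:
            scores[item] = scores.get(item, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


# Made-up taste profiles spanning books, films and music.
PROFILES = {
    "ana": {"book:Dune", "film:Alien", "music:Kraftwerk"},
    "ben": {"book:Dune", "film:Alien", "film:Blade Runner"},
    "cy": {"music:Abba"},
}

if __name__ == "__main__":
    print(recommend("ana", PROFILES))
```

Because the items carry a category prefix, a shared taste in books and films can surface a recommendation in either category, which is the cross-category effect the product is aiming for.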
Freebase
Freebase is an open, semantically marked up database of information. It looks similar to Wikipedia, but Freebase is all about structured data and what you can do with it. Freebase has been one of the more hyped companies in the Semantic Web, leading to some skepticism that the product is too much like Wikipedia and offers little that is new. Our verdict at the end of 2008 was that Freebase still had lots of work to do, in terms of usability and useful data.
However, over 2009 Freebase has continued to build out its ambitious vision. It recently announced the publication of its 10 millionth topic, up from 4 million a year ago. While it’s obviously still working on its user experience for the consumer audience, Freebase continues to get the respect of Semantic Web and Linked Data aficionados.
DBpedia
DBpedia is not a commercial product, but it’s one of the largest sources of Linked Data on the Web – so we think it deserves to be recognized in our top 10. Essentially, DBpedia extracts structured information from Wikipedia and makes that data available on the Web.
At the present time, DBpedia has more than 2.9 million “things” – including at least 282,000 persons, 339,000 places, 88,000 music albums, 44,000 films, 15,000 video games, 119,000 organizations, 130,000 species and 4,400 diseases.
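One way developers get at this data is with SPARQL queries. The sketch below only builds a query and a request URL; it does not contact any server. It assumes the public endpoint at `https://dbpedia.org/sparql`, class names from the DBpedia ontology such as `Film`, and a `format` request parameter for JSON results. Treat those details as assumptions to check against the DBpedia documentation; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_query(class_name, limit=5):
    """Return a SPARQL query listing resources of one DBpedia ontology class."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?thing ?label WHERE {\n"
        f"  ?thing a dbo:{class_name} ;\n"
        "         rdfs:label ?label .\n"
        '  FILTER (lang(?label) = "en")\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(class_name, limit=5):
    """Return a GET URL that would submit the query to the endpoint."""
    params = {
        "query": build_query(class_name, limit),
        # Assumption: the endpoint accepts a `format` parameter for JSON results.
        "format": "application/sparql-results+json",
    }
    return f"{DBPEDIA_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query("Film", limit=3))
    print(build_request_url("Film", limit=3))
```

Fetching the printed URL with any HTTP client would, under the assumptions above, return a handful of the films counted in the figures quoted here.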
Data.gov
The trend towards opening up government data has been a significant one during 2009. The inventor of the Web, Tim Berners-Lee, is currently helping the British government to do this – using Linked Data. It’s a subject he’s very passionate about, as we discovered when we interviewed him in June.
The Obama administration has had a mixed record so far in terms of opening up government data. However, the launch of Data.gov in May was a positive step in 2009. It was announced by the Chief Information Officer (CIO) of the United States, Vivek Kundra, and it aims to “increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.” Although we found Data.gov to be too limited an offering, it’s a start.
Those are our picks for the top Semantic Web (or structured data) products of 2009. Let us know in the comments if you disagree with our choices, or think we missed something important.