In May last year we wrote about the state of Linked Data, an official W3C project that aims to connect separate data sets on the Web. Linked Data is a subset of the wider Semantic Web movement, in which data on the Web is encoded with meaning using technologies such as RDF and OWL. The ultimate vision is that the Web will become much more structured, which opens up many possibilities for “smarter” Web applications.
At this stage last year, we noted that Linked Data was ramping up fast – evidenced by the increasing number of data sets on the Web as at March 2009. Fast forward a year and the Linked Data “cloud” has continued to expand. In this post we look at some of the developments in Linked Data over the past year.
Governments Get on Board
The most high-profile usage of Linked Data over the past year has come from two governments: the United States and United Kingdom.
The U.S. was first to open up some of its non-personal data for use by developers, with the May 2009 launch of Data.gov. In January 2010, the U.K. government announced Data.gov.uk – with the help of Sir Tim Berners-Lee, the inventor of the World Wide Web. At launch, Data.gov.uk had nearly 3,000 data sets available for developers to build mashups with. At the time, that was more than three times as much data as the U.S. site offered.
Following on from the launch of Data.gov.uk, U.K. Prime Minister Gordon Brown announced a new British Institute for Web Science along with $45 million in government backing. The Institute will be led by Berners-Lee and prominent researcher Nigel Shadbolt. This was great news for Linked Data, because according to Prime Minister Brown, the Institute “will help place the U.K. at the cutting edge of research on the Semantic Web and other emerging web and internet technologies.”
There have been commercial success stories too, such as OpenCalais for media, MusicBrainz for music and GoodRelations for e-commerce. There are also many commercial sites tapping into the general knowledge data store at dbpedia.org.
However, it’s still relatively early days for commercial applications of Linked Data. We’re beginning to see smart people explore potential use cases, such as this list for news organizations, but much of the early implementation is being done by publicly funded entities such as the U.K.’s BBC.
The latest version of the Linking Open Data dataset cloud, as at July 2009, maintained by Richard Cyganiak and Anja Jentzsch.
Just Get The Data Up There
To reiterate, Linked Data is data that has been connected to other data sets using Semantic Web technologies such as RDF (Resource Description Framework) or RDFa (a simpler variation). Minus the acronyms, Linked Data is simply structured data.
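To make that a little more concrete, here is a minimal Python sketch of what "connected to other data sets" can look like in practice. It is an illustration only, not how any particular publisher does it: the example.org URIs and the helper names (`to_ntriples_line`, `external_links`) are made up for this post, while the `owl:sameAs` and `rdfs:label` property URIs and the DBpedia resource URI are real. Each statement is a subject-predicate-object triple, and the "link" is simply a triple whose object lives in someone else's data set.

```python
# Standard vocabulary URIs (real).
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical local data set; the example.org URIs are placeholders.
LOCAL_PREFIX = "http://example.org/"

triples = [
    # A plain-text label for our own resource (literals are kept in double quotes).
    ("http://example.org/city/london", RDFS_LABEL, '"London"'),
    # The link: our resource describes the same thing as DBpedia's.
    ("http://example.org/city/london", OWL_SAME_AS, "http://dbpedia.org/resource/London"),
]


def to_ntriples_line(subject, predicate, obj):
    """Serialize one triple as an N-Triples line.

    Assumption for this sketch: an object starting with a double quote
    is a literal, anything else is a URI.
    """
    obj_term = obj if obj.startswith('"') else f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_term} ."


def external_links(all_triples, local_prefix):
    """Return the triples whose object is a URI outside the local data set."""
    return [
        t for t in all_triples
        if not t[2].startswith('"') and not t[2].startswith(local_prefix)
    ]


if __name__ == "__main__":
    for triple in triples:
        print(to_ntriples_line(*triple))
```

Here `external_links(triples, LOCAL_PREFIX)` picks out the single `owl:sameAs` statement, which is the part that turns an isolated data set into a linked one.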
However, one of the reasons the Semantic Web hasn’t yet been widely adopted, at least commercially, is that it’s often difficult or time-consuming to mark up data semantically. RDF in particular has a reputation for being painful to code. With that in mind, the past year has been as much about prompting governments and organizations to put their data up on the Web in whatever form they can as about semantic markup itself.
Indeed, when I interviewed Berners-Lee last July, he told me that he’d be happy if governments “just put data up in whatever form it’s available.” He mentioned that “Comma separated values (CSV) files are remarkably popular.” He’d be much happier if it were semantically marked up data, using the likes of RDF, but conversion can happen after it’s been uploaded to the Web.
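As a rough idea of what that after-the-fact conversion might involve, the Python sketch below turns a CSV file into N-Triples, one of the simplest RDF serializations. Everything specific in it is an assumption for illustration: the `csv_to_ntriples` and `escape_literal` helpers, the example.org base URI, the `row/` and `column/` URI pattern, and the sample rows are all invented, and a real conversion would map columns onto an agreed vocabulary rather than minting a property per column.

```python
import csv
import io
from urllib.parse import quote


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text, base_uri, id_column):
    """Convert CSV text into a list of N-Triples lines.

    Assumed URI scheme (hypothetical): each row becomes
    <base_uri>row/<id> and each column becomes <base_uri>column/<name>.
    Empty cells produce no triple.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        subject = f"<{base_uri}row/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the id column, empty cells, and overflow cells (key None).
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{base_uri}column/{quote(column, safe='')}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


# Invented sample data, not taken from any real government data set.
SAMPLE_CSV = "school_id,name,pupils\n1,Alpha School,420\n2,Beta School,\n"

if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE_CSV, "http://example.org/", "school_id"):
        print(line)
```

The point of the sketch is that the CSV file stays the source of truth: the data can go up on the Web first, and a script along these lines can add the semantic layer later.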
So overall, Linked Data is still early in its adoption curve. However, it has undeniably become a solid on-ramp to the wider Semantic Web and the world of structured data.
For a good technical overview of the current state of Linked Data and the Semantic Web, see this presentation by Davide Palmisano.