Wikipedia is an incredible monument to human creativity and collaboration, but as one era of innovation passes into another, semantic web advocates want to augment the huge human input into the web with machine learning. Freebase, a semantically enriched common database, announced today that it will soon reach the milestone of 4 million topics in its collection. That’s roughly 60% more than English Wikipedia’s 2,445,041 articles and about 40% of the 10 million articles Wikipedia holds across 250 languages.

What is Freebase? It’s a database of information that’s organized by people and machines and is particularly well suited for machine reading. You’re not a machine – so why should you care? Read on.

What You Can Do With Freebase

Semantic web expert and RWW contributor Alex Iskold spelled out the value of Freebase in great detail here in May. The long and short of it, though, is that Freebase learns fast through a combination of automated information harvesting and organization by both machines and people. It collects information from sources like Wikipedia and MusicBrainz, and from user uploads and edits.

Programmatic access to that now structured data allows all kinds of mashups to be built that “know things.” Check out, for example:

  • Taught or Not – a cute little game that tests your knowledge of who influenced whom throughout the history of thinkers.

  • Shot or Not – another game that tests your knowledge of the causes of death of various famous people throughout history.

  • Random Walk Through Influences – a little app that displays the chain of historical influence around any artist whose name you enter.

  • Pull Quotes – If you have any interest in politics, check this out – it’s awesome!

  • Powerset – the natural language search engine acquired by Microsoft last week, which uses Freebase too.
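To give a rough sense of why structured data makes apps like these easy to build, here is a minimal sketch of an influence-chain lookup in the spirit of Random Walk Through Influences. It does not touch Freebase at all: the `INFLUENCED_BY` table and the `influence_chain` function are hypothetical names, and the three entries are typed in by hand purely for illustration.

```python
# Toy stand-in for structured "influenced by" facts. These entries are
# hand-typed for illustration and are not pulled from Freebase.
INFLUENCED_BY = {
    "Aristotle": ["Plato"],
    "Plato": ["Socrates"],
    "Socrates": [],
}


def influence_chain(name, graph=INFLUENCED_BY):
    """Follow influences backwards from `name` until none is left."""
    chain = [name]
    seen = {name}
    while True:
        influences = graph.get(chain[-1], [])
        # Pick the first influence we have not visited yet; stop if there is none.
        nxt = next((p for p in influences if p not in seen), None)
        if nxt is None:
            return chain
        chain.append(nxt)
        seen.add(nxt)


print(" <- ".join(influence_chain("Aristotle")))
```

Once relationships are stored as data rather than as prose, a question like "who influenced this person, and who influenced them?" becomes a short loop instead of a reading assignment.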

Seriously, Though

Obviously, most of these are relatively frivolous use cases. Are there serious, powerful use cases for Freebase yet? We’re not entirely sure. There are big gaps in the data, which is understandable, but the interface is so much harder to use than Wikipedia’s that there’s reason to doubt people will do the substantial human editing Freebase is counting on. The interface was much improved this summer and is now far more usable, but it’s still harder than it needs to be.

We’ve certainly got our questions about Freebase, but we’re excited about what Metaweb is doing with it. They are smart, well funded and aiming high. The community there deserves congratulations on growing to 4 million reusable articles, something that the celebrated English Wikipedia community can only aspire to.