Thomson Reuters' Calais, a semantic markup API that we first reviewed in February, has reached its 2.0 release. The latest version aims to fix one of the main issues with Calais: it was too focused on business. Because Calais has its roots in ClearForest, the rules it applies while parsing text are biased toward the language of business, which limited its usefulness for other kinds of content. Version 2.0 adds new semantic entity types in an effort to rectify that.

Calais 2.0 has a dozen new semantic entity types, which Reuters says will increase its utility for "pop-culture publishers and bloggers covering media, music, entertainment and sports, as well as those covering pharmaceuticals, medicine and healthcare." In addition to its expanded semantic identification capabilities, Calais 2.0 can now output results in the Simple Tags format and as Microformats, as well as in the original RDF.

More than 3,200 developers have signed up to work with Calais since launch, according to product lead Thomas Tague, who said in a press release that Calais and plugins and services built on the API will "make it easy to kick-start metatagging and enter the era of the Semantic Web."

Along with an updated web site and a handful of new code samples and libraries, Thomson Reuters is announcing three new plugins that use Calais.

  • Calais Marmoset is a tool that enables developers to automatically create metadata for use with Yahoo!'s open search platform, SearchMonkey (our coverage).
  • Thomson Reuters is also announcing the official release of Tagaroo, a WordPress plugin that allows bloggers to automatically tag relevant people, places and things in their posts, as well as pull in semantically relevant Flickr photos. We wrote recently about an unofficial WordPress plugin for Calais, and noted that its utility would be limited mainly to business and tech bloggers because those were the API's strengths. Calais 2.0 should theoretically make both plugins more useful to a wider variety of bloggers.
  • Though they have been available since last month, Thomson Reuters is also officially introducing its Calais plugins for Drupal, a popular content management system. The plugins were developed with Phase2Technology.

Calais is an awesome top-down semantic API that can help fuel the bottom-up approach by combing through unstructured data and spitting out structured tags. We're excited about the second version of Reuters' product and the added utility that the new semantic entity types should bring.