Calais, a project sponsored by Reuters, offers a few handy plugins that let you use its API to auto-tag all the posts in your blog (see our coverage). It goes through your content, extracts the relevant keywords, and adds them as tags in your CMS.

But Open Calais isn’t open source. Here are a few open source tools you can use to extract key terms from text. As far as I know, none have been turned into CMS plugins… yet.


Tagger is a fairly new Python project by Alessandro Presta. Right now it only works in English.

Via the comments on Hacker News, I found a similar Ruby-based project…


Phrasie is a very simple Ruby-based key term extractor. It turns out Phrasie is based on a different Python library called…

Topia’s Term Extractor

Topia’s Term Extractor is an older Python package for extracting key terms, written by Stephan Richter, Russ Ferriday and the Zope Community.
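If you're wondering what a key term extractor does at its very simplest, here's a rough sketch in plain Python. To be clear, this is not how Tagger, Phrasie or Topia's Term Extractor work internally; it's just a naive word-frequency approach, and the `extract_tags` function name and the tiny stopword list are my own assumptions for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption for this sketch);
# real extractors rely on much larger lists or part-of-speech tagging.
STOPWORDS = {
    "the", "and", "for", "that", "this", "with", "from", "are", "was",
    "is", "it", "in", "of", "to", "on", "as", "by", "an", "or", "be",
}

def extract_tags(text, max_tags=5):
    """Return up to `max_tags` of the most frequent non-stopword terms."""
    # Lowercase the text and keep words of two or more characters.
    words = re.findall(r"[a-z][a-z'-]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [term for term, _ in counts.most_common(max_tags)]
```

You could call `extract_tags(post_body)` on each post and pass the result to whatever tagging mechanism your CMS provides. The projects above go further than raw counts, so treat this only as a starting point.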

See Also

Overview of Text Extraction Algorithms

Image by Andrew Mason