Home Australian Museum Uses Open Calais to Tag Collection

Australian Museum Uses Open Calais to Tag Collection

The Powerhouse Museum of Science and Design in Sydney, Australia has begun to utilize the Reuters Open Calais API (our coverage) to tag their collection. The museum’s online collection database houses some 66,303 objects, so tagging them all by hand would be quite a task. By using the Open Calais web service, the museum is able to automate much of the process.

That the museum has so much of its collection online is actually quite impressive in its own right. About 70% of the museum’s electronically documented collection is online in the database which went live in June 2006. Museum objects are searchable, taggable (by humans) and painstakingly described.

However, there are so many objects, that even though users can help to tag them, many of them haven’t yet been tagged. Sebastian Chan, who is the Manager of Web Services at the museum, told us that Open Calais is being used to compliment the people-powered tagging they’ve had running for two years. “What Open Calais lets us do now is connect people, places and companies across our collection and has already revealed many new pathways through our dataset (navigating by designer or inventor is now much easier for example),” he said.

The automatically generated tags at right were created by the API for some swim wear designed by Speedo for the 1991 Australian swimming team that competed at the World Swimming Championships in Perth. Open Calais was correctly able to identify some important locations in the document — Perth where the competition took place, and Sydney where Speedo is based — as well as an important corporation (Speedo). It also picked up the name of the designer, and the name of the person who owned the suits before the museum.

However, as you can see, the API made some mistakes too — it classified “World Championships” as a company, and mistook the general text “international swimming organisation” as an actual organized body. It missed the actual organization (FINA) and probably should have picked up the MacRae Knitting Mills company, which was a predecessor to Speedo. Further, because Open Calais is built around people, places, and companies, general information about items may be lost on it. Tags that would be obvious to humans, such as swimming, swim wear, Olympics, or the year 1991, are beyond the scope of Open Calais.

“These errors and other like them reveal Open Calais’ history as Clearforest in the business world,” said Chan. “The rules it applies when parsing text as well as the entities that it is ‘aware’ of are rooted in the language of enterprise, finance and commerce.” On the other hand, according to Chan, the technology has already revealed “many new connections between objects,” even though it has so far been deployed only very sparingly across the collection.

Powerhouse’s use of Open Calais may be the first large scale deployment of the technology across a large public data set. It will be interesting to see the results as they evolve. “It is important to remember that there is no way that this structured data could be generated manually – the volume of legacy data is too great and the burden on curatorial and cataloguing staff would be too great,” reminded Chan.

About ReadWrite’s Editorial Process

The ReadWrite Editorial policy involves closely monitoring the tech industry for major developments, new product launches, AI breakthroughs, video game releases and other newsworthy events. Editors assign relevant stories to staff writers or freelance contributors with expertise in each particular topic area. Before publication, articles go through a rigorous round of editing for accuracy, clarity, and to ensure adherence to ReadWrite's style guidelines.

Get the biggest tech headlines of the day delivered to your inbox

    By signing up, you agree to our Terms and Privacy Policy. Unsubscribe anytime.

    Tech News

    Explore the latest in tech with our Tech News. We cut through the noise for concise, relevant updates, keeping you informed about the rapidly evolving tech landscape with curated content that separates signal from noise.

    In-Depth Tech Stories

    Explore tech impact in In-Depth Stories. Narrative data journalism offers comprehensive analyses, revealing stories behind data. Understand industry trends for a deeper perspective on tech's intricate relationships with society.

    Expert Reviews

    Empower decisions with Expert Reviews, merging industry expertise and insightful analysis. Delve into tech intricacies, get the best deals, and stay ahead with our trustworthy guide to navigating the ever-changing tech market.