Let’s say you’ve got a whole pile of free-form text content and you want to determine what geographic locations are discussed in it. And let’s say you want to do that while you’re rushing to respond to a catastrophic earthquake. What would you do?

Disaster response network Ushahidi has been using software called Yahoo Placemaker for this function, as many other developers do. But this week the organization announced that it is adopting an open source alternative called GeoDict. GeoDict, which was created by ReadWriteWeb contributor Pete Warden, detects place names in text, standardizes them and returns coordinates for 2.7 million locations around the world. As a part of the deal, Warden has officially joined Ushahidi parent organization SwiftRiver.

We wrote about how to use both Yahoo Placemaker and GeoDict here in October. Warden says that SwiftRiver will be launching a hosted GeoDict API soon.

He also says he’s hard at work expanding the list of places GeoDict can recognize. Why not just use a larger online database? “One key requirement for Ushahidi is that the service works even when an internet connection is down, for processing documents, etc.,” Warden says. “There may be SMS messages still coming in for example, even when the web is inaccessible. So having a service that can be installed locally on a machine is a big deal.”
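To make the offline idea concrete, here is a minimal sketch of how a locally installed place-name lookup can work without any network call. It is not GeoDict's actual code or API: the `GAZETTEER` table, the `find_places` function and the approximate coordinates are all hypothetical, chosen only to illustrate matching text against a dictionary of known locations stored on the machine.

```python
import re

# Hypothetical miniature gazetteer, keyed by lowercase place name.
# Coordinates are approximate and for illustration only; a real
# deployment would load millions of entries from a local database.
GAZETTEER = {
    "port-au-prince": ("Port-au-Prince", 18.54, -72.34),
    "nairobi": ("Nairobi", -1.29, 36.82),
    "san francisco": ("San Francisco", 37.77, -122.42),
}

# Longest place name in the table, measured in words.
MAX_WORDS = max(len(name.split()) for name in GAZETTEER)


def find_places(text):
    """Return standardized names and coordinates for known places in text."""
    words = re.findall(r"[\w'-]+", text)
    found = []
    i = 0
    while i < len(words):
        matched = 0
        # Try the longest candidate phrase first, so that "San Francisco"
        # is matched as one place rather than word by word.
        for n in range(min(MAX_WORDS, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n]).lower()
            if candidate in GAZETTEER:
                name, lat, lon = GAZETTEER[candidate]
                found.append({"name": name, "lat": lat, "lon": lon})
                matched = n
                break
        # Skip past a matched phrase, otherwise move on one word.
        i += matched if matched else 1
    return found


if __name__ == "__main__":
    message = "Bridge down near port-au-prince, team arriving from San Francisco"
    for place in find_places(message):
        print(place)
```

Because everything lives in a local table, a lookup like this keeps working on incoming SMS messages even when the web is unreachable, which is the requirement Warden describes.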

Ushahidi describes itself as “a non-profit tech company that develops free and open source software for information collection, visualization and interactive mapping.” The name is from the Swahili word for “testimony” or “witness.” According to the organization’s Wikipedia entry, it “created a website (http://legacy.ushahidi.com) in the aftermath of Kenya’s disputed 2007 presidential election that collected eyewitness reports of violence sent in by email and text-message and placed them on a Google map.”

It’s notable that Kenyan political corruption and violence also propelled Wikileaks to the forefront of many people’s minds internationally, after that organization won an Amnesty International media award in 2009 for its work regarding Kenya.

Two weeks ago, Ushahidi released a new mobile check-in app, à la Foursquare, called Crowdmap:CI.

As ReadWriteWeb’s Audrey Watters wrote earlier this month:

Crowdmap is an easy-to-install, hosted version of Ushahidi – the equivalent of WordPress.com blogs for WordPress, perhaps.

The new tool Crowdmap:CI (or Crowdmap Checkins) will function on both Ushahidi and Crowdmap and will allow users to create ad-hoc check-in communities, complete with mobile apps and web portals. Crowdmap:CI is designed to further simplify the creation of annotated location points. As Jon Gosier writes in the blog post announcing the new product, “Sometimes users just want to drop quick notes that represent data points allowing them to enter details later. For instance: the locations of wells while touring a rural village, or potholes around a metropolitan city, or simply dropping pins while on a vacation for the memories of where to return to. Crowdmap:CI is an attempt to make this data entry process quicker, allowing users to focus on location first, and everything else later.”

About its relationship with GeoDict, SwiftRiver posted the following:

You’ll see us contribute our staff, time and resources to the development of GeoDict (because it’s an open source project aligned with our greater mission). GeoDict’s community will also actively contribute back to that code, and hopefully they’ll feel welcome enough that they’ll also contribute to SwiftRiver and Ushahidi code base as well.

GeoDict will be fully integrated into the Swift Web Services family of API products which we offer as both free and paid services, but also as open-source code for anyone out there to use on their own terms.

About Warden

By adding Warden to its team, SwiftRiver gains one of the most interesting independent data scientists on the web. A former Apple engineer, Warden hit the news last February when he announced plans to release Facebook data from hundreds of millions of users to the broader research community. Legal pressure from Facebook quickly ended those plans, but Warden performed some analysis on his own.

“There’s so many interesting ways to slice the data – especially as I’m starting to get changes over time,” he told ReadWriteWeb.

“I’m also trying to map out political networks in aggregate; how polarized the fans of particular politicians are – so how likely a Sarah Palin fan is to have any friends who are fans of Obama, and how that varies with location too. One of my favorite results is that Texans are more likely to be fans of the Dallas Cowboys than God…

“Nobody thinks about how much valuable information they’re generating just by friending people and fanning pages. It’s like we’re constantly voting in a hundred different ways every day. And I’m a starry-eyed believer that we’ll be able to change the world for the better using that neglected information. It’s like an x-ray for the whole country – we can see all sorts of hidden details of who we’re friends with, where we live, what we like.”

Warden frequently cites the use of GIS software in detecting inequitable distribution of public services between areas inhabited by different demographic groups as a big inspiration for his work.

Warden’s other work includes OpenHeatMap, an open source service that lets journalists easily create heatmap displays of data, and the forthcoming O’Reilly book Data Source Handbook: A Guide to Public Data, due out next month.

We’re honored to count Pete among our team here at ReadWriteWeb. You can read his articles via this link.