At ReadWriteWeb we’ve been following with interest the projects of the MIT SENSEable City Lab, which is producing some excellent analysis and visualizations of cellphone data in urban centers. MIT calls this data “digital footprints,” because it tracks the movement, and sometimes the actions, of people in an environment. Our most recent post looked at what cellphone data revealed about who attended the Obama Inauguration in January.
Recently we spoke to Andrea Vaccari, a research associate at the lab. He gave us a fascinating glimpse into the coming world of practical apps built on top of digital footprints.
ReadWriteWeb’s one slight criticism of the SENSEable City Lab projects so far has been that they haven’t yet reached the point of much practical use. So we asked Andrea to tell us how the research his group is doing might be used in the real world. In a follow-up post, we’ll explore a recent project: The New York City Waterfalls.
What Does MIT’s SENSEable City Lab Do?
Firstly, what are the goals of MIT’s SENSEable City Lab? Andrea told us that it aims to “inform the public,” to show vision, and to inspire.
We asked if there is potential for commercialization by companies and startups. Andrea replied that there is, but that we’re only at the early stage of understanding digital footprints. He said that for the first time we have access to these data sets – via cellphone data – and so we have to understand them first. This is MIT’s objective. But he explained that there will, in time, be practical uses for what they are researching – for example, organizing public transportation.
Uses of Real-Time Cellphone Data
Many companies are already using cellphone data to predict traffic and transportation – it’s the “first and most obvious problem to target” according to Andrea Vaccari.
However, he sees many other opportunities. Tourism is one – he mentioned MIT’s ‘World’s Eyes’ project in Spain, based on geo-tagged photos on Flickr. In this project, hundreds of photos were tagged by Flickr users as “party,” “arty,” and so on. This allowed the researchers to see where all the parties were happening, which areas were arty, and so forth. Andrea explained that the data could also identify the type of person who took each photo, so trends can be drawn from that – for example, a particular area of a city might be frequented by a particular nationality or age group.
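Andrea didn’t share any implementation details, but the underlying idea – bucketing geo-tagged photos into areas and counting their tags – can be sketched in a few lines of Python. The coordinates, tags, and grid size below are invented for illustration:

```python
from collections import Counter, defaultdict

# Hypothetical geo-tagged photo records: (latitude, longitude, tag).
photos = [
    (41.38, 2.17, "party"),
    (41.38, 2.17, "party"),
    (41.39, 2.15, "arty"),
    (41.40, 2.16, "arty"),
    (41.38, 2.17, "food"),
]

def grid_cell(lat, lon, size=0.01):
    """Bucket coordinates into a coarse grid cell."""
    return (round(lat / size), round(lon / size))

# Count tags per grid cell.
tags_by_area = defaultdict(Counter)
for lat, lon, tag in photos:
    tags_by_area[grid_cell(lat, lon)][tag] += 1

# The dominant tag in each cell hints at the area's character.
for cell, counts in tags_by_area.items():
    print(cell, counts.most_common(1)[0])
```

With enough photos, the dominant tag per cell gives a rough map of where the “party” or “arty” areas are – the same kind of picture the World’s Eyes visualizations painted.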
Another opportunity is security. Andrea said that people often think of the negative uses of aggregate cellphone activity (in particular the threat to privacy), but there are also positive security uses. For example, it can alert you to potential trouble spots, or it can help detect car accidents or fires by monitoring variations in cellphone activity.
These are some of the commercial applications of real-time cellphone data. As yet we haven’t seen many companies using such data, but one we reviewed recently was Sense Networks. It has a product called Citysense, an iPhone and BlackBerry app that lets people in San Francisco see the most happening nightlife in real time. Andrea noted a major difference between what the SENSEable City Lab does and what Sense Networks does: MIT collects aggregate data, whereas Sense Networks collects individual behavior from its users’ phones. A disadvantage of aggregate data is that it offers less detail; the advantages are that it covers all of the city, is “unbiased,” and doesn’t cost anything. Which leads to the question…
Where Does The Data Come From?
The aggregate data that MIT uses comes via telecoms companies. MIT has different partnerships; for example, it worked with AT&T on the NYC Waterfalls project (the subject of our next post on this topic). Andrea admitted that it’s easier for academic institutions like MIT to get access to that data – it would be harder for commercial companies, or even for the telecoms companies themselves, to use it. He hopes that in the future telecoms companies will be freer to give the data out (securely, of course!) and enable its re-use.
Andrea mentioned that currently very few groups around the world have access to this data from telecoms providers, and MIT has only had access to it for a couple of years. So it really is an emerging field of discovery.
The next step for the SENSEable City Lab is to put what they’ve done into a consistent framework, currently in development.
Andrea also said that they’re working more with semantic data – for example, comparing the frequency of a word in a document with its frequency across the total library of documents. Using this methodology, they can identify keywords, check whether each one appears in (for example) Wikipedia, and then create a semantic index with a set of tags. In other words, eventually they’ll be able to determine what cellphone activity means semantically.
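The word-frequency comparison Andrea describes resembles the classic TF-IDF weighting scheme: a word matters in a document if it is frequent there but rare across the library. A minimal sketch, with invented example documents (this is our illustration of the general technique, not the lab’s code):

```python
import math

def tf_idf(term, doc, corpus):
    """Score a term by its frequency in one document relative to
    its rarity across the whole corpus (classic TF-IDF)."""
    tf = doc.count(term) / len(doc)
    docs_with_term = sum(1 for d in corpus if term in d)
    idf = math.log(len(corpus) / (1 + docs_with_term))
    return tf * idf

docs = [
    "crowd gathered near the waterfall".split(),
    "traffic on the bridge".split(),
    "the concert drew a crowd".split(),
]

# "waterfall" appears in only one document, so it scores higher
# than a word like "the" that appears everywhere.
scores = {w: tf_idf(w, docs[0], docs) for w in docs[0]}
```

Words with high scores become candidate keywords, which could then be looked up in Wikipedia to build the semantic index Andrea mentions.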
A number of trends are converging that will make use of cellphone data very widespread, and soon. Andrea mentioned that in 2008, for the first time, more than 50% of the world’s population lived in cities. He noted too that the physical and digital worlds are merging. At ReadWriteWeb we’ve also been tracking the growing number of Mobile Web apps that use location data. For example, there are currently a lot of location-based social networking companies, including “social compass” service Loopt (our review), Nokia-owned Plazes (our review), Pelago’s Whrrl, ULocate, and GyPSii. Probably our favorite right now is mobile social network app Brightkite, which last year we named our Most Promising startup for 2009.
So expect real-time cellphone data to have many useful applications in the near future. Let us know what applications you hope to see emerge out of this.