You know it's flu season when everyone on Twitter is talking about runny noses and germ-filled subway rides, and obsessing over the effectiveness of their Purell.
While potentially annoying, these kinds of tweets may help predict and monitor how illnesses like influenza are moving across the country. A new study published in the Journal of Medical Internet Research (yes, that's real) from Brigham Young University's Computer Science department has been able to track illness trends by analyzing the location data of tweets referencing illness.
Location, Location, Location...
The BYU team did not track any particular illness during this trial, instead concentrating on location data. The ultimate goal of the study was to encourage the creation of a program that health organizations like the Centers for Disease Control and Prevention (CDC), the National Institutes of Health (NIH) and even smaller city-based organizations could use to track the progression of any disease across the nation. This would give health officials a heads-up if an outbreak is headed their way.
Other online services track disease too, such as Google Flu Trends, which uses search terms and results, as well as data from the CDC, to specifically track the flu. Another site, MappyHealth, uses location data from Twitter to track illnesses ranging from pertussis to STDs. While these sites are similar to the work done at BYU, the research team mainly sees them as validation of what it originally thought was possible with tweet tracking.
Associate Professor of Computer Science at BYU and lead researcher Christophe Giraud-Carrier said in an email that the method his research team developed can pick up epidemics up to two weeks before the CDC can. "That kind of lead time would greatly help put the resources where they are most needed and in a more timely fashion."
Those two weeks can make a big difference. The influenza vaccine takes about two weeks to be effective at preventing the flu. If health officials had that kind of lead time, vaccinations could be targeted at areas that appear to be outbreak locations.
With all of the different types of social media, why choose Twitter? Tweets are public by default, which makes them easier to monitor. The site gives independent researchers a way to monitor users without having to engage them, ask them to remember anything or have them take a test. No one was asked to turn on the location option for their tweets, so the location data that is gathered should be an accurate portrayal of the country.
The terms and conditions for the site's application programming interface (API) also made it very easy for the research team to follow tweets en masse: the team tracked 24 million tweets from 10 million different users.
Location data wasn't gathered from tweets themselves. Only about 2% of users actually tagged their location in their tweets. Researchers found a better option to decipher location: user profiles. About 17% of the tweets monitored came from users who provided location data on their profile. While some users had fake locations like "a cube world in Minecraft," the provided location data was accurate 88% of the time. It was also useful because it yielded a distribution of geolocated tweets across the country that correlated with the distribution of the overall U.S. population.
Of the tweets tracked, 15% contained specific location data. This may not seem like much, but Giraud-Carrier explained that this percentage is actually very good: "15% is indeed relatively small. But with over 312 million Americans, it still gives us a lot of people to look at/listen to. The fact that the distribution is consistent with the population is also encouraging. We do not want to overstate what is possible, but there seems to be a critical mass here that should allow useful things to be done."
Giraud-Carrier wants the study to serve as proof of the quality and value of geolocation data on Twitter, adding that social media should be used not only as a means of prevention, but also intervention.