Searching for Sadness in New York: Is the Foursquare API Living Up to Its Potential?

As explained in this blog post, Foursquare needed a way for its business staff to run reports based on its data without slowing down production servers and without learning technologies such as Scala and MongoDB. The company decided to make its data available to business staff through a Hadoop cluster hosted by Amazon Web Services. Foursquare’s data miners could then query it using Hive, which provides a SQL-like query language for Hadoop.

As a proof of concept, the company has produced a report on the rudest cities in the world, based on the number of tips that contain profanity. That's pretty cool (apart from the assumption that profanity use equals rudeness), but it makes me realize just how underutilized geolocation APIs are.
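To make the idea concrete, here is a rough sketch of the kind of aggregation such a report implies: for each city, the share of tips containing at least one word from a profanity list. This is plain Python rather than Hive, and it is not Foursquare's actual query; the word list, the sample tips, and the name `rudeness_by_city` are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical word list and sample tips; not Foursquare's data or lexicon.
PROFANITY = {"damn", "hell", "crap"}


def rudeness_by_city(tips):
    """Return {city: fraction of tips containing at least one listed word}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for city, text in tips:
        totals[city] += 1
        words = set(re.findall(r"[a-z']+", text.lower()))
        if words & PROFANITY:
            flagged[city] += 1
    return {city: flagged[city] / totals[city] for city in totals}


sample_tips = [
    ("New York", "Damn good bagels"),
    ("New York", "Try the lox"),
    ("Manchester", "The queue is hell on Fridays"),
    ("Manchester", "Crap coffee, great cake"),
]

rates = rudeness_by_city(sample_tips)
# Print cities from highest to lowest share of flagged tips.
for city, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{city}: {rate:.0%}")
```

Normalizing by the number of tips per city, rather than using a raw count, keeps large cities from topping the list simply because they have more tips.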

Here are the results of Foursquare’s profanity-mining:

And here’s how Foursquare’s data analysis system works:

Some more practical applications, from a business standpoint, for data mining staff might include determining:

Which venues are fakes or duplicates (so we can delete them)? What areas of the country are drawn to which kinds of venues (so we can help them promote themselves)? And what are the demographics of our users in Belgium (so we can surface useful information)?

Of course, this sort of check-in data is solely in the hands of Foursquare’s internal users. But it makes me wonder whether you could pull together information like this through the Foursquare API if you build your own data warehouse for analysis.
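As a thought experiment, a do-it-yourself warehouse could start as small as a local SQLite table that accumulates whatever an application pulls from the API over time. The sketch below assumes a hypothetical `fetch_recent_tips` function that stands in for a real API call; the table layout and sample rows are invented for illustration and do not reflect Foursquare's actual response format.

```python
import sqlite3


def fetch_recent_tips():
    """Hypothetical stand-in for an API call.

    Returns rows of (tip_id, venue, city, text, created_at).
    """
    return [
        ("t1", "Cafe A", "New York", "Great espresso", "2010-06-01T09:00:00"),
        ("t2", "Bar B", "New York", "Too loud", "2010-06-01T22:30:00"),
        ("t3", "Cafe C", "Brussels", "Lovely waffles", "2010-06-02T10:15:00"),
    ]


def open_warehouse(path=":memory:"):
    """Open (or create) the local store; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tips (
               tip_id TEXT PRIMARY KEY,
               venue TEXT, city TEXT, text TEXT, created_at TEXT)"""
    )
    return conn


def store_tips(conn, tips):
    # INSERT OR IGNORE keeps repeated polls from duplicating rows.
    conn.executemany("INSERT OR IGNORE INTO tips VALUES (?, ?, ?, ?, ?)", tips)
    conn.commit()


def tips_per_city(conn):
    rows = conn.execute(
        "SELECT city, COUNT(*) FROM tips GROUP BY city ORDER BY city"
    )
    return dict(rows.fetchall())


conn = open_warehouse()
store_tips(conn, fetch_recent_tips())
store_tips(conn, fetch_recent_tips())  # second poll: same tips, no duplicates
counts = tips_per_city(conn)
print(counts)
```

Keying the table on the tip ID means the collector can poll as often as it likes without inflating the counts, which matters if the only way to build history is to keep asking for recent data.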

I wonder what services like Fourwhere (which we covered here) could learn by caching all the data retrieved from various location APIs and running sentiment analysis on it. What could MisoTrendy (coverage) tell us about a venue based on long-term trend patterns? Is there something in Foursquare’s terms of service that prevents people from doing this? I guess we’re back to that old question: what would you do with the massive data sets produced by persistent location tracking?
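For a sense of what sentiment analysis over cached tips might look like at its very simplest, here is a toy sketch that scores each tip against a tiny word list and ranks venues in one city. The lexicon, the sample tips, and the function names are assumptions for illustration; a real system would need a far better model than word counting.

```python
import re

# Toy lexicon, an assumption for illustration only.
SAD_WORDS = {"sad", "lonely", "miss", "cry", "closed"}
HAPPY_WORDS = {"love", "great", "happy", "best"}


def sadness_score(text):
    """Count sad words minus happy words in a single tip."""
    words = re.findall(r"[a-z']+", text.lower())
    sad = sum(w in SAD_WORDS for w in words)
    happy = sum(w in HAPPY_WORDS for w in words)
    return sad - happy


def saddest_venues(tips, city):
    """Return (venue, total score) pairs for one city, saddest first."""
    totals = {}
    for venue, tip_city, text in tips:
        if tip_city == city:
            totals[venue] = totals.get(venue, 0) + sadness_score(text)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


sample = [
    ("Diner X", "New York", "So sad this place closed early"),
    ("Diner X", "New York", "I miss the old menu"),
    ("Park Y", "New York", "Best view, love it"),
    ("Cafe Z", "Brussels", "Lonely corner table"),
]

ranking = saddest_venues(sample, "New York")
for venue, score in ranking:
    print(venue, score)
```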

Update: MisoTrendy’s Andrew Ferenci explains the limitations:

1. You would not be able to pull and process historical data like 4SQ did from their production databases and log files (only real-time data; hard for a small web app to run queries that generate 1bn records)
2. If you use something like Google App Engine you have lots of limitations on DB and backend processing (only 80-90K hits before you have to start paying)
3. Most third-party applications would only be able to pull real-time data from the 4SQ API, so no backend processing.

However, if you decided you wanted to create an application to pull similar data starting today, you would definitely be able to, but not with the same historical breadth.

Technically, it's all feasible with some limitations. MisoTrendy was built using Google App Engine with a Python backend. There are limitations for the DB and backend processing because you cannot use Ruby on Rails with this setup.

This feels like it could be a first step towards accomplishing what was described in the opening lines of the Headmap Manifesto:

there are notes in boxes that are empty

every room has an accessible history

every place has emotional attachments you can open and


you can search for sadness in new york
