In a big win for MongoDB and its sponsor company 10gen, Craigslist is migrating its archives to MongoDB. In an interview on the MongoDB blog, Jeremy Zawodny of Craigslist explains that the company is moving around two billion documents into a MongoDB cluster.
Craigslist is using MongoDB only for its archive of deleted and expired posts, not for the posts that are live on the site. However, in a video presentation Zawodny made it clear that the live site does use the archive: it’s not just “cold storage” for old posts.
From the introduction to the interview, here’s a rundown of the problems Craigslist needed a new database to solve:
Craigslist has kept every post anyone has ever made in a large MySQL cluster. A few months ago, they began looking for alternatives: schema changes were taking forever (Craigslist’s schema has changed a couple times since 1995) and it wasn’t really relational information. They wanted to be able to add new machines without downtime (which sharding provides) and route around dead machines without clients failing (which replica sets provide), so MongoDB was a very strong candidate. After looking into a few of the most popular non-relational database systems, they decided to go with MongoDB.
For those who want more, Zawodny explains in this video why the company decided to add a NoSQL database, why it chose MongoDB, and what challenges he has faced during the migration.