How The MongoDB Database Learned To Scale

“MongoDB doesn’t scale.” The nagging criticism that the open-source database can’t handle larger data workloads, propelled by a viral video roughly five years ago, continues to haunt it to this day.

With last week’s announcement of version 3.0 of the database, however, it’s clear that MongoDB—both the open-source project and the company that supports it—hopes to finally put that lingering concern to rest. 

“MongoDB has always scaled; there are many, many examples,” said Kelly Stirman, MongoDB’s director of products. “But in prior releases, doing so required a level of expertise that not everyone has. With MongoDB 3.0 it’s a lot easier to scale your system.”

Why Scaling Matters

That’s a big deal for the company, which harbors big plans for taking on the established database market.

MongoDB, which offers services based on the open-source database, raised $80 million in funding in January. It acquired new technology in the form of database storage engine firm WiredTiger and recently hired a new chief revenue officer.

And now it also has a shiny new 3.0 release of the MongoDB database, which it announced last Tuesday and will launch in March.

One major upgrade lies in the way Mongo manages concurrency. That’s essentially how a database ensures that operations like reading and writing data values don’t interfere with one another when executed at the same time.

“If you think about cars getting over a bridge, concurrency is like a tollbooth,” Stirman said. “If you only have one tollbooth, even if that booth is really efficient, you can only get so many cars over the bridge.”

Concurrency issues appear to have been a big deal for MongoDB. Some large customers complained that the database would “lock up” for fractions of a second while writing data, creating major traffic jams, according to a recent report in the online publication The Information.

MongoDB 3.0 has improved concurrency control—more tollbooths, you might say. In a nutshell, this is what Mongo hopes will deliver the simple scaling results developers want. 

“Now concurrency is managed at the document level, which dramatically improves throughput,” Stirman said. “More cars can travel over the bridge at the same time. We expect most people to see a 7x to 10x improvement over Mongo 2.6.”
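
To make the tollbooth picture concrete, here is a toy Python model. It is only a sketch and not how MongoDB itself is implemented. It assumes a hypothetical batch of writes, each tagged with a made-up document ID and a duration in milliseconds, and it assumes there are always enough workers to run unrelated writes in parallel. With one lock for the whole database, every write queues behind the previous one. With one lock per document, only writes to the same document have to wait for each other.

```python
from collections import defaultdict


def elapsed_with_global_lock(writes):
    """One tollbooth: every write waits for the one before it."""
    return sum(duration for _doc_id, duration in writes)


def elapsed_with_document_locks(writes):
    """One tollbooth per document: only writes to the same document queue up.

    Assumes unlimited parallel workers, so the total time is set by the
    busiest document.
    """
    per_document = defaultdict(int)
    for doc_id, duration in writes:
        per_document[doc_id] += duration
    return max(per_document.values(), default=0)


# Hypothetical workload: (document id, write duration in milliseconds).
writes = [("a", 5), ("b", 5), ("c", 5), ("a", 5)]

global_ms = elapsed_with_global_lock(writes)
document_ms = elapsed_with_document_locks(writes)

print(f"single lock: {global_ms} ms, per-document locks: {document_ms} ms")
```

In this made-up workload the two writes to document "a" still wait for each other, so finer-grained locking helps most when writes are spread across many documents.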

Database 101

The scale critique of MongoDB touches on a larger issue. The database is a relatively new challenger in a vast database market that hasn’t changed much in decades.

MongoDB’s big selling points are speed and simplicity, but it won’t make much headway if it can’t convince companies that those advantages warrant turning away from standard relational databases of the type that turned Oracle into a giant.

Databases, of course, are simply repositories of information, indexed in ways that organize stored data and make it accessible. MongoDB and other new database approaches basically threaten an established order with decades of history and billions of dollars behind it. Needless to say, controversy goes with the territory.

“Every piece of the technology stack except [the database] has been reinvented several times in these past 40 years,” said Stirman. “But the relational database has more inertia.”

Traditional relational databases store data in a rows-and-columns model, conceptually similar to the rows and columns you might recognize from Microsoft Excel spreadsheets. MySQL is a reputable and popular open-source relational database.

MongoDB and other so-called “NoSQL” alternatives to relational databases, by contrast, organize data in different ways. Mongo, for instance, is a “document database” with a looser structure that allows users to store data even if it isn’t consistent or highly structured.

For example, if you had a contact list with multiple phone numbers for two people, a physical address for another, and no last name for a fourth, a document database could store the relevant data without adding empty rows and columns for what’s missing.
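
The contact list above can be sketched in a few lines of Python, using plain dictionaries to stand in for documents. The names, numbers, and field names are invented for illustration, and nothing here talks to a real database. The sketch also counts how many cells would sit empty if the same four contacts were forced into a single flat table with one column per field, which is an assumption about the table layout rather than the only way to model it relationally.

```python
# Each contact is its own "document" and carries only the fields it has.
# All names and values below are hypothetical.
contacts = [
    {"first_name": "Ada", "last_name": "Lovelace", "phones": ["555-0100", "555-0101"]},
    {"first_name": "Grace", "last_name": "Hopper", "phones": ["555-0102", "555-0103"]},
    {"first_name": "Alan", "last_name": "Turing", "address": "1 Example Street"},
    {"first_name": "Prince"},
]

# A single flat table would need one column for every field that appears anywhere.
columns = sorted({field for contact in contacts for field in contact})

# Count the cells that table would have to leave empty.
empty_cells = sum(
    1 for contact in contacts for column in columns if column not in contact
)

print("columns a flat table would need:", columns)
print("cells left empty in that table:", empty_cells)
```

The documents themselves store nothing for the missing fields; the empty cells only appear once the data is squeezed into a fixed set of columns.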

Taking On The Big Boys

This model, proponents say, explains why MongoDB is faster than relational databases. In MongoDB, individual datapoints are connected by “pointers” to related datapoints. That makes it more efficient to gather up a collection of relevant data than if you had to search all rows and columns of every table in a database. But it also risks slowing down under larger data loads.

Of course, neither type of database is suitable for every problem. But many businesses are used to thinking of MySQL as a one-size-fits-all solution. MongoDB’s goal is to demonstrate to clients why a document database might be a better fit for particular business needs.

Mongo 3.0 is an opportunity for the company to show its uses for everything from Internet of Things applications to messaging apps. Anything that needs speed, and now, Stirman hypothesizes, anything that needs to scale.

“This release expands the kinds of applications people will view as a good fit for MongoDB,” he said. 

Photo by Garrett Heath
