Your relational database is great. Really, it is. But it’s not going to help you with your Internet of Things project. Really, it isn’t.
At least, not according to a study by Machina Research, which looks at the management of data in the coming network of billions of smart devices. It finds that while relational databases will play a role for “processing structured, highly uniform data sets,” NoSQL databases are critical for the far bigger task of “managing more heterogeneous data generated by millions and millions of sensors, devices and gateways.”
If you’re a developer, in other words, you need to add a NoSQL database or two to your arsenal.
Why You Need NoSQL
Despite the common label, so-called “NoSQL” databases are more different than similar. Even so, NoSQL databases in general don’t enforce a strict schema and hence allow for highly flexible data modeling, not to mention dramatically better scalability than even the heftiest of relational database management systems (RDBMSes).
Both aspects are critical for Internet of Things applications, as Machina’s report makes clear. While RDBMSes will continue to play a part, the more disruptive aspect of the Internet of Things is all NoSQL:
The traditional relational database management systems will continue to have a role in the Internet of Things when processing structured, highly uniform data sets, generated from a vast number of enterprise IT systems and where this data is managed in a relatively isolated manner. When it comes to managing more heterogeneous data generated by millions and millions of sensors, devices and gateways, each with their own data structures and potentially becoming connected and integrated over the course of many years, databases will require new levels of flexibility, agility and scalability. In this environment, NoSQL databases are proving their value.
For the past 30 years enterprise data has been fairly predictable. Companies would store customer data in the rows and columns of a customer-relationship management system. Or they might track components for the widgets they manufacture and sell in the tables of their enterprise resource-planning system.
But data in the Internet of Things is different because it is almost by definition not completely known in advance. The market is moving so fast that its systems must be flexible, allowing the introduction of new sensors/devices and the data they emit:
Data generated from an exponentially growing number of diverse sensors, devices, applications, and things will be accompanied by a growing diversity in the structure and scale of that data—and more and more sources of additional data ranging from data sourced from corporate systems to crowdsourced data will need to be combined with this data.
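To make the flexibility argument concrete, here is a minimal Python sketch. It is not taken from the Machina report: the device names, field names, and the tiny in-memory "document store" are all hypothetical, and SQLite stands in for a relational database only because it ships with Python. It shows how a table designed for one sensor type rejects readings shaped differently, while a document-style store accepts each reading as it arrives.

```python
import json
import sqlite3

# Hypothetical readings from three device types; all field names are illustrative.
readings = [
    {"device_id": "thermo-1", "type": "thermostat", "temp_c": 21.5},
    {"device_id": "lock-7", "type": "door_lock", "locked": True, "battery_pct": 88},
    {"device_id": "cam-3", "type": "camera", "motion": True, "zones": ["porch", "drive"]},
]

# Relational approach: the columns are fixed before any data arrives.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thermostat_readings (device_id TEXT, temp_c REAL)")


def insert_fixed(conn, reading):
    """Try to insert into the fixed-schema table; return False if the reading does not fit."""
    try:
        conn.execute(
            "INSERT INTO thermostat_readings (device_id, temp_c) "
            "VALUES (:device_id, :temp_c)",
            reading,
        )
        return True
    except sqlite3.Error:
        # Raised when the reading lacks a column the table requires (here, temp_c).
        return False


# Document-style approach (a toy stand-in for a NoSQL store):
# each reading is kept whole as JSON, whatever its shape.
documents = [json.dumps(r) for r in readings]


def find(documents, **criteria):
    """Return decoded documents whose fields equal every given criterion."""
    decoded = (json.loads(d) for d in documents)
    return [doc for doc in decoded if all(doc.get(k) == v for k, v in criteria.items())]


fixed_results = [insert_fixed(conn, r) for r in readings]
locks = find(documents, type="door_lock")
```

In this sketch only the thermostat reading fits the table, while all three readings land in the document list and remain queryable by whichever fields they happen to carry. A real relational design could add tables or nullable columns for each new device type; the point is only that every new shape requires a schema change first.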
But heterogeneous, semi-structured data isn’t the only thing driving the need for NoSQL. According to DataStax CTO Jonathan Ellis, “Relational databases like Oracle are great for dealing with data from a single company or department, but cannot provide the scale or availability that a database designed for the cloud like [NoSQL] Cassandra can.”
Raising A New Generation On NoSQL
Scale obviously matters in the Internet of Things. No article on the subject is complete without some mention of “50 billion devices” or some other heady forecast.
But the more interesting number, for me, is a matter of millions. As device counts balloon, the ranks of NoSQL-savvy developers must swell to accommodate the increasing demand for applications. According to VisionMobile estimates, there are just 300,000 Internet-of-Things developers today, but that number will explode by 2020.
Of the various things holding back IoT’s potential, including a lack of standards and connectivity issues, one of the biggest is finding enough NoSQL-savvy developers. Though 451 Research tracks tens of thousands of developers listing NoSQL on their LinkedIn profiles, the market needs hundreds of thousands.
Of course, not all Internet-of-Things data is necessarily NoSQL-ready, leaving plenty of room for developers versed in SQL syntax. For example, Revolv, a smart-home platform company acquired by Nest, turned from MongoDB (a NoSQL database and company for which I used to work) to DynamoDB (NoSQL) and ultimately to Postgres (RDBMS) because its “data is relational” in nature, according to Matt Butcher, a Revolv developer.
When the data is inherently relational, as it was in that case, it would be silly to dabble in a non-relational database.
Still, Butcher goes on to note that he’s “more inclined to choose the tool that naturally matches a data model. (This means, of course, that I do the data model before selecting the database.)”
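As a rough illustration of what “relational” data can look like in a smart-home setting, consider the Python sketch below. It is not Revolv’s actual schema: the `homes` and `devices` tables, their columns, and the sample rows are assumptions made up for this example, and SQLite stands in for Postgres so the code runs without a server. When records are defined largely by their relationships to one another, a join answers the question directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: every device belongs to exactly one home.
conn.executescript("""
    CREATE TABLE homes (
        id INTEGER PRIMARY KEY,
        owner TEXT NOT NULL
    );
    CREATE TABLE devices (
        id INTEGER PRIMARY KEY,
        home_id INTEGER NOT NULL REFERENCES homes(id),
        kind TEXT NOT NULL
    );
""")

conn.executemany(
    "INSERT INTO homes (id, owner) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)
conn.executemany(
    "INSERT INTO devices (home_id, kind) VALUES (?, ?)",
    [(1, "thermostat"), (1, "door_lock"), (2, "thermostat")],
)


def devices_per_owner(conn):
    """Count devices for each home owner by joining the two tables."""
    rows = conn.execute("""
        SELECT h.owner, COUNT(d.id)
        FROM homes AS h
        JOIN devices AS d ON d.home_id = h.id
        GROUP BY h.owner
        ORDER BY h.owner
    """)
    return dict(rows.fetchall())


counts = devices_per_owner(conn)
```

Modeling the data first, as Butcher suggests, is what reveals whether relationships like these dominate. If they do, a relational database is the natural fit.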
Square Peg, Round Hole
And therein lies the problem that will likely drive more developers to leave their comfortable RDBMS rut and try NoSQL for Internet of Things applications.
As the Machina report summarizes, “Responding to the Internet of Things with relational database management systems is an option but presents a limiting factor which in time will become a significant obstacle to realization of the full opportunities available from all types of data.”
Already, the RDBMS world is trying to refashion itself for the Internet of Things, with companies like DeepDB offering alternative storage engines for MySQL or other RDBMSes because “Traditional MySQL databases hit performance walls long before they are able to scale to meet this new demand.”
But ultimately the Internet of Things is more than a matter of scale. Its core challenge involves flexible data modeling, both for the devices and services available today and those that will be coming tomorrow. It is this need for flexibility that most challenges developers, and demands they embrace NoSQL.
Lead image of an automated home courtesy of Sony