A few months ago we told you about a paper by Microsoft researchers Erik Meijer and Gavin Bierman, which argued that non-relational data stores will need a standardized database query language in order to achieve widespread adoption.
Today a new potential standard for document databases (and possibly other NoSQL databases) was announced: UnQL.
UnQL
UnQL (“Unstructured Query Language”) comes from the Couchbase and SQLite teams with the explicit aim of creating a standard for NoSQL database queries. It’s a SQL-like syntax for manipulating document databases. Presumably, if implemented correctly, it could eventually be used across several databases, including CouchDB, Riak and MongoDB.
“The work we’ve done on UnQL has been very gratifying. UnQL stems from our belief that a common query language is necessary to drive NoSQL adoption in the same way SQL drove adoption in the relational database market. I look forward to continuing my work alongside SQLite to push this new language forward,” Couchbase CTO Damien Katz said in an announcement.
In the same announcement, Meijer was quoted saying: “One of the main arguments in our recent CACM article on coSQL was the industry needs a common query language and data model to feed the ecosystem for key-value stores. The UnQL language presents an important practical next step in this process. We are looking forward to working with Couchbase and other industry leaders in the noSQL space on taking the design to the next level.”
You can check out the spec here.
CQL
Last month DataStax announced Cassandra Query Language (CQL), its own SQL-like implementation for Apache Cassandra. Cassandra’s architecture is radically different from CouchDB’s, and DataStax isn’t positioning CQL as a standard across NoSQL databases. But thinking about this in the context of UnQL makes me wonder if it would be possible to make CQL and UnQL compatible at a very high level.
Alternative Languages for Hadoop/HBase
Apache Hive has been providing a SQL-like interface for Apache Hadoop and its data store HBase for years.
However, Cascading from Concurrent provides an alternative Java API for manipulating Hadoop data. It’s gaining some traction through a partnership with MapR (the distribution used in EMC’s Hadoop appliance) and has just closed a new round of funding. This suggests that there’s room for approaches to big data that aren’t SQL-like.