Category Theory for Breakfast at Strange Loop

Keynotes are often viewed skeptically by technical audiences. Far too often, conference keynotes are all style and no substance. Larger conferences can be worse, with pay-to-play keynotes where audiences are expected to sit through glorified sales pitches. The Strange Loop conference in St. Louis, Missouri avoided that problem nicely today with its morning keynote by Erik Meijer, “Category Theory, Monads, and Duality in (Big) Data.”

Meijer is an architect in Microsoft’s SQL Server division. He was previously an associate professor at Utrecht University and an adjunct professor at the Oregon Graduate Institute. No doubt he’s been spending plenty of time lately thinking about the differences between SQL and NoSQL databases.

As Meijer’s title suggests, the keynote was anything but fluff. Meijer spent the allotted time on the differences between NoSQL and SQL so the audience could make “a more informed decision” about which technology to use in a given situation.

Don’t Call it NoSQL

Actually, Meijer might not like it being called NoSQL. He prefers calling it “coSQL.” Why? Meijer says that coSQL is “more friendly and more scientifically valid.” NoSQL, he says, is too negative.

Much of the keynote was given over to the differences between SQL and NoSQL. Meijer talked in detail about the limitations of SQL and the reasons programmers might prefer NoSQL databases. He poked fun at SQL join statements, saying that joins “make great exam questions” but are not so great to use in the real world. He also talked about the problems with NULL semantics in databases, saying that they represent “a full employment theorem for database folks.”

But after pointing out differences between the two, and some annoyances with SQL, Meijer noted that they’re “not that different” after all.

Two Sides of the Same Coin

The big difference, says Meijer, is that “all the arrows are reversed” when you look at diagrams of how data is stored and accessed in the database. A good example of what Meijer discussed, in more detail and with similar diagrams, can be found in his paper for ACM called “A co-Relational Model of Data for Large Shared Data Banks.”

In short, Meijer says that there’s a “duality” between SQL and coSQL, and each has its place. It’s really all about understanding what fits best with the task at hand. For instance, Meijer compared coSQL to referencing objects in RAM or using a URI to grab a resource on the Web. You may not know exactly what you’re getting when you request something using a URI. SQL data, by contrast, is highly structured, and you should know roughly what you’re getting back each time. That is, you may not know what album title to expect from a database query, but you know it’s going to be a string of text with a character limit of some kind.
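One way to picture the reversed arrows is a small Python sketch. It is an illustration only, not code from the talk: the album and track data, the key format, and both function names are made up for this example. In the SQL-style half, each track row points back to its album through a foreign key. In the coSQL-style half, the album holds keys that point forward to its tracks, much like following a URI.

```python
# SQL-style: child rows point to their parent through a foreign key.
# (All names and data here are hypothetical, chosen for illustration.)
albums = [{"album_id": 1, "title": "Kind of Blue"}]
tracks = [
    {"track_id": 10, "album_id": 1, "name": "So What"},
    {"track_id": 11, "album_id": 1, "name": "Blue in Green"},
]


def tracks_for_album_sql(album_id):
    """Find an album's tracks by matching the foreign key, as a join would."""
    return [t["name"] for t in tracks if t["album_id"] == album_id]


# coSQL-style: the parent holds keys that point to its children.
store = {
    "album:1": {"title": "Kind of Blue", "tracks": ["track:10", "track:11"]},
    "track:10": {"name": "So What"},
    "track:11": {"name": "Blue in Green"},
}


def tracks_for_album_cosql(album_key):
    """Find an album's tracks by following the keys stored on the album."""
    return [store[key]["name"] for key in store[album_key]["tracks"]]
```

Both functions return the same track names. The only difference is which side of the relationship stores the pointer, and therefore which direction the arrow runs.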

Meijer covered a lot of ground in a short time. The audience seemed to enjoy the talk, and the room was packed nearly to capacity. More than 900 people are registered for Strange Loop, which continues through tomorrow afternoon at the Downtown Hilton in St. Louis.
