Unstructured data, for lack of a more poetic phrase, exists. In fact, there’s more of it now than at any time in history – the growth rate Forrester experts cite is 80% annually, and perhaps rising. All this year, analysts have been asking whether Microsoft would come to embrace unstructured data, or what some call “NoSQL databases.” By now, though, unstructured data has grown so large that it’s encompassing Microsoft.
So amid today’s stunning news that the company plans to integrate Hadoop support into Windows Server, even going so far as to consider adopting it as a role alongside Web server (IIS) and DNS server, there’s this structured database management system whose roadmap to general availability was announced this morning at the PASS Summit in Seattle.
Where data lives now
From the perspective of the corporate balance sheet, SQL Server is the jewel in the crown, and has consistently been so for over a decade. While the Windows and Windows Live division shook off a 6% annual income decline in fiscal 2010, the Server and Tools division made up for part of it with a 19% annual income gain. SQL Server is by no means in trouble as a product.
But SQL Server 2012 needs to distinguish itself today in a way the product line has never had to before: to defend the viability of structured data in a market where the fun and excitement have centered around “big data” in the cloud. “Big data” had to break several bonds in order to attain cloud scale, one of which was the tabular indexing format that an RDBMS like SQL Server depends upon. It would be nice if structured databases could get big without running into that performance drag.
Not all data has to be big, though; and in fact, critical business databases don’t have to scale up just because the cloud is available. That’s the case that Microsoft is trying to make today.
“There’s a variety of data that is living out there. The growth is really being driven by a shift from human-generated data to machine-generated data,” says Doug Leland, SQL Server’s general manager for product management, in an interview with RWW. “The first couple of decades of traditional database management systems dealt with data largely generated by transactional processing systems, oftentimes through human input. Now, we’re collecting data from a much broader range of sources – everything from sensors, RFID tags, videos, audio clips – which by nature are coming in at larger volumes, because there are more machines connected than people, and they’re coming in with different formats: structured, unstructured, semi-structured. The reality is, the volume is growing and the diversity of the data is growing.”
An iTunes for data?
Drawing an uninterrupted line around the new contours of this market, and including SQL Server completely therein, will be difficult for Microsoft (as it is for everyone else). Its strategy thus far appears to be to build a rough outline of what could later evolve into a kind of “data ecosystem,” analogous to the apps ecosystems that drive mobile devices and that have already inspired Microsoft to change course with the Windows 8 architecture. The storefront for that data marketplace was announced this time last year, with the idea of promoting Windows Azure as a presenter of public (or commercially published) data for consumption by business users however they will.
“However they will” has never been fleshed out, so today Microsoft is adding a new component to the mix. It’s an extension of the database’s new visualization tool, formerly code-named “Crescent,” into a console for browsing public data. And in keeping with the “browsing” theme, Microsoft is calling it Data Explorer.
“What Data Explorer allows a customer to do is easily discover data, either inside or outside the organization, including in the Data Marketplace,” explains Leland, “[as well as] bring that data in, integrate it, transform it, and merge it with other data sets to create new, valuable data sets that can either be shared across the organization or even published back up to the Data Marketplace for broad consumption.” Details of how licensing will work, not only for data consumption but also for modification and republishing, have yet to be determined.
In a company blog post this morning, Microsoft Corporate VP Ted Kummert, who leads the Business Platform Division – and who also delivered the keynote at the PASS Summit this morning – outlined his vision of the Data Marketplace this way: “Imagine if everyone, regardless of what type of data frameworks or platforms they use, could achieve deep business insights by amassing and analyzing enormous amounts of data not just from their own organization, but from all over the world using a global data marketplace. As futuristic as that may seem, we believe we are uniquely positioned to bring this vision to life. Our existing assets across the public and private cloud, as well as our commitment to providing choice and flexibility for our customers, make us the only vendor delivering on this vision today. And now more than ever, with our latest data platform innovations, we are ready to help our customers harness the currency of tomorrow.”