It is quite a remarkable feeling to watch as the pieces fall into place and the picture, anticipated for so long, is finally revealed in all its splendour. As with any jigsaw that lacked a guiding picture on the box, the final result is that inevitable mix of vindication and surprise. Some areas of the picture are wholly unexpected, some look as one predicted, while across most of the image there are new facets to explore in familiar places, anticipated scenes to compare with long-held expectations, and assumptions to challenge or validate.
Recent advances in the business of cloud computing form just such a picture and reach out to encompass previously unrelated aspects of Web 2.0, the semantic web, platform computing, software as a service (SaaS), and the economics of disruption.
Not merely some game of buzzword bingo on an unprecedented scale, cloud computing is coming into its own. It is becoming increasingly easy to see the opportunities for a significant shift in the way we access computational resources, and to recognize that the walls separating organizations from their peers, partners, competitors, and customers will become ever more permeable to the flow of data on which those distant machines will compute.
There is much to understand that has already been established in related fields, and there are many ideas unique to this space still to discover. One early challenge is to carve a distinct niche for the place toward which we are moving so rapidly. Far more than "just" a cloud, it is an evolutionary cycle beyond the playful flippancy that diminishes so many of Web 2.0's poster children, and it is difficult to relate to mainstream misconceptions of the semantic web's complexity. Yet this new place is greater than the sum of its parts. So do we sustain the already ephemeral notion of cloud computing? Do we appropriate the "next big thing" label of Web 3.0? Or do we need a fresh attitude towards business computing's apparently insatiable desire to apply labels?
First, though, let us consider the shape of this thing that is taking on more substance with each passing day.
Reporting on last month's Web 2.0 Summit in San Francisco, CNET's Dan Farber notes that "the cloud was omnipresent," before closing his report with the observation that "cloud computing won't be very compelling without what is variously called Web 3.0 or the semantic web."
For too long, the emphasis in cloud computing circles has been almost exclusively on the provision of rapidly scalable and ad hoc remote computing on top of cost-effective commodity hardware. The cloud play by Salesforce, Amazon's EC2, and the rest has been dominated by the implicit assumption that these cloud-based resources are an extension of the corporate data center; a way to simply reduce the costs of enterprise computing.
There is value down this road, but there are bigger opportunities.
Nick Carr is among those who fear that a small number of players may come to dominate the provision of cloud resources. He outlines many of these arguments in his latest book, The Big Switch, and more recently had an interesting discussion with Tim O'Reilly on the topic. Justin Leavesley shares some of Talis' views on the economics behind all of this over on Nodalities, broadly agreeing with Tim O'Reilly:
"It's pretty clear that utility cloud computing is highly capital intensive so it should come as no surprise that there are powerful economies of scale to be had. But the bottom line is that you are talking about plant and power. These are rival goods, scarce resources that are created and consumed. This is not different from many utility industries with one exception: the distribution network has global reach, already exists and is very cheap compared to existing utility distribution networks. It is a lot cheaper to access a computing resource on the other side of the planet than it is to send electricity or gas across the globe... [So] what is to stop economies of scale turning this into a global natural monopoly?
"Actually, unless there are some large network effects, quite a lot stops single companies ruling entire industries. For a start, without network effects, economies of scale tend to run out: the curve is usually U-shaped. Telecoms, gas, rail companies have strong network effects from their infrastructure -- it makes little sense to have duplicate rail networks or gas networks in a country. Utility computing does not have this advantage because the distribution network is not owned by them."
Continuing the conversation, Carr nicely summarizes the widely held perception of cloud computing:
"The history of computing has been a history of falling prices (and consequently expanding uses). But the arrival of cloud computing -- which transforms computer processing, data storage, and software applications into utilities served up by central plants -- marks a fundamental change in the economics of computing. It pushes down the price and expands the availability of computing in a way that effectively removes, or at least radically diminishes, capacity constraints on users. A PC suddenly becomes a terminal through which you can access and manipulate a mammoth computer that literally expands to meet your needs. What used to be hard or even impossible suddenly becomes easy."
This is quite true, but it further entrenches the misapprehension that the cloud is little more than an adjunct to the corporate data center, a misapprehension that we shall challenge in a moment.
First, though, there is a growing recognition that today's market leaders will inevitably need to become more interoperable if this business segment, and they, are to grow. The proprietary nature of their offerings today may allow them to innovate ahead of the standards process (which will be shaped in large part by the lessons they learn), and the relatively high cost of switching to a competitor today may give each the critical mass on which to invest and grow; but the characteristics of the current market are clearly the characteristics of a nascent market: computing's new Wild West. As so often before, standardization, true competition, mainstream adoption, and commoditization will all follow as we move towards phases 2 and 3 of Gartner analyst Thomas Bittman's intriguing analysis of the "evolution of the cloud computing market." Similarly, Erica Naone offered a useful overview of cloud computing's open-source component in Technology Review last month. None of the projects she covers are a significant challenge to Amazon's EC2, Microsoft's Azure, Salesforce's Force.com or Google's App Engine... yet. But together, they help to keep these commercial entrants honest and remind all of us that switching costs can be brought very low indeed if the pain of the status quo becomes too great.
Writing "Welcome to the Data Cloud?" for ZDNet in October, I began to explore the important role that data could and should play in the cloud:
"Just as 'we' used to duplicate and under-utilize computational resources, so we do something very similar with our data. We expensively enter and re-enter the same facts, over and over again. We over-engineer data capture forms and schemas, making collection exorbitantly expensive, whilst often appearing to do all we can to limit opportunities for re-use. Under the all-too-easy banners of 'security' and 'privacy' we secure individual data stores and fail to exploit connections with other sources, whether inside or outside the enterprise.
"In a small way, the efforts of the Linked Data Project's enthusiasts have demonstrated how different things should be. The cloud of contributing data sets grows from month to month, and the number of double-headed arrows denoting a two-way linkage is on the rise. Even the one-way relationships that currently dominate the diagram are a marked improvement on 'business as usual' elsewhere on the data web; even in these cases, data from a third party is being re-used (by means of a link across the web) rather than replicated or re-invented. Costs fall. Opportunities open up. Both resources, potentially, improve. The strands of the web grow stronger."
It is here, in the use and reuse of data, that the potential of the cloud will be realized. In the previously cited conversation between Nick Carr and Tim O'Reilly, O'Reilly himself comes very close to saying so:
"In short, Google is the ultimate network effects machine. 'Harnessing collective intelligence' isn't a different idea from network effects, as Nick argues. It is in fact the science of network effects -- understanding and applying the implications of networks.
"I want to emphasize one more point: the heart of my argument about Web 2.0 is that the network effects that matter today are network effects in data. My thought process (outlined in 'The Open Source Paradigm Shift' and then in 'What is Web 2.0?') went something like this:
- The consequence of IBM's design of a personal computer made out of commodity, off-the-shelf parts was to drive attractive margins out of hardware and into software, via Clayton Christensen's 'law of conservation of attractive profits.' Hardware became a low margin business; software became a very high margin business.
- Open-source software and the standardized protocols of the Internet are doing the same thing to software. Margins will go down in software, but per the law of conservation of attractive profits, this means that they will go up somewhere else. Where?
- The next layer of attractive profits will accrue to companies that build data-backed applications in which the data gets better the more people use the system. This is what I've called Web 2.0.
It's network effects (perhaps more simply described as virtuous circles) in data that ultimately matter, not network effects per se."
Talis CTO Ian Davis would appear to agree, commenting:
"People need to be investing in their data as the long-term carrier of value, not the applications around them... The data is more likely to persist than the software, so it's important to get the data right and take care of it."
Salesforce CEO Marc Benioff, too, used his Dreamforce User Conference this month to move a company long associated with the "data-centre extending cloud" firmly in the direction of embracing data and the network. As Krishnan Subramanian noted on Cloud Ave before the keynote:
"Till now, the Force.com platform served business users to develop apps that can be used internally within an organization. They have to tap into Force.com APIs from outside platforms to offer customer-facing web apps. With the new initiative, it becomes easy for customers to allow the Internet users to 'interact' with their data."
Over on VentureBeat, Anthony Ha had more:
"Salesforce.com wants to become an even bigger player in the cloud computing market with a new service called Force.com Sites, which allows companies to host public-facing web applications in the Force.com platform. That means Salesforce -- nominally a maker of customer relationship management (CRM) software, but also an increasingly important platform for business-related applications -- is moving closer to direct competition with cloud giants like Amazon Web Services and the Google App Engine."
Locked away within an organization and only accessed by that organization's applications, data cannot be put to full use. Much of the value in each individual datum lies in comparing it to other measurements, in delving into detail, and in pulling back to observe the bigger picture.
Organizations that believe that either the big picture or the detail resides in their own systems alone are woefully misguided. Even the most specialized, proprietary, and confidential of data only reveal their true value when put in context, and that context is all the richer when informed by numerous perspectives.
Cloud computing and the various SaaS movements have finally brought us to a place where the fiercely guarded and tightly delineated boundaries between the organization and those outside it may become permeable in ways that should benefit the organization rather than threaten it. Data is just a resource. In the terminology of Geoffrey Moore, most data is mere context, and there are savings to be made both in reusing the data of others and in re-selling necessary context to those prepared to pay. Some data, of course, is core to the business, and this may continue to receive the same reverence and protection that we misguidedly apply to the entire database today. Even here, though, the opportunities afforded by (controlled?) sharing may outweigh any desire to maintain data protectionism.
The language of Groundswell offers opportunities to go further, to embrace and exploit the behaviors and motivations of customers and the wider web.
There is clearly far more to say in clarifying this view of both the components and the whole, but at over 2,000 words, this post has perhaps gone on long enough.
For now, then, we should conclude by asking what role the semantic web has to play in any of this. The semantic web, with its unadulterated recognition of the primacy of the web's hyperlink? The semantic web, designed from the outset to convey context and relationships derived from data spread across the web? The semantic web, supported by technologies that operate openly and on the scale of the web?
Isn't it obvious yet?
Returning to the Web 2.0 Summit with which we began, another presentation was from Kevin Kelly, founding editor of Wired Magazine. Steve Gillmor and Nicole Ferraro reported on his presentation at the time, and the video was subsequently shared online, echoing Kelly's earlier presentation (which I greatly enjoyed), in which he argued:
"You have to be open to having your data shared... which is a much bigger step than just sharing your web pages or your computer."
Yep, here we go, on a journey toward Kevin Kelly's "World Wide Database," which will take in a lot of the shifts facing enterprise computing along the way.