Way back when I was in high school, I found myself in a science
class. It was your typical experience, replete with Bunsen burners, safety
goggles and a science teacher named Norbert. But one day Norbert had an
inspiration – he let the class watch a video of James Burke and his Connections
series, which described all of the happy accidents in technology through the
years that have brought us where we are today. My favorite accidental
connection was the development of fine mist sprays for perfume bottles. While it
may have helped people smell better, the real combustion happened in your
automobile, where the fine mist sprays became fuel injection nozzles for the
modern gasoline engine.
Like a fine fragrant perfume, Amazon has a revolutionary technology sitting
right under everyone’s noses. Their happy accident? Building a reliable,
scalable and robust ecommerce system. While I’m sure Jeff Bezos didn’t envision
his online company being compared to perfume sprays, the fact of the matter is,
even after immense technological investment, retail needs a lot of perfume to
make the margins smell nice – even if you’re online.
But in their quest to prove to the world that online retail is the wave of
the future, Amazon has created not just a fine mist. They have unexpectedly
created a vapor cloud – or
internet cloud – that is ready for ignition. Most fortunately for Amazon,
they’ve been able to build one of the world’s most impressive, massively
scalable datacenter systems. Most fortunately for you, they’re willing to share
it. And most fortunately for corporate programmers, you’re about to be relevant
again.
The Value of Sharing
How valuable is a robust, secure, scalable and reliable computing platform
that allows you to store and retrieve any kind of digital data from anywhere in
the world at any time of the day? While your average high school student doesn’t
have much use for one, technology visionaries and corporate titans have been
willing to pay big money to build such platforms. Groove Networks ran through
approximately $120 million of investment in order to build and convince the
world that theirs was robust, secure, scalable and reliable. Microsoft, which
purchased Groove in 2005, is spending upwards of $2
billion in order to do the same thing. Google’s entire business model
requires them to reinvest huge amounts of capital into their storage platform.
Throw in Yahoo and Baidu and it’s clear that such computing power is seen as a
technological edge.
However, a high school student can rent one of the best, most proven storage
platforms of them all – for a little less than $0.50 per month. If he or she is
willing to lay off the occasional candy bar, he or she can be on an equal web
infrastructure footing with the richest technology companies in the world. If I
were an investor in MSFT, I might even ask if investing $200 million in Amazon
makes more sense than spending $2 billion to build your own technology from
scratch. In fact, taking it one step further, I might even be worried that
Amazon’s examples treat .NET as just another language – alongside
Java, Ruby, PHP, Python and Perl. Another $1.8 billion into .NET and Visual
Studio might take care of that, especially for those prized corporate
programmers.
So Easy a High Schooler Can Do It
Like anything worthwhile, it takes
a bit of time to figure out the best way to exploit it. The obvious comes
first: use Amazon’s storage capacity and scalability to store data such as
images or globs of backup data. Indeed, a simple scan of Amazon’s Simple Storage
Service solutions
shows many backup, photo sharing and large email attachment services.
Furthermore, most of these services are web-based and so their single greatest
cost is usually bandwidth. Hence Amazon’s marketing focus on web
2.0 apps as their clientele. This makes perfect sense, as it fits the world’s
current multi-tiered architecture: the simple browser on the client, the
business logic on a web server and now the robust data store on the backend.
Many well known and successful companies are exploiting this today: SmugMug,
37Signals and MyBlogLog to name a few. Would you know it by using these
services? Probably not, unless you have a keen eye reading the URLs flash by as
elements on your web page are downloaded. Does it matter? Not one bit. Each of
these services uses Amazon S3 in order to offload the work of sending images and
documents off their servers and onto Amazon’s. As a result, each company can
focus on their distinctiveness and bring it to market as quickly as possible,
without worrying so much about infrastructure. In some cases it’s enough to
prove the concept and sell to a large company in just
a few months.
The Next Fortunate Event
In the US there is a saying: “Only
Nixon could go to China.” While Amazon has positioned itself as a
proven resource for eager online entrepreneurs, they have also (accidentally?)
created a solution that makes the client chic again – actually both the Client
and the Server. Indeed, Amazon’s Simple Storage Service takes us back to
the 1980s, resurrects Client/Server
architecture and provides it on such a scale that it actually works.
In the late 1980s, when C/S became a holy grail, it quickly sprang a leak at
the supper table as large deployments grappled with networking protocol
decisions, scaling issues, big capital investments for servers and, of course,
maintenance. Amazon has moved these giant obstacles aside and resurrected
something most thought dead. And this from an online department store!
While Amazon is still very much a consumer site selling anything they can
display on a web page, at some point people will see the light and realize that
a lot of C/S architecture implemented, deployed and wheezing along in
corporations around the world can be refactored with services like Amazon S3. By
providing the ultimate server, Amazon has made it possible for programmers to
build corporate client software that a) actually scales and b) actually cuts
costs – no more corporate servers to procure, house and maintain. A corporate
programmer can create a client that can scale for the entire organization by
using a good GUI tool and something only moderately more complicated than
File.Open. Web application stacks
of functionality might not look as appealing compared to this new reality.
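To give a feel for what "only moderately more complicated than File.Open" could look like, here is a minimal sketch in Python. It is not Amazon's API: `ObjectStore` is a hypothetical in-memory stand-in for a remote store such as S3, and `put`, `get` and `open_remote` are names made up for this illustration. A real client would replace the stand-in with authenticated network calls.

```python
import io
from contextlib import contextmanager


class ObjectStore:
    """Hypothetical in-memory stand-in for a remote store such as Amazon S3."""

    def __init__(self):
        self._objects = {}  # key -> bytes

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]  # raises KeyError if the key is missing


@contextmanager
def open_remote(store, key, mode="r"):
    """Yield a file-like text buffer backed by the store (illustrative helper)."""
    if mode == "r":
        yield io.StringIO(store.get(key).decode("utf-8"))
    elif mode == "w":
        buffer = io.StringIO()
        yield buffer
        # Upload only after the caller's block finishes without an error.
        store.put(key, buffer.getvalue().encode("utf-8"))
    else:
        raise ValueError("mode must be 'r' or 'w'")


# Usage: reads and writes look much like working with a local file.
store = ObjectStore()
with open_remote(store, "reports/q3.txt", "w") as f:
    f.write("Quarterly numbers\n")
with open_remote(store, "reports/q3.txt") as f:
    contents = f.read()
```

The point of the sketch is the shape of the client code: the programmer thinks in terms of keys and file contents, and the question of which machine holds the bytes disappears behind two calls.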
(Have you noticed that programming stacks have evolved like razor blades? We
started with one, then two blades, then three solved the problem, until four
blades really did the trick, and so on – in fact, aren’t servers even called
blades now?)
But while Amazon has provided the fuel, it’s still up to the developers to
provide the ignition; and right now we’re dealing with sticks. By that I mean
that Amazon isn’t offering a database, nor server side scripting, interpreted
languages, web server plugins or anything else to add complexity. Instead, at
its core, is the world’s greatest file server.
Groaning from the Peanut Gallery
The collective moan you just heard was from everyone who actually lived
through the client/server days rolling their eyes and recalling how difficult it
was to merge everyone’s changes into a single file on the server. That was
indeed a huge problem and yes it did lead to C/S being discredited. HOWEVER, the
problem was twofold: 1) tackling multi-user scalability issues for the server
and 2) tackling multi-user data issues for the client. In the 1980s, personal
computing was still in its infancy, and nascent applications struggled to meet
the needs of a single user, much less two.
If we look at the path highly available and scalable servers have travelled
over the last twenty years, we see that a single machine – even a really big one –
was simply inadequate for the task – it’s just too prone to failure. Furthermore,
data needs have exploded past workgroups or departments in a single location –
people across the globe may need access to specific data now. Although it will
take a post in the future to adequately describe how Amazon and others are
building scalable and reliable data services, they do require multiple data
centers located around the world connected by fiber optics, with each data
center housing thousands of redundant systems built for quick switching in case
of any single point of failure. In other words, a far cry from a PC running IBM
OS/2 with an Intel 386 and 32 megs of RAM locked in a closet.
But while the server side of the equation has advanced over the last twenty
years, can the same be said for the client? Or, more specifically, for the data
clients are producing? No. While operating systems are more powerful today and
prettier to look at, the state of application data today is not so different
from what it was back in the 1980s – a stream of data supporting the state of the
application as last saved by a single user. In this day and age, multiple people
need to work in a single context to produce a deliverable. Unfortunately, up till
now this has been dealt with using the checkout, checkin, ‘hey, you overwrote my
data!’ design pattern. The server has improved, now it’s time for clients to
respond in kind.
One More Event Before The Big Bang
The solution? How client applications read and write data needs to make the
same advancements that server platforms made to make data storage reliable and
delivery scalable. As applications become decoupled from the desktop, they
should also decouple from the notion that a single user updates data at one
time. Does only one person ever work on a document? Does only one person
ever participate in a project? Only if they wish to remain in the past.
Now that a superserver like Amazon S3 is almost a given (just a bit
more reliability and maybe one
more feature), applications can pull ‘multi-user’ files off, perform merge
operations using local computing power and then place the file back in the cloud
with an updated, combined view – ready for the next user to come along. Save
doesn’t have to mean ‘write out my view of the data only’ – it can also mean
‘merge my view of the data with the group’. It might sound like a database,
but it’s really functionality that all apps working with groups, or in a group
context, should have. The applications themselves will have the knowledge to
merge files they recognize together – an exercise left to the developers of the
next generation of client software.
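As one possible reading of "merge my view of the data with the group", here is a small Python sketch of a save that merges instead of overwriting. Everything in it is an assumption made for illustration: the `ObjectStore` class is a hypothetical in-memory stand-in for the file server, the document is a flat JSON record, and the merge policy (my changes win when two people edit the same field) is just one arbitrary choice among many.

```python
import json


class ObjectStore:
    """Hypothetical in-memory stand-in for a remote file server."""

    def __init__(self):
        self._objects = {}  # key -> bytes

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects.get(key)  # None if nothing is stored yet


def merge(base, mine, theirs):
    """Three-way merge of flat dicts.

    base   -- the copy I originally opened
    mine   -- my edited copy
    theirs -- the group's current copy on the server
    Assumed policy: apply my changes on top of theirs; on conflict, mine wins.
    """
    merged = dict(theirs)
    for field in base.keys() - mine.keys():
        merged.pop(field, None)  # I deleted this field
    for field, value in mine.items():
        if field not in base or base[field] != value:
            merged[field] = value  # I added or changed this field
    return merged


def save(store, key, base, mine):
    """Fetch the group's copy, merge locally, and write the result back."""
    raw = store.get(key)
    theirs = json.loads(raw) if raw is not None else {}
    merged = merge(base, mine, theirs)
    store.put(key, json.dumps(merged, sort_keys=True).encode("utf-8"))
    return merged


# Usage: two users open the same file and edit different fields.
store = ObjectStore()
original = {"title": "Plan", "owner": "ann", "status": "draft"}
store.put("plan.json", json.dumps(original).encode("utf-8"))

ann = dict(original, status="final")
bob = dict(original, title="Project Plan")

save(store, "plan.json", original, ann)
result = save(store, "plan.json", original, bob)
# Both edits survive: the title is Bob's and the status is Ann's.
```

Note that this sketch ignores the window between fetching the file and writing it back; two saves landing in that window could still step on each other, so a real client would need some guard there as well.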
There’s a lot of refactoring to do, but this time the benefits are tangible.
Entrepreneurs seem to be embracing
Amazon web services, and the next wave can’t be far behind. If Amazon manages to
catch that next wave, they could be in for a great ride. And however it works
out, in their quest for the ultimate online department store, Amazon might have
finally solved the server side of the equation in Client/Server architecture. If
they manage to attract the corporate programmers ready to build the next
generation client pieces, then not only would Amazon again enjoy first mover
advantage, but it would be for something worth a lot more than books. Even James
Burke would have been proud of that connection.