It’s difficult to know how often a cloud computing service fails. When signing up for a service, it pretty much comes down to faith that the provider will perform.
In this regard, services that provide updates about outages can be invaluable. CloudFail.net is one of those kinds of services. It’s a simple blog that aggregates RSS feeds from leading cloud services providers. It bills itself as a service that brings you “a unified place to find information about outages, as they happen.”
CloudFail.net monitors service updates from companies such as Amazon, Google and Rackspace.
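To make the aggregation idea concrete, here is a minimal Python sketch of how several status feeds could be merged into one timeline. It is not CloudFail.net's actual code: the provider names, feed contents and function names are all hypothetical, and the feeds are inlined as strings rather than fetched over the network.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feeds standing in for real provider status feeds.
SAMPLE_FEEDS = {
    "Provider A": (
        '<rss version="2.0"><channel>'
        "<item><title>API timeouts</title>"
        "<pubDate>Fri, 06 Aug 2010 19:02:00 +0000</pubDate></item>"
        "<item><title>Service restored</title>"
        "<pubDate>Fri, 06 Aug 2010 20:02:00 +0000</pubDate></item>"
        "</channel></rss>"
    ),
    "Provider B": (
        '<rss version="2.0"><channel>'
        "<item><title>Intermittent latency</title>"
        "<pubDate>Thu, 05 Aug 2010 15:30:00 +0000</pubDate></item>"
        "</channel></rss>"
    ),
}


def parse_feed(provider, xml_text):
    """Return one dict per <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        published = item.findtext("pubDate")
        if published is None:
            continue  # skip items that carry no timestamp
        entries.append({
            "provider": provider,
            "title": item.findtext("title", default="").strip(),
            "published": parsedate_to_datetime(published),
        })
    return entries


def aggregate(feeds):
    """Merge entries from all feeds into one list, newest first."""
    merged = []
    for provider, xml_text in feeds.items():
        merged.extend(parse_feed(provider, xml_text))
    merged.sort(key=lambda entry: entry["published"], reverse=True)
    return merged


if __name__ == "__main__":
    for entry in aggregate(SAMPLE_FEEDS):
        print(entry["published"].isoformat(), entry["provider"], entry["title"])
```

A real aggregator would also need to fetch each feed on a schedule and cope with malformed XML, both of which this sketch leaves out.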
On Thursday, for instance, it reported a Google service update for Postini, the enterprise email security service.
August 6, 2010 12:02:00 PM PDT
We’re experiencing an issue affecting less than 14% of the Postini Services user base. The affected users are unable to access Postini Services. We will provide an update by August 6, 2010 1:02:00 PM PDT detailing when we expect to resolve the problem. Please note that this resolution time is an estimate and may change
An hour later, Google said the service would soon be fully restored. The company apologized and said it is always working to make the system better.
For the customer, that’s good information to know. If nothing else, it helps them ask questions of the service provider.
CloudFail also covers services such as Twitter and Basecamp.
In this respect, CloudFail is not purely for cloud computing service providers. Twitter is a microblogging network, not a cloud service. It actually has its own data center at NTT America. But its inclusion does reflect that many view Twitter and Amazon Web Services in the same category.
Twitter’s disruptions have had their consequences. Twitter continues to grow at a significant pace, but service failures detract from the service.
It’s possible this same type of disruption could affect cloud computing companies at some point. The ability to deliver is only getting more complex. The age of the zettabyte is fast approaching. Networks will need to be increasingly sophisticated to serve large providers.
That’s why we need more services like CloudFail.net. It keeps service providers a bit more honest. With more sophisticated third-party reporting, benchmarks can be established that rate services. That would be valuable information for a customer to have.
CloudFail.net is a rich source of information. It updates in real time, so up-to-the-minute problems can be viewed. For instance, on Friday, Amazon CloudFront had timeout issues with its API. On Thursday, Mechanical Turk customers complained of latency problems and intermittent timeouts.
There are hundreds of these entries on CloudFail.net. It’s worth exploring, if nothing else, to get new perspectives on service providers and see issues in a new light.