Yesterday's widespread Skype outage continues today, according to an official update posted early this morning to Skype's company blog. As of the time the blog post was written, 5 million users were reportedly back online, but Skype says this is only 30% of what they would expect at that time of day.
However, the number is gradually increasing as Skype repairs the problem, which the company is now saying is a software issue affecting its "supernodes." But what does that mean?
For those who aren't familiar with Skype's technical underpinnings, this two-day outage has offered the opportunity to explore the backend of a communications system so many millions throughout the world depend on.
Yesterday, when the problem was first reported, Skype said that the downtime was due to many of its "supernodes" going offline at the same time. "These supernodes act a bit like phone directories for Skype," the company said. "If you want to talk to someone, and your Skype app can't find them immediately...your computer or phone will first try to find a supernode to figure out how to reach them."
Understanding the Supernodes
But calling the supernodes "phone directories" is simplifying the explanation a bit, as it turns out.
Dan York, a writer, speaker, podcaster and director of conversations at Voxeo Corporation, as well as best practices chair of the VOIP Security Alliance, wrote a more detailed explanation of supernodes on his blog at DisruptiveTechnology.com. There, he explains what supernodes are, how they work and how they're connected, in a fascinating yet easily digestible format for anyone who's curious to learn more.
"Supernodes," writes York, "connect individual Skype clients to each other and create a P2P (peer-to-peer) overlay network... the cloud that connects all Skype clients to each other. These 'supernodes' run the regular Skype software. The ONLY difference is that they are on the public network. So if you are running Skype on a computer - and you are NOT behind a firewall, there is a chance that your computer could become a supernode."
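To make York's description concrete, here is a minimal sketch in Python of the two ideas above: a client can only become a supernode if it is not behind a firewall, and supernodes act as the "phone directories" other clients consult. The class names, the `find_peer` function and the addresses are all hypothetical illustrations. Skype's actual protocol is proprietary and almost certainly works differently in its details.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Client:
    """A hypothetical Skype-like client (not Skype's real data model)."""
    name: str
    behind_firewall: bool

    def can_be_supernode(self) -> bool:
        # Per York's description: only clients on the public network qualify.
        return not self.behind_firewall


@dataclass
class Supernode:
    """A publicly reachable client that also keeps a small directory."""
    host: Client
    directory: Dict[str, str] = field(default_factory=dict)

    def register(self, client_name: str, address: str) -> None:
        self.directory[client_name] = address

    def lookup(self, client_name: str) -> Optional[str]:
        return self.directory.get(client_name)


def find_peer(client_name: str, supernodes: List[Supernode]) -> Optional[str]:
    """Ask each known supernode in turn; None means nobody knows the peer."""
    for supernode in supernodes:
        address = supernode.lookup(client_name)
        if address is not None:
            return address
    return None


# Example: one public client is promoted; a firewalled one is not eligible.
alice = Client("alice", behind_firewall=False)
bob = Client("bob", behind_firewall=True)

supernodes = [Supernode(c) for c in (alice, bob) if c.can_be_supernode()]
supernodes[0].register("bob", "198.51.100.7:4000")  # made-up address

print(find_peer("bob", supernodes))    # 198.51.100.7:4000
print(find_peer("carol", supernodes))  # None
```

If the list of supernodes were empty, every lookup would return nothing, which is roughly the situation Skype described when many supernodes went offline at once.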
What Skype Isn't Telling Us
Well, that explains what they are (the blog post gets a bit more detailed) - but why did they fail? Why did, as Skype says, so many of these go offline? We haven't heard of any massive outages affecting the public Internet, so what happened?
That's the part Skype isn't telling us.
York hazards a guess: a software update that somehow affected the supernode algorithm, possibly leading to cascading failures where load increased on existing supernodes as others failed and dropped offline. But as York says, that's "purely a guess."
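The kind of cascading failure York speculates about can be illustrated with a toy simulation in Python. This is only a sketch of the general idea, not a model of Skype's network: the capacities, the total load and the rule that load is split evenly among surviving nodes are all assumptions made for the example.

```python
from typing import List, Set


def simulate_cascade(capacities: List[float],
                     total_load: float,
                     initially_failed: Set[int]) -> List[int]:
    """Return the number of surviving nodes after each round.

    Toy assumptions: the total load is constant and is shared evenly by
    whichever nodes are still alive; a node drops offline as soon as its
    share exceeds its capacity.
    """
    alive = {i for i in range(len(capacities)) if i not in initially_failed}
    survivors_per_round = [len(alive)]

    while alive:
        share = total_load / len(alive)
        overloaded = {i for i in alive if capacities[i] < share}
        if not overloaded:
            break  # the network has stabilised
        alive -= overloaded
        survivors_per_round.append(len(alive))

    return survivors_per_round


capacities = [12, 13, 15, 18, 25, 40]  # hypothetical per-node capacities

# With every node healthy, each carries 11 units and nothing fails.
print(simulate_cascade(capacities, 66, set()))  # [6]

# Knock out only the largest node and the rest topple in three rounds.
print(simulate_cascade(capacities, 66, {5}))    # [5, 3, 1, 0]
```

In this toy setup, losing a single node pushes the remaining ones over their limits one group at a time, which is the pattern York describes as load increasing on existing supernodes while others drop offline.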
What we do know is that Skype is fixing it by building something it calls "mega-supernodes." There aren't details on what these are, exactly, but it's probable that they are server-based supernodes that Skype controls, instead of nodes that just rely on client software running on end users' computers, says York. Or maybe they're just a higher level of supernode.
Whatever they are, the fix is slowly but surely working.
Image credit: DisruptiveTechnology.com
UPDATE: Skype CEO Tony Bates gave GigaOM an update on the outage (posted 8:41 AM PDT). He said the following:
- 16.5 million of 25 million concurrent users are back online, even though you might be seeing a lower number in your desktop client.
- Users in Europe and on the U.S. East Coast are fully restored.
- The IM, video and audio services are back up.
- The Group Video services and offline IM capabilities are not going to be working for some time, mostly because Skype is using those servers as supernodes. The company will issue an update later today.
"We are bringing up the service in a controlled manner and things are moving in the right direction," Bates told GigaOM. "This outage, if anything, has made it even more clear how reliant people are on the service. It is amazing to see how many people are using it."
He also said that Skype would be issuing formal compensation to people.