We all know that the Wikimedia Foundation is the operation behind Wikipedia, but what you might not know is that it has close to 100 paid staffers minding the store. Keeping the various multilingual servers up and running is a lot to deal with, even when no one is launching denial-of-service or other politically motivated attacks against them. And perhaps the staff has more to deal with than your average IT operations team, when you consider that the foundation's websites get half a billion unique visits and that it has to manage servers in three different locations: Florida, Virginia, and Amsterdam.
Over the last two years, the foundation has been using Nimsoft's WatchMouse, a SaaS-based server monitoring tool, to keep track of its digital assets. Prior to WatchMouse, the staff either relied on a homegrown solution that told them when a server was down or waited for a user to let them know. But this wasn't perfect, because so much of Wikipedia's content isn't accessed directly by users but through third-party apps via programming interfaces, and that kind of access is a lot harder to track. Plus, Wikimedia wanted to become a lot more transparent about its operations and to have a dashboard (available here) that shows uptime and other statistics, as you see from the screencap below.
"Transparency is a big part of who we are as an organization," said CT Woo, the director of technical operations for the foundation. "We have an obligation to make information about how we operate available to our user community." The status page is hosted by Nimsoft on Amazon Web Services. Since deploying the solution, the operations staff has been better able to keep track of server uptime and performance issues, and has become more proactive when there is a connectivity problem or traffic bottleneck.