12/01/2011

Website Outages: How the Mighty Have Fallen

Making sure your website is available at all times is clearly important, as any webmaster or web hosting company will testify. This is true of small websites (every lost minute is a potential lost sale), but when the site is a monster – a household name – the implications for the business become huge.

You don’t imagine sites like Facebook, Twitter, Gmail and Amazon going down. But they do. And as you might imagine, the rules are a little different for these big boys.

The way big websites handle downtime is crucial because as soon as an issue is noticed (which is pretty much instantly), it’s being Tweeted, Google+d and Liked. With hundreds of millions of users worldwide between them, one wrong step can magnify the problem.

So large companies have to gain control of the message, quickly. Otherwise they turn a bad situation (their site not working) into a PR disaster (lost customers, revenue and reputation).

Poor Website Uptime: When the Big Guns Fire Blanks

Facebook: In September 2010 the whole of Facebook was down for 2.5 hours. This was their longest downtime in over 4 years. But their handling of the issue was not bad. The problem was resolved within hours of occurring, and soon afterwards they issued a fairly detailed explanation and a ‘hands-up-we-made-a-mistake’ apology to their audience.

Amazon: A much more serious issue occurred in April 2011 at Amazon Web Services, affecting its EC2 cloud computing service. EC2 supports countless websites, including some big players: Foursquare, Quora, Reddit and Twitter client HootSuite all went down. Amazon’s communication, while better than on previous occasions, was heavily criticized, helping to fuel doubts about the stability of cloud computing itself.

Sony: The media giant’s PlayStation Network went down in April 2011, denying access to its 75 million users worldwide for almost a week. Their handling of the crisis was a lesson in how not to do it. Messages to their users were vague, infrequent and did not recognize the level of frustration experienced by customers. When reports emerged that the personal data of users had been compromised, the damage to Sony’s reputation was colossal.

Twitter: As Twitter’s traffic levels have exploded, it has often struggled to cope, experiencing regular downtime. As recently as October 2011 the home page was down for more than an hour, with the site suffering serious performance issues even after it came back online.

Netflix: Amusingly for spectators but less so for the company, the Netflix website had an 18-hour crash just after its management team had updated industry analysts on plans to improve customer service. The company’s stock price was already plummeting at the time. Talk about bad timing.

Some of these periods of downtime may not seem like such a big deal. But when you consider the hundreds of millions of users and the speed with which people start complaining, the amount of negative feeling generated can be huge.

So what causes major website outages?

The reasons vary. Usually the cause is technical and behind the scenes: a physical issue with a data center, or even a planned update by the company itself that has an unforeseen side effect. In the case of the PlayStation Network it was a security breach by hackers that led Sony to pull the plug and completely reconfigure its security infrastructure.

Whatever the reason for downtime, whether you’re a small business or a giant like those featured here, ensuring your website is up and available to your users is critical.