How will high-profile Amazon customers react to the recent outage?
Amazon's EC2 services (East) experienced another outage yesterday.
As before, this affected a number of high-profile web customers, such as Reddit, Quora, FourSquare and others. Arguably, some of these folks should have built additional redundancy by now, given the massive April outage of the same set of services.
What will the response of these customers likely be? Will they seek a cloud competitor? Will they employ more Amazon services for redundancy? Will they cry quietly about their decision to use the cloud?
4 Answers
The most recent outage was, for lack of a better term, a "blip". Obviously the affected customers won't agree with my assessment, but in the grand scheme of things it was minor compared to the April outage and the outage in Ireland on Sunday/Monday. Add to that the distraction created by the stock market meltdown on Monday, the buzz generated in the tech community by Apple dethroning Exxon, and the fact that this was a connectivity issue with few recovery implications, and the Northern Virginia AWS outage was basically pushed to page 2.
But that doesn't mean that customers aren't concerned about AWS (including the broad spectrum of services offered by Amazon). As Paul so clearly articulates in his answer - cross-region redundancy may not be sufficient, creating the need for greater redundancy in the region. As he also points out and we have discussed many times here on Focus, the tools that provide visibility and control for customers need to improve (by Amazon's own admission).
What's more worrisome for me is the apparent complexity (a matter of perspective on my part - your view may differ) involved in recovering services in AWS - especially implementations that utilize one or more of the "elastic" services. Read the transcripts of the AWS dashboards and you can see there are "step-by-step" instructions posted on how to re-allocate, re-image, re-connect, re-start, re-... One would think that these would be fairly standard procedures, well documented and understood beforehand by customers who have carefully planned for outages and subsequent recovery. I've been in this business long enough to know that every outage is unique, so there is no cookbook. But still, it seems like a complex and delicate process to bring services back to full working order. One misstep during the process and things could go in the dumpster fairly quickly.
So, to answer your question, I think the customers who were affected (possibly some for the second time) will be having some serious discussions with Amazon about performance, processes, procedures and possibly re-negotiations of contracts. I think they will be hard pressed to move more/new applications to AWS until they have a higher degree of confidence in the AWS product/service architecture, tools and ability to deliver a reliable service.
Potential customers who were contemplating AWS will probably take a deep breath, step back and reconsider, but in the end will make the move. But those that do will pay much more attention to the effort and expense involved to ensure resiliency of services, and will spend a lot of time on the Netflix engineering blog trying to glean those key architectural strategies that have led to their success in the cloud world.
There was already some low-level buzz around the shift from public to private/hybrid clouds, and I think this will tweak the knob up a couple of clicks. But it won't dramatically reshape the landscape.
Ultimately, I think "this too shall pass", and the affected customers won't be overly vocal on this one, but they will be in a heightened state of alert. As the old saying goes - "Burn me once, shame on you. Burn me twice, shame on me." Let's hope there's not a third time...
*cough* Rackspace *cough*...
In all seriousness, clients will need to become more mature about this. Cloud never removed the obligation of due diligence. It was great that Amazon built a DC in a region, but bad that they haven't realised that the regionalisation of cloud means they need redundancy within that region. It also means the growth of systems management will become critical.
Good luck ever hitting 99.9% uptime in an EC2-only setup.
Smart companies that have serious uptime requirements will continue to pull back from EC2. Amazon is currently averaging one serious, high-profile outage every six months.
If you don't have serious uptime requirements, then EC2 continues to offer a flexible service and decent pricing. So I don't think companies like this will do much more than gripe about the outages.
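To put the 99.9% figure in perspective, here is a rough back-of-the-envelope sketch of the arithmetic. The two-outages-per-year rate follows from the "one every six months" estimate above; the outage durations are purely hypothetical values chosen for illustration, not measured figures from any of the incidents discussed here.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_budget_hours(availability: float) -> float:
    """Hours of downtime per year allowed by an availability target (0 to 1)."""
    return HOURS_PER_YEAR * (1.0 - availability)


def availability_from_outages(outages_per_year: int, hours_per_outage: float) -> float:
    """Availability implied by a number of equal-length outages per year."""
    return 1.0 - (outages_per_year * hours_per_outage) / HOURS_PER_YEAR


if __name__ == "__main__":
    budget = downtime_budget_hours(0.999)
    print(f"99.9% uptime allows about {budget:.2f} hours of downtime per year")

    # Two outages per year; the durations below are hypothetical.
    for hours in (2.0, 4.0, 8.0):
        achieved = availability_from_outages(2, hours)
        verdict = "meets" if achieved >= 0.999 else "misses"
        print(f"2 outages of {hours:.0f}h each -> {achieved:.4%} ({verdict} 99.9%)")
```

Under these assumptions the whole yearly budget is under nine hours, so two outages of more than about four hours each would already miss the target, before counting any smaller incidents or planned maintenance.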