I’m sure everyone’s had their fill of log4j findings for the weekend, so I’ll spare you any more commentary – like a bad holiday party with too much stuffing and rancid egg nog. Not like I’ve been to a holiday party like that, but one can imagine. But I digress…
Fresh off an action-packed re:Invent week, AWS topped the news again, but not for the right reasons – us-east-1 was down for a couple of hours on Tuesday, taking a notable piece of the Internet with it. You have to feel for the AWS team; I hope they catch a breather during the holidays.
As is always the case, the incident sparked every hybrid and multi-cloud take there is – some good, some bad. Preparing for these scenarios with a failover strategy is the right thing to do, but a lot is left open to interpretation in terms of what is feasible – not just for your organization, but also for the providers you consume.
While AWS always recommends a multi-region architecture as a best practice, there are services and support functions that don’t allow for it. For example, one thing I learned the hard way, among many others, is that AWS SSO only runs in the single region where you set it up. Another is that the AWS Status Page failed to deliver accurate updates, partly because a number of global services such as CloudTrail are in fact centralized within us-east-1.
I tend to avoid the nuclear option of multi-cloud for disaster recovery purposes. It’s simply far too much work for most teams. I do advocate for multi-cloud if the workload benefits from specific advantages of one provider over another. But as the major cloud providers continue to operate at record levels of scale, it does raise the question of how much faith we put in them for critical data and services. It’s always smart to have a “backup plan”, but a strategy is only as good as the ability to execute it, so always sprinkle a dose of reality into the mix.