Amazon cloud outage aftermath could last another two days

The consequences of the power outage that struck Amazon overnight from Sunday to Monday after a lightning strike may remain noticeable for another 24 to 48 hours. Not all sites are back online yet, and in some cases data has been lost.

Although Amazon started working on the outage immediately, the repair is taking considerably longer than expected. The outage occurred around 8 p.m. on Sunday, yet 12 hours later not all instances of Amazon’s Elastic Compute Cloud were up and running again. The cause seems to be an accumulation of problems.

The power outage occurred after a transformer at the Amazon data center in Dublin was hit by lightning. The strike caused an explosion, followed by a fire. A power cut is normally bridged by emergency generators, but the strike was so powerful that the phase control system was also knocked out. This system keeps the emergency generators synchronized. Without phase synchronization, the generators cannot start automatically and have to be brought online manually.

Not only did it take a long time to restore all volumes manually, but the available disk space also proved to be a problem. To restore the volumes, Amazon had to make an extra copy of all data, which filled almost all available storage capacity and slowed down the recovery process.

Throughout the day, Amazon added storage capacity to overcome this issue. Although the cloud provider states that the majority of volumes will be restored in the course of the day, the company estimates that the entire process will take 24 to 48 hours to complete.

Not every customer is expected to come through the outage without losing data. In a few cases, Elastic Block Store servers lost power before data had been fully written. In those cases, Amazon will restore a snapshot from a recovery image, which may not be up to date.
