US East Outage – 16th September 2023
Root Cause
At 07:24 UTC, a power disruption within our upstream vendor's datacenter impacted underlying network and compute infrastructure, and manual intervention was required to mitigate it. This affected SQL database hosting for a subset of customers in this region.
Resolution
StarRez initiated disaster recovery (DR) into our East US2 region as per standard process. All customers impacted by the outage were failed over and back online 9 hours after the incident began.
Our upstream provider recovered all impacted services at 21:38 UTC, approximately 14 hours after the incident began.
Once the vendor had brought all services back online within the region and StarRez was comfortable with stability, all databases were moved back into the US East region.
Additional Information
In follow-up to this incident, our upstream vendor has confirmed that an internal process and a BIOS bug delayed recovery. Fixes for both are being expedited with the aim of reducing the time to restore.