Service Disruption - Core Applications - Switzerland North
Incident Report for StarRez Cloud
Postmortem

StarRez Root Cause Analysis
Switzerland North Outage - 2nd Nov 2022

Summary
On 2nd Nov 2022, customers in the Switzerland North region experienced an outage of core services lasting up to 1 hour 20 minutes.
The outage was caused by resource exhaustion in the underlying infrastructure, which also triggered a known platform bug that StarRez is working to resolve with our upstream vendor.

Root Cause
The root cause was determined to be resource exhaustion on an underlying node, which caused services to be moved elsewhere within the cluster.
During this move, a known bug affecting network connectivity was encountered, which further delayed the startup of customer resources within the cluster.

Resolution
The problematic node was removed from service, and the cluster was scaled to handle the increased workload so that applications could start again.
StarRez engineers will continue to work with our upstream vendor to resolve this ongoing bug within the platform.

Posted Nov 07, 2022 - 08:39 UTC

Resolved
This incident has been resolved.
Posted Nov 02, 2022 - 09:45 UTC
Monitoring
The affected underlying infrastructure has been removed from production and all sites are now back online.
Posted Nov 02, 2022 - 09:45 UTC
Identified
A cluster issue has occurred within the Switzerland North region.
The majority of customers in this region were impacted by an outage of up to 30 minutes.
All sites are now starting to come back online whilst engineers investigate the root cause.

- Next update as warranted by change of events.

Apologies for the inconvenience
StarRez
Posted Nov 02, 2022 - 09:37 UTC