Service Disruption - Core Service - Australia East
Incident Report for StarRez Cloud
Postmortem

StarRez Root Cause Analysis
Australia East Outage - 10th Nov 2022

Summary
On the 10th Nov 2022, a subset of customers within the Australia East region experienced downtime across a number of core services.
Known bugs were triggered on 2 backend nodes within our infrastructure. As a result, applications lost network connectivity and were forced to restart.

Root Cause
The root cause was determined to be 2 underlying nodes experiencing known bugs, which StarRez engineers are currently working with our upstream vendor to resolve.
These bugs impact network access within the application, preventing access to the backend database and other external services.

Resolution
All impacted application pods were moved to healthy infrastructure and the hosts causing the original outage were removed from service.
StarRez engineers will be moving customer application pods to an updated cluster which has several fixes applied and should prevent these issues from happening in the future.

Posted Nov 10, 2022 - 05:33 UTC

Resolved
This incident has been resolved.
Posted Nov 10, 2022 - 03:50 UTC
Monitoring
All sites are now back online.

Engineers are actively monitoring this for stability before closing the outage out.

- Next update expected as warranted by a change of events.

Apologies for any inconvenience,
StarRez Team
Posted Nov 10, 2022 - 02:11 UTC
Identified
The issue has been tracked down to 2 faulty nodes within our infrastructure. Customer workloads are currently being moved to healthy nodes and are starting to come back online.

- Next update expected within 60 minutes, or as warranted by a change of events.

Apologies for any inconvenience,
StarRez Team
Posted Nov 10, 2022 - 01:57 UTC
Investigating
A subset of customers in the Australia East region are experiencing a service disruption for core services such as Web and Portal.
- Engineers are actively working to remediate the issue.
- Next update expected within 60 minutes, or as warranted by a change of events.

Apologies for any inconvenience,
StarRez Team
Posted Nov 10, 2022 - 01:50 UTC