StarRez Root Cause Analysis
A global event occurred where an update was distributed by our threat protection vendor which caused blue screen events on all Windows based hosts that received the update.
Root Cause
At 5:05AM UTC, 19th July 2024, the Mercury Cloud platform was impacted by an update that was distributed by our threat detection vendor. This resulted in a subset of infrastructure experiencing continual blue screen events/reboot loops causing either a complete failure or continual disruption for the underlying host.
Resolution
Multiple remediation practices took place to restore services after mitigation steps were provided from the vendor. A subset of systems recovered automatically once an update was pushed by the vendor, however any remaining hosts impacted required manual intervention to remove the relevant update file to allow the machine to boot successfully.
At 3:37PM UTC 19th July 2024, all Mercury services were back online
Next Steps
We will conduct post incident reviews to investigate if there are any process improvements we can make when vendors push updates that disrupt service.