Final Update: Tuesday, 22 October 2019 07:41 UTC
We've confirmed that all systems are back to normal with no customer impact as of 10/22, 04:30 UTC. Our logs show that the incident started on 10/21, 23:20 UTC, and that during the 5 hours 10 minutes it took to resolve the issue, customers in the West Europe region would have experienced data latency, data access, and alerting failures.
- Root Cause: The failure was caused by connections to our Azure storage in the West Europe region timing out.
- Incident Timeline: 5 hours 10 minutes, from 10/21, 23:20 UTC through 10/22, 04:30 UTC
We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.
-Rama