Final Update: Tuesday, 22 October 2019 07:49 UTC
We've confirmed that all systems are back to normal with no customer impact as of 10/22, 04:30 UTC. Our logs show the incident started on 10/21, 23:20 UTC and that during the 5 hours 10 minutes that it took to resolve the issue customers would have experienced Data latency, Data access and Alerting failures in West Europe region.
-
Root Cause: The failure was due to connections getting timed out with our Azure storage in West Europe region..
-
Incident Timeline: 5 Hours & 10 minutes - 10/21, 23:20 UTC through 10/22, 04:30 UTC
We understand that customers rely on Log Search Alerts as a critical service and apologize for any impact this incident caused.
-Rama
Update: Tuesday, 22 October 2019 05:01 UTC
Root cause has been isolated to one Azure Storage scale unit in West Europe which has impacted many services, including Log Search Alert. Mitigation is in process we estimate at least 2 hours before all latency is addressed.
- Work Around: None
- Next Update: Before 10/22 08:30 UTC
-Rama
Update: Tuesday, 22 October 2019 02:57 UTC
Root cause has been isolated to one Azure Storage scale unit in West Europe which has impacted many services, including Log Search Alert. Mitigation is in process we estimate at least 2 hours before all latency is addressed.
- Work Around: none as of now
- Next Update: Before 10/22 05:00 UTC
-Jack Cantwell
Initial Update: Tuesday, 22 October 2019 01:20 UTC
We are aware of issues within Log Search Alerts and are actively investigating. Some customers may experience data latency in the West Europe region.
- Work Around: none at this time
- Next Update: Before 10/22 03:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Jack Cantwell