We've confirmed that all systems are back to normal with no customer impact as of 05/20, 14:20 UTC. Our logs show the incident started on 05/20, 09:45 UTC, and that during the 4 hours and 35 minutes it took to resolve the issue, some customers using the Log Analytics service may have experienced latency and/or failures when attempting to carry out data plane and control plane operations in the West US 2 region. Additionally, customers using Service Map, Log Search Alerts, and Azure Automation may have experienced latency and/or failures during the impacted time.
Root Cause: An increase in load exposed an unoptimized flow in one of the backend services, which caused the service to reach its operational threshold, leading to the errors described above.
Incident Timeline: 4 hours & 35 minutes - 05/20, 09:45 UTC through 05/20, 14:20 UTC.
We understand that customers rely on Azure Log Analytics as a critical service and apologize for any impact this incident caused.
Initial Update: Friday, 20 May 2022 12:12 UTC
We are aware of issues within Log Analytics and are actively investigating. Some customers may experience latency and/or failures when attempting to carry out data plane and control plane operations. Additionally, dependent services that use Azure Log Analytics such as Service Map and Log Search Alerts may experience the same latency and/or failures at this time.
Workaround: None
Next Update: Before 05/20 14:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience. -Anmol