Final Update: Tuesday, 23 February 2021 03:38 UTC
We've confirmed that all systems are back to normal with no customer impact as of 2/23, 3:00 UTC. Our logs show the incident started on 2/22, 15:10 UTC and that during the ~12 hours that it took to resolve the issue, customers in East US2 experienced delayed or missed alerts.
- Root Cause: The failure was due to backend service becoming unhealthy due to high threshold.
- Incident Timeline: 12 Hours - 2/22, 15:10 UTC through 2/23, 3:00 UTC
We understand that customers rely on Azure Log Analytics as a critical service and apologize for any impact this incident caused.
-Anupama