Final Update: Tuesday, 02 June 2020 18:31 UTC
We've confirmed that all systems are back to normal with no remaining customer impact as of 6/2, 16:32 UTC. Our logs show the incident started on 6/2, 15:24 UTC, and that during the 1 hour and 8 minutes it took to resolve the issue, some customers experienced intermittent data latency, data gaps, and incorrect alert activation.
- Root Cause: A dependent backend service entered a bad state, and auto-mitigation took longer than expected to recover it.
- Incident Timeline: 1 hour and 8 minutes - 6/2, 15:24 UTC through 6/2, 16:32 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.
-Jeff