Final Update: Wednesday, 01 August 2018 23:17 UTC
We've confirmed that all systems are back to normal with no customer impact as of 8/1, 23:00 UTC. Our logs show the incident started on 8/1, 15:28 UTC and that during the approximately 7.5 hours that it took to resolve the issue, customers in the East US location would have experienced data latency. At present all services are running as expected and our monitoring is reporting healthy.
-
Root Cause: The failure was due to a broken network link between two datacenters where our services reside.
-
Incident Timeline: 7 Hours & 32 minutes - 8/1, 15:28 UTC through 8/1, 23:00 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.
Initial Update: Wednesday, 01 August 2018 16:46 UTC
We are aware of issues within Application Insights that started at 8/1/2018 15:28 UTC. We are actively investigating, and at this moment we know that it is caused by instability in the underlying infrastructure. Some customers may experience data latency until we resolve the issue completely. We will provide more updates as we work towards mitigation.
-
Workaround: None at the moment.
-
Next Update: Before 08/01 21:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Arvind