Final Update: Tuesday, 26 March 2019 04:53 UTC

We've confirmed that all systems are back to normal, with no ongoing customer impact, as of 03/25, 18:08 UTC. Our logs show the incident started on 03/25, 13:00 UTC, and that during the ~5 hours it took to resolve the issue, a very small subset of customers may have seen failures while sending telemetry data to our ingestion services.
  • Root Cause: The failure was due to one of our service roles going into an unhealthy state.
  • Incident Timeline: 5 hours & 8 minutes - 03/25, 13:00 UTC through 03/25, 18:08 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Varun

Update: Monday, 25 March 2019 19:14 UTC

We are aware of an issue within Application Insights services where customers may see failures while sending telemetry data to our ingestion services. This issue prevents telemetry from being accepted at the client end. Our findings show the root cause is related to some of our service roles going into an unhealthy state, which causes them to stop accepting data. We understand the root cause and are working on applying the mitigation one region at a time, but it may take some time to mitigate the issue in all locations. As a workaround, customers can use the approach suggested below.

Workaround: Reboot client machines/applications

Next Update: Before 03/26 07:30 UTC

-Arvind