Final Update: Monday, 22 October 2018 21:22 UTC
We've confirmed that all systems were back to normal as of 10/22, 21:00 UTC. Our logs show the incident started on 10/22, 15:30 UTC, and that during the 5 hours 30 minutes it took to resolve the issue, 14% of availability web tests were not running from affected locations. The issue has been fully mitigated and all of our services are running as expected; however, reports for impacted web tests will continue to show data gaps for the impacted duration.
-
Root Cause: The failure was due to a code bug identified in our availability services.
-
Incident Timeline: 5 hours 30 minutes - 10/22, 15:30 UTC through 10/22, 21:00 UTC
We understand that customers rely on Application Insights as a critical service, and we apologize for any impact this incident caused.
-Arvind