We've confirmed that all systems are back to normal with no customer impact as of 02/16, 02:05 UTC. Our logs show that the incident started on 02/16, 00:15 UTC and that, during the 1 hour and 50 minutes it took to resolve the issue, customers experienced intermittent data gaps of up to 10% of data and incorrect alert activation.
Root Cause: The failure was due to a specific instance of the service processing backend that became unhealthy.
Incident Timeline: 1 hour and 50 minutes - 02/16, 00:15 UTC through 02/16, 02:05 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.
Update: Tuesday, 16 February 2021 02:12 UTC
Root cause has been isolated to a specific instance of the service processing backend that became unhealthy, which was impacting the ingestion pipeline. To address this issue, we restarted the affected instance and retrieved instance data for analysis.
Work Around: None
Next Update: Before 02/16 04:30 UTC
Initial Update: Tuesday, 16 February 2021 01:40 UTC
We are aware of issues within Application Insights and are actively investigating. Some customers in Switzerland West may experience intermittent data gaps of up to 10% of data and incorrect alert activation starting at 2021-02-16 00:15 UTC.
Next Update: Before 02/16 04:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience. -Jeff