Experiencing Alerting failure issue in Azure Portal for Many Data Types - 03/28 - Resolved
Final Update: Thursday, 28 March 2019 17:05 UTC

We've confirmed that all systems are back to normal with no customer impact as of 03/28, 16:30 UTC. Our telemetry shows the incident started on 03/28, 13:15 UTC and that during the 3 hours 15 minutes it took to resolve the issue, all customers using classic alerts under Application Insights would not have seen alert state changes. As a result, alerts would not have fired when conditions became unhealthy, and active alerts would not have resolved when conditions returned to healthy.
  • Root Cause: The failure was due to an incorrect value in the configuration of one of the dependent services in the alerting pipeline. We are working internally to investigate the final root cause of the issue.
  • Incident Timeline: 3 hours 15 minutes - 03/28, 13:15 UTC through 03/28, 16:30 UTC
We understand that customers rely on Application Insights as a critical service, and we apologize for any impact this incident caused.

-Anupama