Experiencing Alerting failure issue in Azure Portal for Many Data Types - 08/20 - Mitigated

Published Aug 20 2020 05:05 PM
Final Update: Thursday, 20 August 2020 23:46 UTC

We've confirmed that all systems are back to normal with no customer impact as of 8/14/20, 22:12 UTC. Our logs show the incident started on 9/23/19, 00:00 UTC, and that during the 326 days it took to resolve the issue, 138 customers using metric alerts based on custom metrics in the Brazil South region may have seen incorrect alert activations and/or failures.
  • Root Cause: The failure was due to a backend storage configuration.
  • Incident Timeline: 326 days, 22 hours, 12 minutes - 9/23/2019, 00:00 UTC through 8/14/2020, 22:12 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Jeff
