Experiencing Data Access issues for Azure Monitor - 09/17 - Resolved

Published Sep 17 2021 02:34 PM
Final Update: Saturday, 18 September 2021 03:51 UTC

We've confirmed that all systems are back to normal with no customer impact as of 09/18, 03:40 UTC. Our logs show that the incident started on 09/17, 19:45 UTC and that, during the 7 hours & 55 minutes it took to resolve the issue, some customers experienced issues querying their data, which can cause delayed or misfired alerts and web test failures.
  • Root Cause: The failure was due to an unhealthy backend Traffic Manager probe.
  • Incident Timeline: 7 hours & 55 minutes - 09/17, 19:45 UTC through 09/18, 03:40 UTC
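As a quick check of the timeline arithmetic above, the following minimal sketch recomputes the incident duration from the stated start and end times. The variable names are illustrative only and are not part of any Azure tooling.

```python
from datetime import datetime, timezone

# Start and end times as reported in the incident timeline (UTC).
start = datetime(2021, 9, 17, 19, 45, tzinfo=timezone.utc)
end = datetime(2021, 9, 18, 3, 40, tzinfo=timezone.utc)

# Subtracting two aware datetimes yields a timedelta.
duration = end - start

# Break the total seconds into whole hours and leftover minutes.
hours, remainder = divmod(int(duration.total_seconds()), 3600)
minutes = remainder // 60

print(f"{hours} hours & {minutes} minutes")  # 7 hours & 55 minutes
```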
We understand that customers rely on Application Insights and Azure Log Analytics as a critical service and apologize for any impact this incident caused.

-Saika

Update: Saturday, 18 September 2021 03:14 UTC

We are continuing to work on mitigation steps. Some customers in the East US and West Europe regions may still experience issues querying their data, which can cause delayed or misfired alerts and web test failures.
  • Workaround: None
  • Next Update: Before 09/18 06:30 UTC
-Saika

Update: Saturday, 18 September 2021 00:24 UTC

Root cause has been isolated to a backend Traffic Manager probe that became unhealthy, which was impacting Application Insights and Azure Log Analytics. To address this issue, we are continuing to work on mitigation steps. Some customers may still experience issues querying their data, which can cause delayed or misfired alerts.
  • Workaround: None
  • Next Update: Before 09/18 03:30 UTC
-Saika

Update: Friday, 17 September 2021 21:31 UTC

We continue to investigate issues within Application Insights and Azure Log Analytics. Root cause is not fully understood at this time. Some customers continue to experience issues querying their data, which can cause delayed or misfired alerts. We are working to establish the start time for the issue; initial findings indicate that the problem began at 09/17 ~07:45 UTC. We currently have no estimate for resolution.
  • Workaround: None
  • Next Update: Before 09/18 01:00 UTC
-Ian

Version history
Last update:
‎Sep 17 2021 08:52 PM
Updated by: