Experiencing Data Access Issue in Azure portal for Log Analytics - 05/26 - Resolved

Published May 26 2021 08:44 AM
Final Update: Wednesday, 26 May 2021 17:02 UTC

We've confirmed that all systems are back to normal with no customer impact as of 05/26, 17:03 UTC. Our logs show the incident started on 05/26, 11:14 UTC, and that during the 5+ hours it took to resolve the issue, 100% of customers in the UK South region experienced ingestion latency and misfired alerts.
  • Root Cause: The failure was due to an incorrect configuration of a backend service rolled out in a new deployment.
  • Incident Timeline: 5 Hours, 49 minutes - 05/26, 11:14 UTC through 05/26, 17:03 UTC
We understand that customers rely on Azure Log Analytics as a critical service and apologize for any impact this incident caused.

-Jack

Update: Wednesday, 26 May 2021 15:36 UTC

Root cause has been isolated to a capacity issue which caused data latency. To address this issue, we increased capacity. Some customers may continue to experience intermittent data latency and incorrect alert activation for resources in the UK South region.
  • Work Around: None
  • Next Update: Before 05/26 18:00 UTC
-Ramesh
