Final Update: Monday, 29 July 2019 15:06 UTC

We've confirmed that all systems are back to normal with no customer impact as of 7/29, 13:58 UTC. Our logs show the incident started on 7/29, 07:10 UTC, and that during the roughly 7 hours it took to resolve the issue, a very small percentage of customers whose data is hosted on the impacted backend cluster may have experienced data access issues, and log alerts may not have worked as expected (a quick verification query is sketched after the list below).
  • Root Cause: Initial RCA suggests that a combination of a recent deployment and very expensive queries degraded the performance of the cluster.
  • Lessons Learned: We have collected the required telemetry, including performance dumps, and are reviewing it to establish the final RCA and prevent recurrence.
  • Incident Timeline: 6 hours & 48 minutes - 7/29, 07:10 UTC through 7/29, 13:58 UTC
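As a quick sanity check, customers can confirm that recent data is queryable in their workspace. The following is a minimal KQL sketch, assuming the workspace collects the standard Heartbeat table; any regularly populated table with a TimeGenerated column works equally well:

    // Illustrative check only: confirm the workspace returns recent data.
    // Assumes the standard Heartbeat table is populated in this workspace.
    Heartbeat
    | where TimeGenerated > ago(30m)
    | summarize LastHeartbeat = max(TimeGenerated) by Computer
    | order by LastHeartbeat desc

If this returns rows with timestamps within the last few minutes, the query path for the workspace is behaving normally.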
We understand that customers rely on Azure Log Analytics as a critical service and apologize for any impact this incident caused.

-Durga

Update: Monday, 29 July 2019 11:10 UTC

We continue to investigate access issues within Log Analytics. We have identified that one of the backend service clusters in the West Europe region is experiencing performance degradation, causing these issues. Some customers continue to experience data access and log alert issues when using Log Analytics in the West Europe region.
  • Workaround: None.
  • Next Update: Before 07/29 15:30 UTC
-Madhuri Poloju

Initial Update: Monday, 29 July 2019 09:16 UTC

We are aware of issues within Log Analytics and are actively investigating. Some customers may experience data access issues when using Log Analytics in the West Europe region.
  • Workaround: None.
  • Next Update: Before 07/29 11:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Madhuri Poloju