Final Update: Wednesday, 02 October 2019 17:11 UTC

We've confirmed that all systems are back to normal with no customer impact as of 10/2, 16:40 UTC. Our logs show the incident started on 10/2, 01:45 UTC and that during that time customers would have experienced intermittently missing configured metric alerts.
  • Root Cause: The failure was due to a performance bug in a backend system.
  • Incident Duration: 14 hours & 55 minutes
We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.

-Jeff

Update: Wednesday, 02 October 2019 15:23 UTC

Root cause has been isolated to a failure in one of our dependent services that was impacting Metric Alerts. We continue to work on fixing the issue. Some customers in the East US region may continue to experience Metric Alerts either not being delivered or firing as false positives.
  • Next Update: Before 10/02 19:30 UTC
-Madhuri

Update: Wednesday, 02 October 2019 09:22 UTC

We continue to investigate issues within Metric Alerts. Root cause is not fully understood at this time. Some customers in the East US region may continue to experience Metric Alerts either not being delivered or firing as false positives. We are working to establish the start time for the issue; initial findings indicate that the problem began at 10/02 01:45 UTC.
  • Next Update: Before 10/02 15:30 UTC
-Madhuri

Initial Update: Wednesday, 02 October 2019 05:23 UTC

We are aware of issues within Metric Alerts and are actively investigating. Some customers in the East US region may experience Metric Alerts either not being delivered or firing as false positives.
  • Next Update: Before 10/02 09:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Madhuri