Experiencing Delays in Metric Ingestion and Metric Alert Evaluation - 05/17 - Resolved

Published May 17 2022 01:21 PM
Final Update: Wednesday, 18 May 2022 01:31 UTC

We've confirmed that all systems are back to normal with no customer impact as of 05/18, 00:15 UTC. Our logs show the incident started on 05/17, 20:31 UTC, and that during the 3 hours and 44 minutes it took to resolve the issue, customers experienced delayed metric ingestion. Metric Alert rule evaluation would also have been paused during this time to avoid false alerts.

  • Root Cause: During the window of impact, a sudden spike in requests was observed and an operational threshold was reached, causing intermittent metric loss.
  • Incident Timeline: 3 hours & 44 minutes - 05/17, 20:31 UTC through 05/18, 00:15 UTC
We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.

-Ian

Initial Update: Tuesday, 17 May 2022 23:07 UTC

We are aware of issues within Metric Dashboards and Metric Alerts and are actively investigating. Some customers may experience delayed metric ingestion. Metric Alert evaluation may also be paused to avoid false alerts during this time. 

  • Workaround: None

  • Next Update: Before 05/18 01:30 UTC

We are working hard to resolve this issue and apologize for any inconvenience.
-Ian

Version history
Last update: May 17 2022 06:40 PM