We've confirmed that all systems are back to normal with no customer impact as of 7/24, 11:48 UTC. Our logs show that the incident started on 7/24, 09:39 UTC, and that during the 2 hours and 9 minutes it took to resolve the issue, 30% of customers might have experienced Metric Alerts not evaluating.
Root Cause: The failure was due to an issue in one of our dependent services.
Incident Timeline: 2 hours and 9 minutes - 7/24, 09:39 UTC through 7/24, 11:48 UTC
We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.