Experiencing Alert rule management failure for Metric Alerts - 04/16 - Resolved

Former Employee

Apr 16, 2020

Final Update: Thursday, 16 April 2020 19:42 UTC

We've confirmed that all systems are back to normal with no customer impact as of 4/16, 17:40 UTC. Our logs show the incident started on 4/16, 11:14 UTC, and that during the ~6 hours 20 minutes it took to resolve the issue, some customers experienced alert rule management failures for metric alerts on Redis cache metrics.

Root Cause: The failure was due to one of the backend services using an incorrect configuration.
Incident Timeline: 6 hours 26 minutes - 4/16, 11:14 UTC through 4/16, 17:40 UTC

We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.

-Anupama

Updated Apr 16, 2020

Version 2.0

Metric Alerts

Azure-Monitor-Team

