Final Update: Thursday, 21 May 2020 01:30 UTC

We've confirmed that all systems are back to normal, with no customer impact, as of 05/21, 01:00 UTC. Our logs show the incident started on 05/20, 20:15 UTC, and that during the 4 hours and 45 minutes it took to resolve the issue, customers experienced failures when trying to create new rules or modify existing rules in the East US and South Central US regions.
  • Root Cause: The failure was due to a configuration mismatch that induced conflicts in one of our dependent services.
  • Incident Timeline: 4 hours & 45 minutes - 05/20, 20:15 UTC through 05/21, 01:00 UTC
We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.

-chandar

Update: Thursday, 21 May 2020 00:27 UTC

The root cause has been isolated to a dependent service that was impacting the creation and update of Metric Alert rules. To address this issue, we have updated the missing metadata. Some customers may still experience failures.
  • Workaround: None
  • Next Update: Before 05/21 02:30 UTC
-chandar

Initial Update: Wednesday, 20 May 2020 22:19 UTC

We are actively investigating an issue with Metric Alerts. Some customers using Azure Monitor may receive failure notifications when performing service management operations, such as create and update, on Azure Metric Alert Rules (an illustrative sketch of one such operation appears at the end of this post).
  • Workaround: None
  • Next Update: Before 05/21 00:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-chandar
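
For context, the "create" and "update" operations referenced above are the standard Azure Monitor rule-management calls. The sketch below is not part of the original incident report; it is a minimal illustration, assuming the track-2 Python SDK (azure-mgmt-monitor) and placeholder resource names, of the kind of metric alert rule creation that was returning failures during this incident.

# Minimal sketch: create (or update) a metric alert rule with the Azure
# Monitor SDK for Python. All IDs, names, and thresholds are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient
from azure.mgmt.monitor.models import (
    MetricAlertResource,
    MetricAlertSingleResourceMultipleMetricCriteria,
    MetricCriteria,
)

subscription_id = "<subscription-id>"            # placeholder
resource_group = "<resource-group>"              # placeholder
target_resource_id = "<resource-id-of-target>"   # placeholder, e.g. a VM

client = MonitorManagementClient(DefaultAzureCredential(), subscription_id)

# A single criterion: average "Percentage CPU" above 80 over the window.
criteria = MetricAlertSingleResourceMultipleMetricCriteria(
    all_of=[
        MetricCriteria(
            name="HighCpu",
            metric_name="Percentage CPU",
            time_aggregation="Average",
            operator="GreaterThan",
            threshold=80,
        )
    ]
)

# create_or_update is the service management operation that was failing for
# some customers in East US and South Central US during the incident window.
client.metric_alerts.create_or_update(
    resource_group_name=resource_group,
    rule_name="high-cpu-alert",
    parameters=MetricAlertResource(
        location="global",
        description="Alert when average CPU exceeds 80%",
        severity=3,
        enabled=True,
        scopes=[target_resource_id],
        evaluation_frequency="PT1M",
        window_size="PT5M",
        criteria=criteria,
    ),
)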