Final Update: Wednesday, 01 September 2021 16:40 UTC
We've confirmed that all systems are back to normal with no customer impact as of 2021-09-01 18:58 UTC. Our logs show the incident started on 2021-08-29 13:22 UTC and that during the two days, 5.5 hours that it took to resolve the issue 100% of customers who queried metrics from the Microsoft.Network/privateEndpoints and Microsoft.Network/privateLinkService resource providers will have returned 400 error codes and not been able to query their metrics.
Root Cause: The failure was due to new feature rollout that contained an error.
Lessons Learned: We will be applying greater monitoring in lower level environments to catch errors like these before they reach production environments.
Incident Timeline: 2 Days, 5 Hours & 36 minutes - 2021-08-29 13:22 UTC through 2021-09-01 18:58 UTC
We understand that customers rely on Azure Monitor and Azure Metrics as critical services and apologize for any impact this incident caused.
Update: Tuesday, 31 August 2021 02:26 UTC
Root cause has been isolated to a feature update which inadvertently caused metrics for the resource providers Microsoft.Network/privateEndpoints and
Microsoft.Network/privateLinkService to return 400 errors when queried. To address this issue we are rolling out a hotfix. The hotfix rollout process across all regions will take approximately 18 hours. Some customers will continue to experience these issues until all regions have have gotten the hotfix applied.