Metric Alerts
Experiencing Alerting failure for Metric Alerts - 04/11 - Resolved
Final Update: Sunday, 11 April 2021 11:24 UTC
We've confirmed that all systems are back to normal with no customer impact as of 04/08, 13:45 UTC. Our logs show the incident started on 03/31, 15:45 UTC and that during the 7 days and 22 hours it took to resolve the issue, some customers may have experienced misfired alerts when using Azure Metric Alert Rules on Log Analytics resources in the West Europe region.
Root Cause: We determined that a backend service responsible for processing alerts became unhealthy due to a configuration issue.
Incident Timeline: 7 days & 22 hours - 03/31, 15:45 UTC through 04/08, 13:45 UTC
We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.
-Madhav
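For incidents like this, one way to check whether a fired alert matched real metric behavior is to re-query the underlying workspace metric for the incident window. The sketch below is a minimal, hedged example assuming a recent version of the azure-monitor-query Python package; the workspace resource ID and the "Heartbeat" metric name are illustrative placeholders, not values taken from the incident report.

```python
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient, MetricAggregationType

# Placeholder: full ARM resource ID of a Log Analytics workspace.
WORKSPACE_ID = (
    "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/"
    "Microsoft.OperationalInsights/workspaces/<workspace-name>"
)

credential = DefaultAzureCredential()
client = MetricsQueryClient(credential)

# Re-read the metric values evaluated during the incident window
# (03/31 15:45 UTC - 04/08 13:45 UTC) to see whether a threshold was actually crossed.
start = datetime(2021, 3, 31, 15, 45, tzinfo=timezone.utc)
end = datetime(2021, 4, 8, 13, 45, tzinfo=timezone.utc)

response = client.query_resource(
    WORKSPACE_ID,
    metric_names=["Heartbeat"],          # assumed example metric; use your rule's metric
    timespan=(start, end),
    granularity=timedelta(hours=1),
    aggregations=[MetricAggregationType.AVERAGE],
)

for metric in response.metrics:
    for series in metric.timeseries:
        for point in series.data:
            print(metric.name, point.timestamp, point.average)
```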
Experiencing Data Access issue in Azure Portal for Many Data Types - 04/01 - Resolved

Final Update: Friday, 02 April 2021 05:47 UTC
We've confirmed that all systems are back to normal with no customer impact as of 04/02, 04:30 UTC. Our logs show the incident started on 04/01, 21:20 UTC and that during the 7 hours & 10 minutes it took to resolve the issue, some customers may have experienced data access issues, missed or delayed Azure alerts, and data ingestion latency.
Root Cause: The failure was due to a DNS outage.
Incident Timeline: 7 hours & 10 minutes - 04/01, 21:20 UTC through 04/02, 04:30 UTC
We understand that customers rely on the Azure Monitor service as a critical service and apologize for any impact this incident caused.
-Harshita

Update: Friday, 02 April 2021 03:10 UTC
We continue to see residual effects in Azure Monitor services due to the DNS outage. Some customers in East US 2 may still experience data access issues and missed/delayed Azure alerts. Customers in the Central US region may still experience data ingestion latency.
Work Around: None
Next Update: Before 04/02 05:30 UTC
-Anupama

Update: Friday, 02 April 2021 01:34 UTC
We continue to see residual effects in Azure Monitor services due to the DNS outage. Some customers in East US 2 may still experience missed/delayed Azure alerts. The issue with data access and data ingestion has recovered and services are healthy in East US. We currently have no estimate for resolution.
Work Around: None
Next Update: Before 04/02 03:00 UTC
-Anupama

Update: Thursday, 01 April 2021 23:46 UTC
We continue to have issues within Azure Monitor services due to the DNS outage. Some customers in East US and East US 2 continue to experience issues accessing data, issues with data ingestion, and missed/delayed Azure alerts. We currently have no estimate for resolution.
Work Around: None
Next Update: Before 04/02 02:00 UTC
-Anupama

Initial Update: Thursday, 01 April 2021 22:29 UTC
We are aware of issues within Azure monitoring services due to a DNS outage and we are actively investigating. Some customers may experience issues accessing data, issues with data ingestion, and missed/delayed Azure alerts.
Work Around: None
Next Update: Before 04/02 00:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Anupama
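When an incident mentions data ingestion latency, one way to estimate how far behind a workspace is running is to compare each record's ingestion time with its generation time. The sketch below is a minimal, hypothetical example using the azure-monitor-query Python package against the Heartbeat table; the workspace GUID is a placeholder, and the table only has data if agents report heartbeats to that workspace.

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

credential = DefaultAzureCredential()
client = LogsQueryClient(credential)

# KQL: compare the time a record was ingested with the time it was generated
# to approximate end-to-end ingestion latency over the last hour.
QUERY = """
Heartbeat
| where TimeGenerated > ago(1h)
| extend LatencyMinutes = datetime_diff('minute', ingestion_time(), TimeGenerated)
| summarize AvgLatency = avg(LatencyMinutes), MaxLatency = max(LatencyMinutes)
"""

response = client.query_workspace(
    workspace_id="<log-analytics-workspace-guid>",  # placeholder
    query=QUERY,
    timespan=timedelta(hours=1),
)

for table in response.tables:
    for row in table.rows:
        print(row)
```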
Experiencing False Alerts for Metric Alerts on Log Analytics workspaces - 03/10 - Resolved

Final Update: Wednesday, 10 March 2021 19:38 UTC
We've confirmed that all systems are back to normal with no customer impact as of 03/10, 18:00 UTC. Our logs show that the incident started on 03/08, 20:00 UTC and that during the 1 day & 22 hours it took to resolve the issue, some customers may have experienced missed or misfired alerts when using Azure Metric Alert rules on Log Analytics resources in the West Europe region.
Root Cause: The failure was due to a misconfiguration of a backend dependency.
Incident Timeline: 1 day & 22 hours - 03/08, 20:00 UTC through 03/10, 18:00 UTC
We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.
-Saika

Update: Wednesday, 10 March 2021 16:56 UTC
The root cause has been isolated to a misconfiguration of a backend dependency. Engineers are currently working hard to resolve this issue. Some customers may experience missed or misfired alerts when using Azure Metric Alert Rules on Log Analytics resources in the West Europe region.
Work Around: None
Next Update: Before 03/10 20:00 UTC
-Saika

Initial Update: Wednesday, 10 March 2021 13:44 UTC
We are aware of issues within Metric Alerts on Log Analytics workspaces and are actively investigating. Some customers may experience missing or misfired alerts when using Azure Metric Alert Rules on Log Analytics resources.
Work Around: None
Next Update: Before 03/10 16:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Soumyajeet
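Incidents like this one affect only metric alert rules scoped to Log Analytics workspaces, so it can help to know which of your rules fall into that group. The sketch below is a minimal example under that assumption, using the azure-mgmt-monitor Python package; the subscription ID is a placeholder.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient

credential = DefaultAzureCredential()
client = MonitorManagementClient(credential, "<subscription-id>")  # placeholder

# List every metric alert rule in the subscription and keep only the rules
# scoped to Log Analytics workspaces, i.e. the rules this kind of incident could affect.
for rule in client.metric_alerts.list_by_subscription():
    scopes = rule.scopes or []
    if any("/microsoft.operationalinsights/workspaces/" in scope.lower() for scope in scopes):
        print(rule.name, "enabled =", rule.enabled, scopes)
```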
Experiencing Data Access issue in Azure Portal for Many Data Types - 02/15 - Resolved

Final Update: Monday, 15 February 2021 14:04 UTC
We've confirmed that all systems are back to normal with no customer impact as of 02/15, 14:02 UTC. Our logs show the incident started on 02/15, 11:30 UTC and that during the 2 hours & 32 minutes it took to resolve the issue, some customers may have seen errors, but data access, ingestion, and alerting were not impacted.
Root Cause: The failure was due to configuration changes in one of our dependent services.
Incident Timeline: 2 hours & 32 minutes - 02/15, 11:30 UTC through 02/15, 14:02 UTC
We understand that customers rely on the Azure Monitor service as a critical service and apologize for any impact this incident caused.
-Vamshi

Initial Update: Monday, 15 February 2021 13:10 UTC
We are aware of issues within the Azure Monitor service and are actively investigating. Some customers may see errors, but data access, ingestion, and alerting are not impacted. The errors are caused by configuration changes and do not impact any scenario. We are working to fix the cause of the errors and will provide an update in 2 hours.
Work Around: None
Next Update: Before 02/15 17:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Mohini
Experiencing Data Latency and Data Access issues for Azure Monitor

Final Update: Friday, 12 February 2021 06:05 UTC
We've confirmed that all systems are back to normal with no customer impact as of 02/12, 05:24 UTC. Our logs show the incident started on 02/12, 01:52 UTC and that during the 3 hours & 32 minutes it took to resolve the issue, some customers may have experienced data latency, data access issues, and delayed or misfired alerts in the West US region.
Root Cause: The failure was due to one of our dependent backend services.
Incident Timeline: 3 hours & 32 minutes - 02/12, 01:52 UTC through 02/12, 05:24 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.
-Deepika

Initial Update: Friday, 12 February 2021 01:35 UTC
We are aware of issues within Azure Monitor services and are actively investigating. Some customers in the West US region may experience data latency, data access issues, and delayed or misfired alerts.
Next Update: Before 02/12 06:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Jayadev
Experiencing errors when accessing alerts in Azure Monitor - 01/27 - Resolved

Final Update: Wednesday, 27 January 2021 10:28 UTC
We've confirmed that all systems are back to normal with no customer impact as of 01/27, 10:00 UTC. Our logs show the incident started on 01/27, 09:15 UTC and that during the 45 minutes it took to resolve the issue, some customers may have received errors when accessing alerts. Alert notifications were not impacted.
Root Cause: We determined that a recent deployment task caused instances of the backend service to become unhealthy, resulting in these errors.
Incident Timeline: 45 minutes - 01/27, 09:15 UTC through 01/27, 10:00 UTC
We understand that customers rely on Azure Monitor as a critical service and apologize for any impact this incident caused.
-Anmol
Azure Alerts notifications loss and missing alerts in Azure portal - 01/21 - Resolved

Final Update: Thursday, 21 January 2021 10:55 UTC
We've confirmed that all systems are back to normal with no customer impact as of 01/21, 10:25 UTC. Our logs show the incident started on 01/21, 08:45 UTC and that during the 1 hour & 40 minutes it took to resolve the issue, some customers may have experienced missed alerts across regions.
Root Cause: We determined that a backend service responsible for processing alerts became unhealthy after a recent configuration change following a deployment.
Incident Timeline: 1 hour & 40 minutes - 01/21, 08:45 UTC through 01/21, 10:25 UTC
We understand that customers rely on Azure Alerts as a critical service and apologize for any impact this incident caused.
-Deepika
Experiencing Alerting failure for Metric Alerts - 01/18 - Resolved

Final Update: Monday, 18 January 2021 11:30 UTC
We've confirmed that all systems are back to normal with no customer impact as of 01/18, 11:00 UTC. Our logs show the incident started on 01/18, 10:00 UTC and that during the 1 hour it took to resolve the issue, some customers may have experienced missing or misfired alerts in the East US region while using Azure Metric Alert Rules.
Root Cause: The failure was due to a recent deployment task.
Incident Timeline: 1 hour - 01/18, 10:00 UTC through 01/18, 11:00 UTC
We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.
-Deepika
Experiencing Alerting failure for Metric Alerts - 01/17 - Resolved

Final Update: Sunday, 17 January 2021 12:03 UTC
We've confirmed that all systems are back to normal with no customer impact as of 01/17, 11:00 UTC. Our logs show the incident started on 01/17, 10:00 UTC and that during the 1 hour it took to resolve the issue, some customers may have experienced missing or misfired alerts when using Azure Metric Alert Rules.
Root Cause: The failure was due to a recent deployment task.
Incident Timeline: 1 hour - 01/17, 10:00 UTC through 01/17, 11:00 UTC
We understand that customers rely on Alerts as a critical service and apologize for any impact this incident caused.
-Soumyajeet
Experiencing Alerting failure for Metric Alerts - 12/15 - Resolved

Final Update: Wednesday, 16 December 2020 07:53 UTC
We've confirmed that all systems are back to normal with no customer impact as of 12/16, 07:35 UTC. Our logs show the incident started on 12/01, 23:00 UTC and that during the 14 days, 8 hours & 35 minutes it took to resolve the issue, some customers may have experienced duplicate notifications for unhealthy metric alerts that contain non-ASCII characters in the alert name.
Root Cause: We identified that the issue was caused by a recent deployment that introduced a configuration incompatibility between dependent services.
Incident Timeline: 14 days, 8 hours & 35 minutes - 12/01, 23:00 UTC through 12/16, 07:35 UTC
We understand that customers rely on Metric Alerts as a critical service and apologize for any impact this incident caused.
-Vamshi

Update: Wednesday, 16 December 2020 06:27 UTC
The root cause has been isolated to a new feature rollout that was causing duplicate notifications when a resource was unhealthy for some customers in the Azure portal. Specifically, any metric alert whose name contains non-ASCII characters sees this impact. To address the issue, we've started mitigation to disable the feature. Some customers may still continue to see notifications, and we estimate 4 hours before all is addressed.
Mitigation Steps: Redeployment to disable the feature causing the duplicate alert notifications is in progress.
Next Update: Before 12/16 10:30 UTC
-Vamshi

Update: Wednesday, 16 December 2020 02:45 UTC
The root cause has been isolated to a new feature rollout that was causing duplicate notifications when a resource was unhealthy for some customers in the Azure portal. Specifically, any metric alert whose name contains non-ASCII characters sees this impact. To address the issue, we've started mitigation to disable the feature. Some customers may still continue to see notifications, and we estimate 4 hours before all is addressed.
Mitigation Steps: Redeployment with the new feature disabled is in progress.
Next Update: Before 12/16 07:00 UTC
-Vincent

Update: Tuesday, 15 December 2020 22:17 UTC
The root cause has been isolated to a new feature rollout that was causing duplicate notifications when a resource was unhealthy for some customers in the Azure portal. Specifically, any metric alert whose name contains non-ASCII characters sees this impact. To address the issue, we've started mitigation to disable the feature. Some customers may continue to see notifications, and we estimate 4 hours before all is addressed.
Root Cause: New feature rollout that was causing duplicate notifications.
Incident Timeline: 13 days & 23 hours - 12/01, 23:00 UTC through 12/15, 22:00 UTC (ongoing)
Next Update: Before 12/16 01:30 UTC
-Vincent
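Because this incident only affected metric alert rules with non-ASCII characters in their names, a quick way to see whether any of your rules were in scope is to scan the rule names. The sketch below is a minimal, hypothetical example using the azure-mgmt-monitor Python package; the subscription ID is a placeholder.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient

credential = DefaultAzureCredential()
client = MonitorManagementClient(credential, "<subscription-id>")  # placeholder

# Flag metric alert rules whose names contain non-ASCII characters, since only
# those rules could have received the duplicate notifications described above.
for rule in client.metric_alerts.list_by_subscription():
    if any(ord(ch) > 127 for ch in rule.name):
        print(f"Potentially affected rule: {rule.name!r}")
```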