Log Alerts
- Experiencing Issues for Azure Monitor services in West Europe - 11/07 - Resolved

Final Update: Thursday, 07 November 2019 12:29 UTC
We've confirmed that all systems are back to normal with no customer impact as of 11/07, 12:50 UTC. Our logs show the incident started on 11/06, 21:10 UTC and that during the 15 hours and 40 minutes it took to resolve the issue, some customers using Azure Monitor services in West Europe may have experienced errors while querying and/or ingesting data, along with alert failures or latent alerts. Customers using Service Map in West Europe may also have seen ingestion delays and latency.
Root Cause: The failure was due to an Azure Storage outage in the West Europe region.
Incident Timeline: 15 hours & 40 minutes - 11/06, 21:10 UTC through 11/07, 12:50 UTC
We understand that customers rely on Azure Log Analytics as a critical service and apologize for any impact this incident caused.
-Mohini

Update: Thursday, 07 November 2019 08:20 UTC
We continue to investigate issues within Azure Monitor services. This issue started on 11/06/2019 21:10 UTC and is caused by a storage system we depend on. Our team has been investigating this with the partner Azure Storage team, but we have not identified a root cause yet. Customers using Azure Monitor services in West Europe may experience errors while querying and/or ingesting data, along with alert failures or latent alerts. Customers using Service Map in West Europe may also experience ingestion delays and latency. We will provide an update as we learn more.
Work Around: None
Next Update: Before 11/07 12:30 UTC
-Mohini

Update: Thursday, 07 November 2019 04:55 UTC
We continue to investigate issues within Azure Monitor services. This issue started on 11/06/2019 21:10 UTC and is caused by a storage system we depend on. Our team has been investigating this with the partner Azure Storage team, but we have not identified a root cause yet. Customers using Azure Monitor services in West Europe may experience errors while querying and/or ingesting data, along with alert failures or latent alerts. Customers using Service Map in West Europe may also experience ingestion delays and latency. We will provide an update as we learn more.
Work Around: None
Next Update: Before 11/07 08:00 UTC
-Mohini

Initial Update: Thursday, 07 November 2019 00:49 UTC
We are aware of issues within Log Analytics and are actively investigating. Some customers may experience data access issues for Log Analytics, as well as Log alerts not being triggered as expected, in the West Europe region.
Work Around: None
Next Update: Before 11/07 03:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Sindhu
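As a side note for affected customers, the sketch below is one way (not part of the advisory above) to check query availability and rough ingestion latency for a Log Analytics workspace from Python. It assumes the azure-identity and azure-monitor-query packages and a workspace that receives Heartbeat records; the workspace ID is a placeholder.

```python
# A minimal sketch, assuming the azure-identity and azure-monitor-query packages,
# of checking query availability and rough ingestion latency for a workspace.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient, LogsQueryStatus

WORKSPACE_ID = "<log-analytics-workspace-id>"  # placeholder

credential = DefaultAzureCredential()
client = LogsQueryClient(credential)

# ingestion_time() is when a record became available for queries; the gap from
# TimeGenerated is a rough per-record ingestion latency. Assumes the workspace
# receives Heartbeat records; substitute any table you actually ingest.
query = """
Heartbeat
| where TimeGenerated > ago(1h)
| summarize records = count(),
            avg_latency_seconds = avg(datetime_diff('second', ingestion_time(), TimeGenerated))
"""

response = client.query_workspace(WORKSPACE_ID, query, timespan=timedelta(hours=1))

if response.status == LogsQueryStatus.SUCCESS:
    for table in response.tables:
        for row in table.rows:
            print(row)
else:
    # Errors or partial results here are the kind of query failures described above.
    print("Query did not complete successfully:", response.status)
```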
- Experiencing Latency, Data Gap and Alerting failure for Azure Monitoring - 07/18 - Resolved

Final Update: Saturday, 18 July 2020 15:37 UTC
We've confirmed that all systems are back to normal with no customer impact as of 07/18, 11:40 UTC. Our logs show the incident started on 07/18, 07:50 UTC and that during the 3 hours and 50 minutes it took to resolve the issue, some customers in multiple regions may have experienced data access issues, data latency, data loss, incorrect alert activation, and missed or delayed alerts, and alerts created during the impact window may have appeared in the Azure portal with some delay.
Root Cause: The failure was due to an issue in one of our dependent services.
Incident Timeline: 3 hours & 50 minutes - 07/18, 07:50 UTC through 07/18, 11:40 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.
-Anmol

Update: Saturday, 18 July 2020 11:17 UTC
We continue to investigate issues within Azure Monitoring services. Some customers in multiple regions continue to experience data access issues, data latency, data loss, incorrect alert activation, and missed or delayed alerts, and alerts created during the impact window may not be available to view in the Azure portal. We are working to establish the start time for the issue; initial findings indicate that the problem began at 07/18 ~07:58 UTC. We currently have no estimate for resolution.
Work Around: None
Next Update: Before 07/18 14:30 UTC
-Anmol

Initial Update: Saturday, 18 July 2020 08:58 UTC
We are aware of issues within Application Insights and Log Analytics and are actively investigating. Some customers may experience data access issues in the Azure portal, incorrect alert activation, latency, and data loss in multiple regions.
Work Around: None
Next Update: Before 07/18 11:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Madhuri
- Experiencing Data Access Issue in Azure portal for Log Analytics - 10/17 - Resolved

Final Update: Thursday, 17 October 2019 12:12 UTC
We've confirmed that all systems are back to normal with no customer impact as of 10/17, 11:00 UTC. Our logs show the incident started on 10/17, 10:05 UTC and that during the 55 minutes it took to resolve the issue, ~7% of customers might have experienced issues with data access in the West Europe region for Log Analytics, as well as Log Alerts not being triggered as expected.
Root Cause: The failure was due to an issue in one of our backend services.
Incident Timeline: 55 minutes - 10/17, 10:05 UTC through 10/17, 11:00 UTC
We understand that customers rely on Azure Log Analytics as a critical service and apologize for any impact this incident caused.
-Anmol

Initial Update: Thursday, 17 October 2019 11:10 UTC
We are aware of issues within Log Analytics and are actively investigating. Some customers may experience alerting failures and data access issues for Azure Log Analytics in the West Europe region.
Work Around: None
Next Update: Before 10/17 13:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Anmol
- Experiencing latency for notifications for Azure Monitor Alerts - 01/25 - Resolved

Final Update: Saturday, 25 January 2020 12:53 UTC
We've confirmed that all systems are back to normal with no customer impact as of 01/26, 04:50 UTC. Our logs show the incident started on 01/25, 05:00 UTC and that during the 23 hours and 50 minutes it took to resolve the issue, most customers may have experienced missed or delayed alert notifications for the Azure Monitor service.
Root Cause: The failure was due to one of our internal services being impacted by a SQL DB outage in the US Gov regions.
Incident Timeline: 23 hours & 50 minutes - 01/25, 05:00 UTC through 01/26, 04:50 UTC
We understand that customers rely on Azure Monitor as a critical service and apologize for any impact this incident caused.
-Anmol

Update: Saturday, 25 January 2020 10:01 UTC
This issue started on 01/25 05:00 UTC and is caused by a database system we depend on. Our team has been investigating this with the partner Azure database team, but we have not identified a root cause yet. Customers using Azure Monitor services in the US Gov region continue to experience missed or delayed alert notifications for all types of notifications. We currently have no estimate for resolution.
Work Around: Triggered alerts can be viewed in the Azure portal alerts page (link). Please use the alerts page to actively monitor the health of resources until the issue is resolved.
Next Update: Before 01/25 16:30 UTC
-Anmol

Initial Update: Saturday, 25 January 2020 07:08 UTC
We are aware of issues within Log Search Alerts and are actively investigating. Some customers may experience latency for webhook and email notifications.
Work Around: None
Next Update: Before 01/25 09:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Anmol
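As a hedged illustration of the workaround above (checking triggered alerts outside the notification pipeline), the sketch below lists currently fired alerts through Azure Resource Graph from Python. It assumes the azure-identity and azure-mgmt-resourcegraph packages; the subscription ID is a placeholder, and the exact shape of the returned rows depends on the query result format.

```python
# A minimal sketch, assuming the azure-identity and azure-mgmt-resourcegraph packages,
# of listing alerts that are currently in the Fired state via Azure Resource Graph.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder

credential = DefaultAzureCredential()
client = ResourceGraphClient(credential)

# alertsmanagementresources exposes Azure Monitor alert instances to Resource Graph.
request = QueryRequest(
    subscriptions=[SUBSCRIPTION_ID],
    query=(
        "alertsmanagementresources "
        "| where properties.essentials.monitorCondition == 'Fired' "
        "| project name, severity = properties.essentials.severity, "
        "target = properties.essentials.targetResource"
    ),
)

result = client.resources(request)
# The payload shape depends on the result format requested; print it as returned.
print(result.data)
```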
- Experiencing issues in Azure Portal for Many Data Types in UK South - 09/14 - Resolved

Final Update: Tuesday, 15 September 2020 01:42 UTC
We've confirmed that all systems are back to normal with no customer impact as of 9/15, 00:41 UTC. Our logs show the incident started on 9/14, 13:54 UTC and that during the 10 hours and 47 minutes it took to resolve the issue, customers experienced data loss and data latency, which may have resulted in false and missed alerts.
Root Cause: The failure was due to a cooling failure at our data center that resulted in shutting down portions of the data center.
Incident Timeline: 10 hours & 47 minutes - 9/14, 13:54 UTC through 9/15, 00:41 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.
-Ian

Update: Tuesday, 15 September 2020 01:19 UTC
Root cause has been isolated to cooling failures and subsequent shutdowns in our data center, which were impacting storage and our ability to access and insert data. Our infrastructure has been brought back online, and we are making progress with bringing the final storage devices back online. Customers should start to see signs of recovery soon.
Work Around: None
Next Update: Before 09/15 05:30 UTC
-Ian

Update: Monday, 14 September 2020 20:14 UTC
Starting at approximately 14:00 UTC on 14 Sep 2020, a single zone in UK South experienced a cooling failure. As a result, Storage, Networking and Compute resources were shut down as part of our automated processes to preserve the equipment and prevent damage, and the Azure Monitoring services have experienced missed or latent data, which is causing false and missed alerts. Mitigation for the cooling failure is currently in progress. An estimated time for resolution of this issue is still unknown. We apologize for the inconvenience.
Work Around: None
Next Update: Before 09/15 00:30 UTC
-Ian

Update: Monday, 14 September 2020 16:28 UTC
We continue to investigate issues within Azure Monitoring services. Root cause is related to an ongoing storage account issue. Some customers continue to experience missed or latent data, which is causing false and missed alerts. We are working to establish the start time for the issue; initial findings indicate that the problem began at 9/14 13:35 UTC. We currently have no estimate for resolution.
Work Around: None
Next Update: Before 09/14 19:30 UTC
-Ian

Initial Update: Monday, 14 September 2020 14:44 UTC
We are aware of issues within Azure Monitoring services and are actively investigating. There is a storage outage in UK South which has caused multiple services to be impacted.
Work Around: None
Next Update: Before 09/14 19:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Mohini
- Alert Rule for Service Health Events missing in Azure Portal - 11/20

Azure Monitor customers may see issues while viewing alert status for Service Health events in the Alerts blade in the Azure portal. This issue is limited to a subset of customers who have configured an alert rule for Service Health events and whose subscription is impacted by a given platform or service outage. When a subscription is impacted, we update the portal so that the customer is aware of the situation and can view this information under the Activity Log, Service Health and Alerts blades. Due to a bug, this alert information is not available in the Alerts and Service Health blades; however, the customer still receives email notifications and can view the underlying events in the Activity Log or Service Health blade. We have already identified a fix and are in the process of final validation so that we can roll it out to all affected regions, but due to the nature of the bug and its dependency on many services, we expect this to be resolved by Dec 15 across all regions. We apologize for any inconvenience it might have caused.
-Ian Cairns
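For context, the sketch below shows one common way an alert rule for Service Health events of the kind described above can be defined, by sending an activity log alert rule to the ARM REST API from Python. The api-version and all resource identifiers are placeholders or assumptions, not values from the advisory; verify them against the current Microsoft.Insights/activityLogAlerts documentation.

```python
# A minimal sketch, assuming the requests and azure-identity packages, of defining a
# Service Health activity log alert rule via the ARM REST API. The api-version and all
# resource identifiers below are placeholders/assumptions, not values from the advisory.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION_ID = "<subscription-id>"            # placeholder
RESOURCE_GROUP = "<resource-group>"              # placeholder
RULE_NAME = "service-health-alert"               # placeholder name
ACTION_GROUP_ID = "<action-group-resource-id>"   # placeholder
API_VERSION = "2020-10-01"                       # assumed api-version; verify in current docs

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    f"https://management.azure.com/subscriptions/{SUBSCRIPTION_ID}"
    f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.Insights"
    f"/activityLogAlerts/{RULE_NAME}?api-version={API_VERSION}"
)

body = {
    "location": "Global",
    "properties": {
        "enabled": True,
        # Scope the rule to the whole subscription so any impacting event is caught.
        "scopes": [f"/subscriptions/{SUBSCRIPTION_ID}"],
        # Fire whenever an activity log event in the ServiceHealth category arrives.
        "condition": {"allOf": [{"field": "category", "equals": "ServiceHealth"}]},
        "actions": {"actionGroups": [{"actionGroupId": ACTION_GROUP_ID}]},
    },
}

response = requests.put(url, json=body, headers={"Authorization": f"Bearer {token}"})
response.raise_for_status()
print("Created/updated rule:", response.json()["id"])
```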
- Experiencing Alerting failure for alerts and action rules - 08/29 - Mitigated

Final Update: Saturday, 29 August 2020 14:15 UTC
We've confirmed that all systems are back to normal with no customer impact as of 08/29, 14:05 UTC. Our logs show the incident started on 08/29, 09:15 UTC and that during the 4 hours and 50 minutes it took to resolve the issue, some customers may have experienced failures accessing alerts and action rules for their resources. Alerting notifications were not impacted.
Root Cause: The failure was due to a misconfiguration in one of our dependent services.
Incident Timeline: 4 hours & 50 minutes - 08/29, 09:15 UTC through 08/29, 14:05 UTC
We understand that customers rely on Alerts as a critical service and apologize for any impact this incident caused.
-Subhash

Update: Saturday, 29 August 2020 13:46 UTC
We continue to investigate issues within alerting management. Some customers may experience failures accessing alerts and action rules for resources. Alerting notifications are not impacted. The problem began at 08/29 09:15 UTC.
Work Around: None
Next Update: Before 08/29 18:00 UTC
-Subhash

Update: Saturday, 29 August 2020 12:02 UTC
We continue to investigate issues within alerting management. Some customers may experience failures accessing alerts and action rules for resources. Alerting notifications are not impacted. The problem began at 08/29 09:15 UTC.
Work Around: None
Next Update: Before 08/29 16:00 UTC
-Subhash
- Experiencing Alerting failure issue in Azure Portal for Many Data Types - 08/28 - Resolved

Final Update: Friday, 28 August 2020 23:39 UTC
We've confirmed that all systems are back to normal with no customer impact as of 8/28, 21:30 UTC. Our logs show the incident started on 8/28, 17:30 UTC and that during the 4 hours it took to resolve the issue, customers in the West US region could have experienced delayed or lost diagnostic logs. Customers using App Services logs in public preview could also have experienced missed or delayed logs in all US and Canada regions.
Root Cause: The failure was due to a backend dependency.
Incident Timeline: 4 hours - 8/28, 17:30 UTC through 8/28, 21:30 UTC
We understand that customers rely on Azure Monitor as a critical service and apologize for any impact this incident caused.
-Eric Singleton
- Experiencing Latency, Data Loss issue, and alerting issues in West Central US - 03/15 - Resolved

Final Update: Sunday, 15 March 2020 22:43 UTC
We've confirmed that all systems are back to normal with no customer impact as of 3/15, 22:25 UTC. Our logs show the incident started on 3/15, 20:14 UTC and that during the 2 hours and 11 minutes it took to resolve the issue, customers experienced latent ingestion of data and misfiring log search alerts.
Root Cause: The failure was due to a utility problem at the West Central US data center.
Incident Timeline: 2 hours & 11 minutes - 3/15, 20:14 UTC through 3/15, 22:25 UTC
We understand that customers rely on Application Insights and Azure Log Analytics as critical services and apologize for any impact this incident caused.
-Jeff

Update: Sunday, 15 March 2020 22:24 UTC
Root cause has been isolated to a utility issue in the West Central US data center which was impacting communications; Azure teams have resolved it. Some customers may still see data being ingested with a small amount of latency, and possibly misfiring alerts, until the ingestion catches up.
Next Update: Before 03/16 00:30 UTC
-Jeff

Initial Update: Sunday, 15 March 2020 21:40 UTC
We are aware of issues within Application Insights and Azure Log Analytics and are actively investigating. Some customers may experience latency and data loss, configuration failures for alerts, misfired alerts, and other unexpected behaviors.
Next Update: Before 03/16 00:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Jeff
- Experiencing Data Latency issue in Azure Portal for Many Data Types - 10/07 - Resolved

Final Update: Wednesday, 07 October 2020 20:09 UTC
We've confirmed that all systems are back to normal with no customer impact as of 10/7, 19:00 UTC. Our logs show the incident started on 10/7 at approximately 18:30 UTC and that during the 30 minutes it took to resolve the issue, most Application Insights and Log Analytics customers experienced outages with various services.
Root Cause: The failure was due to a back-end networking issue that caused problems with a large number of Azure services.
Incident Timeline: 30 minutes - 10/7, 18:30 UTC through 10/7, 19:00 UTC
We understand that customers rely on Application Insights and Log Analytics as critical services and apologize for any impact this incident caused.
-Jack Cantwell