Experiencing issues in Azure Portal for Many Data Types in SUK - 09/14 - Resolved
Published Sep 14 2020 07:49 AM
Final Update: Tuesday, 15 September 2020 01:42 UTC

We've confirmed that all systems are back to normal with no customer impact as of 9/15, 00:41 UTC. Our logs show that the incident started on 9/14, 13:54 UTC, and that during the 10 hours and 47 minutes it took to resolve the issue, customers experienced data loss and data latency, which may have resulted in false and missed alerts.
  • Root Cause: The incident was caused by a cooling failure at our data center, which resulted in portions of the data center being shut down.
  • Incident Timeline: 10 Hours & 47 minutes - 9/14 13:54 UTC through 9/15, 00:41 UTC
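As a quick check of the duration stated in the timeline, the arithmetic can be sketched in Python. This is only an illustration, not part of the incident tooling; the variable names are assumptions, and the two timestamps are the ones given in the timeline.

```python
from datetime import datetime, timezone

# Start and end timestamps taken from the incident timeline (both UTC).
start = datetime(2020, 9, 14, 13, 54, tzinfo=timezone.utc)
end = datetime(2020, 9, 15, 0, 41, tzinfo=timezone.utc)

# Subtracting two aware datetimes yields a timedelta.
duration = end - start

# Split the total number of seconds into whole hours and remaining minutes.
hours, remainder = divmod(int(duration.total_seconds()), 3600)
minutes = remainder // 60

print(f"{hours} hours {minutes} minutes")  # 10 hours 47 minutes
```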
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Ian

Update: Tuesday, 15 September 2020 01:19 UTC

Root cause has been isolated to cooling failures and subsequent shutdowns in our data center, which were impacting storage and our ability to access and insert data. Our infrastructure has been brought back online. We are making progress with bringing the final storage devices back online. Customers should start to see signs of recovery soon.
  • Work Around: None
  • Next Update: Before 09/15 05:30 UTC
-Ian

Update: Monday, 14 September 2020 20:14 UTC

Starting at approximately 14:00 UTC on 14 Sep 2020, a single Zone in UK South experienced a cooling failure. As a result, Storage, Networking and Compute resources were shut down as part of our automated processes to preserve the equipment and prevent damage. Consequently, the Azure Monitoring Services have experienced missed or latent data, which is causing false and missed alerts. Mitigation for the cooling failure is currently in progress. The estimated time for resolution of this issue is still unknown. We apologize for the inconvenience.
  • Work Around: None
  • Next Update: Before 09/15 00:30 UTC
-Ian

Update: Monday, 14 September 2020 16:28 UTC

We continue to investigate issues within Azure Monitoring Services. Root cause is related to an ongoing storage account issue. Some customers continue to experience missed or latent data, which is causing false and missed alerts. We are working to establish the start time for the issue; initial findings indicate that the problem began at 9/14 13:35 UTC. We currently have no estimate for resolution.
  • Work Around: None
  • Next Update: Before 09/14 19:30 UTC
-Ian

Initial Update: Monday, 14 September 2020 14:44 UTC

We are aware of issues within Azure Monitoring Services and are actively investigating. A storage outage in UK South has caused multiple services to be impacted.
  • Work Around: None
  • Next Update: Before 09/14 19:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Mohini

Version history
Last update: Sep 14 2020 06:56 PM