Experiencing Alerting failure issue in Azure Portal for Many Data Types - 09/20 - Resolved
Published Feb 19 2019 08:39 AM
First published on MSDN on Sep 23, 2018
Final Update: Sunday, 23 September 2018 02:14 UTC

We confirm that all systems are back to normal, and customers should have experienced no errors with Availability tests in Application Insights since 09/21 05:00 UTC. However, a few customers in East US, West US and Central US will see that their multi-step Availability tests remain disabled. These customers have been identified and notified directly.
Our logs show the failures were observed from 09/20 12:00 UTC to 09/21 05:00 UTC, and during this time a small percentage of customers in the above regions would have experienced errors in web tests.

  • Root Cause: The root cause was traced to one of the dependent services, which took a few hours to update before the failures were resolved.
  • Incident Timeline: 17 hours - 09/20, 12:00 UTC through 09/21, 05:00 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Sapna
Update: Saturday, 22 September 2018 07:13 UTC

Web tests identified with broken configurations in 3 regions - East US, West US and Central US - continue to be disabled to protect other web tests from failing in these regions. We applied one mitigation by upgrading the required binaries, but it did not fully succeed because of other dependencies in our infrastructure. Now that we fully understand the root cause, our team is working on an alternative approach to resolve the issue. This approach was not applied in the first place because we knew it would take longer to fix the issue. We appreciate your patience while we work on this fix. We will provide periodic updates as we progress.
  • Work Around: None
  • Next Update: Before 09/23 07:30 UTC
-Mohini
Update: Friday, 21 September 2018 18:37 UTC

Web tests identified with broken configurations in 3 regions - East US, West US and Central US - continue to be disabled to protect other web tests from failing in these regions. We are actively working on a fix to address the issue.

  • Work Around: None
  • Next Update: Before 09/22 07:00 UTC
-Sapna
Update: Friday, 21 September 2018 14:58 UTC

We have identified web tests with broken configurations in 3 regions - East US, West US and Central US - and have temporarily disabled execution of these availability tests. The web tests will appear as enabled, but won't have any availability results. We are working on a fix for the configuration.
  • Work Around: None
  • Next Update: Before 09/21 19:00 UTC
-Mohini
Update: Friday, 21 September 2018 02:35 UTC

We have identified web tests with broken configurations in 3 regions - East US, West US and Central US - and have disabled these impacted web tests to protect other web tests from failing in these regions.

  • Work Around: None
  • Next Update: Before 09/21 15:00 UTC
-Sapna
Update: Friday, 21 September 2018 00:45 UTC

The issue is resolved in the East US and Central US regions, and customers should no longer experience errors in Availability tests for these regions; however, the issue has recurred in the West US region. Customers will experience errors in Availability tests for the West US region. We are actively working on a mitigation to address the issue.

  • Work Around: None
  • Next Update: Before 09/21 05:00 UTC
-Sapna
Update: Thursday, 20 September 2018 21:55 UTC

The root cause of the impact to availability tests in the Central US, East US and West US regions has been identified, and mitigation is in progress region by region. The issue is resolved for the East US and West US regions, and customers should no longer experience errors in these regions; however, customers will continue to experience errors in Availability tests in the Central US region. We estimate another 3 hours before the errors are resolved for the Central US region as well.

  • Work Around: None
  • Next Update: Before 09/21 01:00 UTC
-Sapna
Initial Update: Thursday, 20 September 2018 17:39 UTC

We are aware of issues within Application Insights and are actively investigating. Some customers in the Central US, West US and East US regions may experience availability test issues in the Azure Portal.

  • Work Around: None
  • Next Update: Before 09/20 22:00 UTC
We are working hard to resolve this issue and apologize for any inconvenience.

-Sapna