Experiencing Data Latency Issue for Log Search Alerts in East US - 03/09 - Resolved
Published Mar 09 2020 01:30 PM
Final Update: Tuesday, 10 March 2020 08:45 UTC

We've confirmed that all systems are back to normal with no customer impact as of 03/10, 07:30 UTC. Our logs show that the incident started on 03/09, 14:30 UTC, and that during the 17 hours it took to resolve the issue, some customers in the East US region experienced latency in log ingestion as well as misfiring alerts.
  • Root Cause: The failure was due to an issue on the storage side.
  • Incident Timeline: 17 Hours - 03/09, 14:30 UTC through 03/10, 07:30 UTC
We understand that customers rely on Log Search Alerts as a critical service and apologize for any impact this incident caused.

-Santhosh

Update: Tuesday, 10 March 2020 07:18 UTC

The root cause has been isolated to a storage issue that is causing latency in data ingestion and misfiring of alerts. To address this issue, the Storage team applied a mitigation plan to decrease background load and is seeing signs of recovery. We are currently working on further mitigation plans. Some customers in the East US region might experience latency in log ingestion as well as misfiring alerts until the complete mitigation is in place.
  • Next Update: Before 03/10 09:30 UTC
-Santhosh

Update: Tuesday, 10 March 2020 02:39 UTC

We continue to investigate issues within Log Search Alerts. Some customers continue to experience up to 140 minutes of latency in log ingestion, and misfiring alerts as a result. The problem began at 03/09 14:30 UTC. We currently have no estimate for resolution.
  • Next Update: Before 03/10 07:00 UTC
-Jeff

Update: Tuesday, 10 March 2020 00:32 UTC

We continue to investigate issues within Log Analytics & Log Search Alerts. Some customers continue to experience up to 128 minutes of latency in log ingestion, and misfiring alerts as a result. We are updating the initial findings to indicate that the problem began at 03/09 14:30 UTC. We currently have no estimate for resolution.
  • Next Update: Before 03/10 03:00 UTC
-Jeff

Update: Monday, 09 March 2020 22:28 UTC

We continue to investigate issues within Log Analytics & Log Search Alerts. Some customers continue to experience up to 110 minutes of latency in log ingestion, and misfiring alerts as a result. Initial findings indicate that the problem began at 03/09 20:00 UTC and appears to be related to latency in a backend dependency.
  • Next Update: Before 03/10 00:30 UTC

-Jeff


Initial Update: Monday, 09 March 2020 20:28 UTC

We are aware of issues within Log Search Alerts and are actively investigating. Some customers may experience delayed log data ingestion and misfiring log search alerts.
  • Next Update: Before 03/09 22:30 UTC
We are working hard to resolve this issue and apologize for any inconvenience.
-Jeff Miller

Last update: Mar 10 2020 01:47 AM