Final Update: Tuesday, 24 September 2019 22:30 UTC
We've confirmed that all systems are back to normal with no customer impact as of 9/24, 22:30 UTC. Our logs show the incident started on 9/24, 9:00 UTC and that during the 13.5 hours that it took to resolve the issue customers experienced up to 4 hours of ingestion delay in East US region.
- Root Cause: The failure was due to a backend service scaling issue
We understand that customers rely on Azure Log Analytics as a critical service and apologize for any impact this incident caused.
-Jeff
Update: Tuesday, 24 September 2019 21:10 UTC
Mitigation is still in progress, engineers are continuing to see improvement in ingestion times in East US region but have no fully resolved the issue.
-Jeff
Update: Tuesday, 24 September 2019 17:39 UTC
Log Analytics ingestion in East US region has a recent ingestion latency issue with a maximum ingestion latency of 4 hours. Root cause has been isolated to a service scale issue which was impacting Log Analytics ingestion. As a result of this ingestion latency customers may experience alerts not behaving in the expected manner due to missing data. To address this issue we have scaled the service appropriately. Some customers will experience incomplete or delayed ingestion until full mitigation which we estimate to be in approximately 3 hours.
- Next Update: Before 09/24 20:00 UTC
-Jeff