We've confirmed that all systems are back to normal with no customer impact as of 01/04, 18:05 UTC. Our logs show the incident started on 01/04, 09:30 UTC, and during the 8 hours and 35 minutes that it took to resolve the issue, most EUS customers experienced data access issues in the portals (Azure and OMS), ingestion latency, and intermittent failures of query, onboarding, and ARM API calls.
Root Cause: The failure was due to an issue with one of our backend services.
Incident Timeline: 8 hours and 35 minutes - 01/04, 09:30 UTC through 01/04, 18:05 UTC
We understand that customers rely on Azure Log Analytics as a critical service and apologize for any impact this incident caused.
Update: Friday, 04 January 2019 14:34 UTC
We continue to investigate issues within Log Analytics. The root cause is not fully understood at this time. Some customers continue to experience data access, ingestion latency, and alerting issues. We are working to establish the start time for the issue; initial findings indicate that the problem began at 01/04 ~10:00 UTC. We currently have no estimate for resolution.
Work Around: None
Next Update: Before 01/04 17:00 UTC
Update: Friday, 04 January 2019 11:21 UTC
We continue to investigate issues within Log Analytics. Some customers continue to experience data access issues. Initial findings indicate that the problem began at 01/04 ~10:00 UTC.