We've confirmed that all systems are back to normal with no customer impact as of 30 January 2019 18:00 UTC. Our logs show that the incident started on 24 January 2019 00:00 UTC and that, during the 6 days 3 hours 30 minutes it took to resolve the issue, 5% of customers would have seen failures while creating alerts through an ARM template.
Root Cause: The failure was due to an issue with one of the backend systems.
Incident Timeline: 6 days 3 hours 30 minutes - 24 January 2019 00:00 UTC through 30 January 2019 18:00 UTC.
We understand that customers rely on Application Insights and Log Analytics as critical services, and we apologize for any impact this incident caused.