Trying to create a Custom Log Alert for 5xx errors on Azure App Service Application Insights


I've got an Application Insights resource set up for one of our App Services, and I'm trying to get alerted when more than a certain percentage of requests fail. We don't want to hardcode absolute request counts, because our traffic levels vary wildly throughout the year and re-entering thresholds by hand over and over seems unnecessary.


Let's pretend I want to be alerted when more than 1% of HTTP requests fail over a rolling period of 5 minutes.  At a high level, I understand I need to create a Custom Log Alert that does something to the effect of:

- Count the total number of HTTP requests in the previous 5 minutes (totalRequests)

- Count the number of failed HTTP requests in the previous 5 minutes (failedRequests)

- Math: failedRequests / totalRequests = failedPercentage

- If: failedPercentage > 0.01, trigger alert


But what I don't understand is how I'd implement that logic in the custom log alerts.  I've got as far as creating this search query:


| summarize failedRequests = countif(resultCode == "500"), totalRequests = count()
| project failedPercentage = failedRequests / totalRequests
| extend should_alert = failedPercentage > 0.01

This is just based on some example code sent to me by an MS Support Engineer (so I apologize if I've misused the summarize/project/extend operators). Rather than doing what I want, it sets the chart at the top of the blade to a flat 1.0, which is definitely incorrect: we have had some errors in the previous 5-minute interval, but the site is most certainly up and generally responding. I've also changed the period and the frequency, yet no matter which options I try, the line graph stays at 1.0.


Can someone tell me what I'm missing here, and also what I should do with "should_alert" to get it to trigger? Or is that a reserved keyword for this service that takes the place of the fields at the bottom of the window?

1 Reply

I did a post a while ago that may help.  It covers a different scenario, but a similar idea.  I took a lookback average from 30 minutes ago to 5 minutes ago and compared it to the average for the last 5 minutes.  If the 5-minute average was above a threshold, it created an alert.
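
As a rough sketch of that idea applied to the failure-percentage case in the question (this is not the query from my post): it assumes the standard Application Insights `requests` table with its `timestamp` and `resultCode` columns, and the window sizes, the 1% threshold, and the column names `recentPct` and `baselinePct` are only illustrative. It also assumes the alert rule is set to fire when the number of results is greater than 0.

```kusto
// Sketch only: assumes the standard Application Insights `requests` table.
// Window sizes and the 1% threshold are illustrative, not recommendations.
let recentStart = ago(5m);
requests
| where timestamp > ago(30m)
| summarize
    recentFailed   = countif(timestamp > recentStart and toint(resultCode) >= 500),
    recentTotal    = countif(timestamp > recentStart),
    baselineFailed = countif(timestamp <= recentStart and toint(resultCode) >= 500),
    baselineTotal  = countif(timestamp <= recentStart)
// Multiply by 100.0 so the division is floating point rather than integer.
| extend recentPct   = iff(recentTotal == 0, 0.0, 100.0 * recentFailed / recentTotal),
         baselinePct = iff(baselineTotal == 0, 0.0, 100.0 * baselineFailed / baselineTotal)
// Return a row only when the last 5 minutes exceed 1% and are worse than the baseline.
| where recentPct > 1.0 and recentPct > baselinePct
```

Two things to note in the sketch. Dividing one count by another in KQL is integer division, so the `100.0 *` factor is what keeps the percentage from truncating to 0. And because a `summarize` without a `by` clause always returns exactly one row, an alert that measures the number of results would likely chart a constant 1 unless a final `where` filters that row out when nothing is wrong.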


Hope this helps.