Forum Discussion
Reliably trigger alerts for Log Analytics log entries
MSDN documentation at https://docs.microsoft.com/en-us/azure/azure-monitor/platform/alert-log-troubleshoot states: "To mitigate data ingestion delay, the system waits and retries the alert query multiple times if it finds the needed data is not yet ingested".
We have an issue with triggering alerts, and it suggests that the described behavior is not very reliable, as many of our alerts are not fired. To be more precise: we ingest logs from Data Factory V2 into Log Analytics and watch for log entries with Level == "Error", alerting when the number of results is greater than 0 (Period = Frequency = 30 minutes). We expect that when a log entry with Level == "Error" is generated by Data Factory and ingested into Log Analytics we will receive an alert, but very often we don't. We tried changing Period to a larger value (30 minutes) while leaving Frequency at 15, but in that case there is a good chance of receiving duplicated alerts, which is also not good. Is there a recommended and reliable Period/Frequency/query configuration strategy that guarantees no alerts are missed and also does not produce duplicates?
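In alert-rule terms, what we have configured is roughly the sketch below (simplified; the AzureDiagnostics table name here is an assumption based on the diagnostic settings schema, and the exact query we use is posted further down in the thread):

// Simplified sketch of the scheduled query behind our alert rule
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Level == "Error"
// Alert condition: "Number of results" greater than 0, Period = Frequency = 30 minutes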
10 Replies
Hi,
My testing shows that when there is a delay in data ingestion the alert is still fired. Of course the alert inherits that delay, but I haven't found missing alerts so far. Maybe you can share more about the experience you have: what kind of data source do you use? When you have missing alerts, have you compared the ingestion time with TimeGenerated for those events? What is your exact query?
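For example, a comparison along these lines would show how far behind ingestion is for each error event (a sketch only; it assumes the Data Factory diagnostic logs land in the AzureDiagnostics table):

// Compare ingestion time with TimeGenerated per event
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY" and Level == "Error"
| extend IngestedAt = ingestion_time()
| project TimeGenerated, IngestedAt, IngestionDelay = IngestedAt - TimeGenerated
| order by IngestionDelay desc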
- Roman_Turovskyy (Copper Contributor)
The exact query is:
search *
| where ResourceProvider == "MICROSOFT.DATAFACTORY" and (Level == "Error" or status_s == "Failed")
| order by TimeGenerated
The query runs against the Log Analytics workspace to which Data Factory V2 writes (with a delay of several minutes, but it is hard to tell the exact numbers).
When I set Period = Frequency = 5 minutes, more than 50% of the alert emails are missing; with Period = Frequency = 15, almost all logs result in an alert email, but still not 100%.
Besides the issue described above, there is a more severe one, which may be related. When I navigate to Monitor -> Alerts I always see the "All is good! You have no alerts." message, which is really strange. I expect to see statistics about triggered alerts.
Because of this "You have no alerts." message it is hard to be sure that the issue is with the alerts rather than with the emails (configured via an Action Group). Our assumption was "there might be an issue with email delivery, e.g. because of spam filters", but this assumption was dismissed after we configured an Azure Function action type: the Azure Functions are not invoked when emails are missing and are invoked when emails are delivered, so at least the email and Azure Function action types are consistent with each other.
What could be the reason the "All is good! You have no alerts." message is always shown?
Hi,
The first thing you should do is to never use the search operator in alerts or in any kind of saved query. The search operator is typically used for discovering data initially, but once you know where your data is you should stop using it. This is also described here:
https://azure.microsoft.com/en-us/blog/best-practices-for-queries-used-in-log-alerts-rules/
I am assuming that for Data Factory you use diagnostic logs which are sent to Log Analytics. In that case your data is in the AzureDiagnostics table, so your query should look like:
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY" and (Level == "Error" or status_s == "Failed")
| order by TimeGenerated
You can also skip | order by TimeGenerated when you use the query in an alert, as it does not have any effect there.
Here is some information on how ingestion time can be checked, although I do not think that is the problem for you:
https://docs.microsoft.com/en-us/azure/azure-monitor/platform/data-ingestion-time
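As a rough sketch of that kind of check (again assuming the data is in AzureDiagnostics), the ingestion latency over the last day can be summarized like this:

// Ingestion latency per hour for Data Factory records
AzureDiagnostics
| where TimeGenerated > ago(1d)
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| extend IngestionDelaySeconds = datetime_diff('second', ingestion_time(), TimeGenerated)
| summarize avg(IngestionDelaySeconds), max(IngestionDelaySeconds) by bin(TimeGenerated, 1h)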
Keep in mind that only 10 results will be included in the e-mail, but if you go to the link of the alert query results you will see all of the results.
The only reason you would not be seeing the alerts in Azure Monitor is if you haven't selected the subscription where the Log Analytics workspace is located. Azure Monitor can display alerts from a maximum of 5 subscriptions.
I usually use metric measurement based alerts rather than the number of results type, as that way I can get an alert per instance. I have a small blog post describing that scenario here:
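As a rough illustration of the shape such a query takes (not the exact one from the blog post; AzureDiagnostics and the Resource column are assumptions for the Data Factory case):

// Metric measurement style query: one alert per resource instance
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY" and (Level == "Error" or status_s == "Failed")
| summarize AggregatedValue = count() by bin(TimeGenerated, 15m), Resource
// Alert logic: metric measurement, threshold AggregatedValue > 0, "Alert per" Resource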