Forum Discussion
repeated alerts
https://docs.microsoft.com/en-us/azure/azure-monitor/platform/alerts-unified-log#example-of-metric-measurement-type-log-alert
to only alert when 2 or more breaches occur?
Hi Clive,
Thanks for the response. Actually, my requirement is different; let me clarify it again.
What we have done is:
We have around 200 servers reporting to a Log Analytics workspace. We have created CPU usage alerts for them with an 80% threshold, using the query below:
Perf
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| summarize AggregatedValue = avg(CounterValue) by bin(TimeGenerated, 5m), Computer
| where AggregatedValue > 80
We have set the frequency for this alert to 5 minutes, so the query is executed every 5 minutes.
The issue we are facing is this:
At 10:00 AM we get alerts for around 20 servers, because their processor usage is above 80%.
At 10:05 AM we again get alerts for the same 20 servers, because their processor usage is still above 80%.
At 10:10 AM we get alerts for around 25 servers (5 new servers plus the same 20 as before), because their processor usage is above 80%.
At 10:15 AM we again get alerts for the same 25 servers, because their processor usage is still above 80%.
What we are looking for: each server should trigger an alert only once (until the issue is resolved), and a new alert should be triggered only when a new server breaches the threshold.
Any suggestions on how to accomplish this?
- CliveWatson, May 23, 2019, Former Employee
I think the challenge is the 5-minute window: the alert only sees the data from the past 5 minutes and has no concept of what happened before, so it fires again. I'm happy to be corrected here, but you'll probably need to use a longer window or something like dynamic thresholds: https://docs.microsoft.com/en-us/azure/azure-monitor/platform/alerts-dynamic-thresholds#what-do-the-advanced-settings-in-dynamic-thresholds-mean
My other thought was some logic to check the Alerts. It is still a work in progress (I just took 10 random records, but we need to match the computer names with past alerts), but it might help?
Perf
| where TimeGenerated > ago(5m)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| summarize AggregatedValue = avg(CounterValue) by bin(TimeGenerated, 1m), Computer
| join (
AlertHistory
| limit 10
) on $left.Computer == $right.SourceDisplayName
- roopesh_shetty, May 24, 2019, Copper Contributor
Hi,
I tried to run the query you provided, but I am getting this error:
'take' operator: Failed to resolve table or column expression named 'AlertHistory' Support id: 6b982987-9b2b-4b24-b555-9b6ee8787e87
Query:
Perf
| where TimeGenerated > ago(5m)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| summarize AggregatedValue = avg(CounterValue) by bin(TimeGenerated, 1m), Computer
| join (
AlertHistory
| limit 10
) on $left.Computer == $right.SourceDisplayName
What could be wrong with this query?
- CliveWatson, May 24, 2019, Former Employee
Hi, just change AlertHistory to Alert. Note that the table will only show up if you have some alerts. For example:
Alert | where TimeGenerated > ago(30d) | summarize by Computer, AlertName
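For the original requirement (one alert per server until it recovers), one possible approach is to have the query itself compare the latest 5-minute window with the one before it, and return only the servers that are newly above the threshold. The sketch below is untested and rests on several assumptions that are not from this thread: the alert is a "number of results" log alert that fires when the result count is greater than 0, its query time range is at least 10 minutes, and the names threshold, cpu, current, previous and PreviousValue are purely illustrative.

```kusto
// Sketch only: return servers that breach in the latest 5-minute window
// but did NOT breach in the 5-minute window before it.
// Assumes the alert's query time range covers at least the last 10 minutes.
let threshold = 80;
let cpu = Perf
    | where TimeGenerated > ago(10m)
    | where ObjectName == "Processor" and CounterName == "% Processor Time";
// Servers whose average CPU over the latest 5 minutes is above the threshold.
let current = cpu
    | where TimeGenerated > ago(5m)
    | summarize AggregatedValue = avg(CounterValue) by Computer
    | where AggregatedValue > threshold;
// Servers that were already above the threshold in the preceding 5 minutes.
let previous = cpu
    | where TimeGenerated <= ago(5m)
    | summarize PreviousValue = avg(CounterValue) by Computer
    | where PreviousValue > threshold;
// leftanti keeps only the rows from "current" that have no match in "previous".
current
| join kind=leftanti previous on Computer
```

With this shape, a server that stays above 80% across consecutive evaluations would be returned only on the first one. The trade-off is that a server hovering around the threshold would alert again each time it crosses back above 80%, and delayed Perf ingestion could cause a breach to be missed or reported late.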