Jul 23 2019 - last edited on Apr 07 2022
I am using Azure Log Analytics (LA) for VM performance monitoring, but the workspace also includes on-prem servers. When we run CPU, memory, or disk utilization alerts, they include all servers, both in Azure and on-prem.
We want to monitor only Azure resources, not on-prem servers. How can we achieve that?
Perf
| where ObjectName == "Memory" and CounterName == "% Committed Bytes In Use"
| where CounterValue > 80
| summarize (TimeGenerated, Free_Memory_Percent)=arg_max(TimeGenerated, CounterValue) by Computer

Perf
| where ObjectName == "LogicalDisk" and CounterName == "% Free Space"
| where CounterValue < 10
| summarize (TimeGenerated, Free_Space_Percent)=arg_max(TimeGenerated, CounterValue) by Computer, InstanceName
| where InstanceName contains ":"
Thanks in advance.
Jul 23 2019 04:38 AM
You can add a join to the Heartbeat table, as it has an Azure / non-Azure value in its ComputerEnvironment column. You'll need to do the same for the other two examples.
Heartbeat
| where ComputerEnvironment == "Azure"
| distinct Computer
| join (
    Perf
    | where CounterName == "% Processor Time" and CounterValue > 95 and ObjectName == "Processor" and InstanceName == "_Total"
    | summarize arg_max(TimeGenerated, CounterValue) by Computer, CounterName
) on Computer
To test it, go to Log Analytics and run the query.
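As a rough sketch of "doing the same" for the memory and disk counters, the Heartbeat join could be applied as shown below. The counter names come from the original question; the thresholds are left out here, and the sketch assumes every machine of interest has sent Heartbeat records within the query time range.

```kusto
// Memory: latest "% Committed Bytes In Use" sample per Azure machine.
// Assumption: Heartbeat rows exist for each machine in the selected time range.
Heartbeat
| where ComputerEnvironment == "Azure"
| distinct Computer
| join kind=inner (
    Perf
    | where ObjectName == "Memory" and CounterName == "% Committed Bytes In Use"
    | summarize arg_max(TimeGenerated, CounterValue) by Computer
) on Computer

// Disk: latest "% Free Space" sample per Azure machine and drive letter.
Heartbeat
| where ComputerEnvironment == "Azure"
| distinct Computer
| join kind=inner (
    Perf
    | where ObjectName == "LogicalDisk" and CounterName == "% Free Space"
    | where InstanceName contains ":"
    | summarize arg_max(TimeGenerated, CounterValue) by Computer, InstanceName
) on Computer
```

The two statements are separate queries; run them one at a time.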
Jul 23 2019 07:24 AM Solution
Another way is to use the _ResourceId column to find which VMs are in Azure.
Currently all Azure VMs have a _ResourceId value, so you can do:
Perf | where CounterName == "% Processor Time" and _ResourceId contains "virtualmachines"
Additionally, I would like to point out that all your queries are written incorrectly. Do not filter on CounterValue before aggregating. First summarize, and then filter the summarized value against the threshold. If you use the queries for alerts, do not apply the threshold in the query at all, because the threshold is set in the alert configuration. If you do not follow these rules, you will get false positives on alerts.
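To make the ordering concrete, here is a minimal sketch based on the memory counter and the 80 threshold from the question, combined with the _ResourceId filter. It is one possible shape, not the only valid one. Filtering before the summarize would return the most recent sample that was above 80, even if the machine has since recovered; filtering afterwards checks the machine's actual latest sample.

```kusto
// Sketch: summarize first, then apply the threshold.
Perf
| where ObjectName == "Memory" and CounterName == "% Committed Bytes In Use"
| where _ResourceId contains "virtualmachines"   // keep Azure VMs only
| summarize arg_max(TimeGenerated, CounterValue) by Computer
| where CounterValue > 80   // drop this line when the alert rule sets the threshold
```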
Jul 23 2019 08:34 PM
Jul 24 2019 03:41 AM
@Rahul_Mahajan You should have a single alert per resource type and performance counter. Creating one alert for multiple resource types that have nothing in common, and for multiple metrics, is not an approach I would recommend. In queries that you use to visualize data rather than for alerts, you can do whatever correlation you want.
Below is an example query based on your first posted query. The remaining queries follow similar logic. The logic of this query is as follows:
Perf
| where CounterName == "% Processor Time" and ObjectName == "Processor" and InstanceName == "_Total"
| summarize AggregatedValue = avg(CounterValue) by Computer, bin(TimeGenerated, 5m)
| where AggregatedValue > 95
As I mentioned before, if you use the above query in an alert, remove the last filter 'where AggregatedValue > 95', because you will set the threshold in the alert definition.
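Following the same pattern, the memory and disk queries from the question might be rewritten roughly as below. This is a sketch that reuses the 5-minute bucket from the CPU example and the thresholds from the original post (above 80 for memory, below 10 for free disk space); adjust both to your own needs.

```kusto
// Memory: average "% Committed Bytes In Use" per computer in 5-minute buckets.
Perf
| where ObjectName == "Memory" and CounterName == "% Committed Bytes In Use"
| summarize AggregatedValue = avg(CounterValue) by Computer, bin(TimeGenerated, 5m)
| where AggregatedValue > 80   // remove when used in an alert rule

// Disk: average "% Free Space" per computer and drive in 5-minute buckets.
Perf
| where ObjectName == "LogicalDisk" and CounterName == "% Free Space"
| where InstanceName contains ":"   // keep drive letters, skip _Total
| summarize AggregatedValue = avg(CounterValue) by Computer, InstanceName, bin(TimeGenerated, 5m)
| where AggregatedValue < 10   // remove when used in an alert rule
```

These are two separate queries, one per alert, in line with having a single alert per performance counter.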
If I remember correctly, there is documentation on how to get started with the Kusto query language, alerting, and so on. This forum also contains many example queries for a wide range of scenarios.
Jul 24 2019 04:13 AM
@Rahul_Mahajan This is how the bin function works. You will need to read the documentation to understand it better: https://docs.microsoft.com/en-us/azure/azure-monitor/log-query/get-started-queries . The bucket interval is also important when you create alerts, because alerts have their own frequency and time window settings.
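As a small illustration of what bin() does, the sketch below uses made-up sample values in an inline datatable rather than real Perf data. bin(TimeGenerated, 5m) rounds each timestamp down to the start of its 5-minute bucket, and summarize then averages the values within each bucket.

```kusto
// Hypothetical samples, for illustration only.
datatable(TimeGenerated: datetime, CounterValue: real)
[
    datetime(2019-07-24 04:01:00), 90.0,
    datetime(2019-07-24 04:04:00), 100.0,
    datetime(2019-07-24 04:06:00), 70.0
]
| summarize AggregatedValue = avg(CounterValue) by bin(TimeGenerated, 5m)
// The 04:01 and 04:04 samples fall into the 04:00 bucket (average 95);
// the 04:06 sample falls into the 04:05 bucket (average 70).
```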