Help me with these queries

Brass Contributor

Hi all.

I'm a complete newbie to OMS.

I've been handled the task to create a dashboard that gets the CPU and RAM usage, returns the average and marks the servers that are overpowered (less than 10% usage over the last 30 days, for example).

I'm a bit (a lot) lost on how to carry on.

I found a custom dashboard here:

https://gallery.technet.microsoft.com/scriptcenter/Server-Performance-3d767ab1/

 

And i've been trying to understand how it works without any luck.

Here are two queries that i'm currently trying to understand:

 

search Computer !in ("-")
| summarize AggregatedValue = count() by Computer
| where Computer != ""

 

Here the issue is that I get duplicated values, so for 29 VMs the dashboard is reporting 55, and when I check, I can see that I got duplicates like this:

COMPUTER

COMPUTER.domain.com

computer.domain.com

 

I'm trying to understand how to add exceptions, but I don't understand what the ("-") stands for, and how should I exclude things without the domain, and how to exclude lowercase hostnames.

 

Now, to the real task, here's the query i'm struggling with:

 

Perf
| where (ObjectName == "Processor" or ObjectName == "System") and CounterName == "% Processor Time"
| summarize AggregatedValue = percentile(CounterValue, 90) by bin(TimeGenerated, 1h), Computer
| sort by TimeGenerated desc
| render barchart
// Oql: Type:Perf (ObjectName:Processor OR ObjectName:System) AND (CounterName="% Processor Time") | measure percentile90(CounterValue) by Computer | Display StackedBarChart

 

 

What I don't understand is

 

What is the Oql line? is that commented?

And I get that AgregatedValue is a column name?

Is the value I get the average of all the values of the last hour? So if I change the value to 24hs I will get the average of the last 24hs of the CPU usage?

 

I understand these questions are very newbie, but I find the language with a very steep learning curve... and i'm struggling because I can't advance from very simple queries to getting things like average over the last 15 days of CPU...

 

Thanks in advance

 

4 Replies

Hi Dante,

You're right, it can take a while to get up to speed with all language features :) Take a look at the doc site to get started, specifically for your scenario I think Getting-started-with-queries and the Aggregation functions tutorial are relevant.

 

Examples similar to the CPU and RAM usage you need are this one and that one. click to run them on our playground.

 

As for the good questions you raise about this query: 

 

Perf
| where (ObjectName == "Processor" or ObjectName == "System") and CounterName == "% Processor Time"
| summarize AggregatedValue = percentile(CounterValue, 90) by bin(TimeGenerated, 1h), Computer
| sort by TimeGenerated desc
| render barchart
// Oql: Type:Perf (ObjectName:Processor OR ObjectName:System) AND (CounterName="% Processor Time") | measure percentile90(CounterValue) by Computer | Display StackedBarChart- 

 

  • The OQL line is indeed commented out. It shows that this query was originally written in the legacy OQL language which is deprecated. It was automatically translated to the new language (sometimes called KQL).
  • The given example calculates the 90th percentile of CPU usage per hour per computer. If what you want is the average CPU usage, you should replace that bit with 
    summarize AggregatedValue = avg(CounterValue)
    The calculated value is put in a column named "AggregatedValue". You can name it however you want, or even remove the "AggregatedValue=" part if you don't mind what the name is. The language will just name it for you.
  • A time filter was not included in the query, so the default (last 24 hours) is applied. To query over the last 30 days, add an explicit time filter. It's explained in the tutorial, but the gist of it is: 
    Perf 
    | where TimeGenerated > ago(30d) 
    | ... the rest of the query here...
  • This section 
    by bin(TimeGenerated, 1h), Computer
     means that the calculated value is calculated per hour, and per computer. If you want the average per computer over the entire time range (not split per hour), use just "by Computer".

Regarding the first example you mention, it's based on an open search, which queries ALL tables (most are not Perf related so... no need). I highly recommend you don't use it, since it's not very efficient. Not sure what it's meant to do. the `!in ("-")` part means the computer name is not in a given list of names, which here includes just the name "-". To learn how to compare strings, handle upper/lower cases and similar issues, take a look at this short guide, it's a nice reference with good examples.

 

Hope this helps, and welcome to the community! don't hesitate to ask more.

Thank you a lot for all the answers, they really helped.

I made these two queries for the AVG:

Perf | where (CounterName == "% Committed Bytes In Use" or CounterName == "% Used Memory") | where CounterName == "% Committed Bytes In Use"  | summarize AggregatedValue = avg(CounterValue)  by Computer | sort by AggregatedValue asc 

 

Perf
| where CounterName == "% Processor Time"
| summarize AggregatedValue = avg(CounterValue) by Computer
| sort by AggregatedValue desc

 

 

I know that in the OMS portal you can filter the results.

 

However, is there any way to combine these two queries into one and then filter them so I can only see the Computer objects that have both CPU and RAM % less than X value?
So if one Computer has less than 10% of CPU but more than 10% of RAM, it will not show, but if another computer has less than 10% of CPU AND less than 10% of RAM, it will show.

 

 

Thank you in advance.

Hey Dante,

Great question. I think the most elegant way in this case is to use "join":

 

Perf 
| where CounterName == "% Committed Bytes In Use" 
| summarize average_committed_bytes_percent = avg(CounterValue) by Computer 
| where average_committed_bytes_percent < 10
| join (Perf
    | where CounterName == "% Processor Time"
    | summarize average_cpu_percent = avg(CounterValue) by Computer
    | where average_cpu_percent < 10
) on Computer
| project-away Computer1

 

This is in fact a combination of 2 queries - the first part retrieves perf records of computers with low committed bytes, the second part retrieves perf records of computers with low cpu usage. The join is defined to match "on Computer" so it would match by the computer names (the "project-away" part is just to remove a redundant column created by the join). 

 

Side comment - your first example included 2 conditions:

 

Perf 
| where (CounterName == "% Committed Bytes In Use" or CounterName == "% Used Memory")
| where CounterName == "% Committed Bytes In Use" | summarize ...

but as you can see the second condition makes the first one redundant, since it excludes all records that don't refer to 'committed bytes in use'.

 

HTH,

Noa

For Computers you  can  create a group  of the computer you  need and make as function  which  can help  you  in  any  queries

 For Eg:

To  create a group  for Computer 

Heartbeat | distinct Computer | where Computer contains "domain.com" ComputerGroup.PNG
 
Perf
| where Computer in (domaincomputers)
| where CounterName contains "% Committed"
or CounterName contains "% Used Mem"
or CounterName contains "% Proc"
| summarize AggregatedValue = avg(CounterValue) by Computer, CounterName
 
where "domaincomputers" is a function  created for group of computer. So it will  give performance details of only  those computer which  are in  domaincomputers